MySQL High Availability Preamble

This document assumes:
  • You have credentials to access mysql-10[246] and InfoBlox.
  • The high-availability architecture is one master with two slaves using asynchronous replication.
  • The three hosts in the cluster are mysql-102.cs, mysql-104.cs, and mysql-106.cs.
  • Clients connect to a single service hostname, which has the same IP address as the master. By design, clients cannot connect to the slaves.

Find which servers are up and identify master and slaves

  • From anywhere, running host on the service hostname returns an IP address, which should correspond to one of mysql-102.cs, mysql-104.cs, or mysql-106.cs.
    • This is what the world thinks is the master. If things are operating normally, you can stop here.
    • If you are diagnosing problems, please proceed to determine the status of master and slaves.
  • Check mysqld uptime and running threads on each host: linux.cscf# for i in mysql-102.cs mysql-104.cs mysql-106.cs ; do echo ; echo $i ; ssh $i "mysqladmin status processlist" ; done
    • Normally, all 3 hosts have a mysqld running with significant uptime.
    • The master processlist should include two slave threads (Command: Binlog Dump), one for each slave server.
    • Each slave processlist should include two replication threads: the I/O thread reading events from the master and the SQL thread executing them.
  • You need to know which of mysql-10[2,4,6] is the current master (if any), and which host the slaves are currently or were most recently replicating from.
  • If there is no current master, the most recent master must be unambiguously powered down before continuing. Proceeding without verified removal of the most recent master risks data corruption!
  • On the master and on each slave, run mysql> show slave status\G and look for the Master_Host value. On a slave, Master_Host is the master that slave recognizes. On the master, the statement should return Empty set.
  • If the slaves report different Master_Host values, stop troubleshooting and report the outage to the responsible CSCF staff. Continuing risks data loss.
  • If the master reported consistently by all slaves is not reachable, proceed as in Restoring Broken Old Master After Failover below: run # lxc-info --name for the former master's container and verify State: STOPPED. If it is not stopped, force a container stop with lxc-stop and recheck the state. Do not proceed until the former master is verified stopped.
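
The Master_Host comparison described above can be scripted. The following is a minimal sketch, not part of the documented procedure: the function names master_host_from_status and agreed_master are hypothetical, and the commented usage assumes that ssh and the mysql client work non-interactively on each host, as the mysqladmin loop above already does.

```shell
# Sketch only: the helper names below are hypothetical, not an existing tool.

# Read "show slave status\G" output on stdin and print the Master_Host value.
# Prints nothing when the output is empty (the "Empty set" case on the master).
master_host_from_status() {
    awk -F': ' '$1 ~ /^[[:space:]]*Master_Host$/ { print $2; exit }'
}

# Compare the Master_Host values passed as arguments, skipping empty ones.
# Prints the agreed master and returns 0 if all non-empty values match,
# returns 1 if any two values differ (stop and report the outage),
# returns 2 if no host reported a master at all.
agreed_master() {
    local first="" h
    for h in "$@"; do
        if [ -z "$h" ]; then
            continue
        fi
        if [ -z "$first" ]; then
            first=$h
        elif [ "$h" != "$first" ]; then
            echo "Slaves disagree on Master_Host: $first vs $h" >&2
            return 1
        fi
    done
    if [ -z "$first" ]; then
        return 2
    fi
    printf '%s\n' "$first"
}

# Example usage (assumption: credentials are available non-interactively):
#   m102=$(ssh mysql-102.cs 'mysql -e "show slave status\G"' | master_host_from_status)
#   m104=$(ssh mysql-104.cs 'mysql -e "show slave status\G"' | master_host_from_status)
#   m106=$(ssh mysql-106.cs 'mysql -e "show slave status\G"' | master_host_from_status)
#   agreed_master "$m102" "$m104" "$m106"
```

The parsing and the comparison are kept separate from the ssh calls so that the logic can be checked against saved output before it is run against the live cluster. A return code of 1 corresponds to the "stop troubleshooting" case above and should not be worked around.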
Topic revision: r2 - 2016-05-16 - DanielAllen