Problem Diagnosis

The MySQL cluster can fail in several ways; this section helps you identify which recovery method to use. The scenarios below are listed in decreasing order of urgency.

Nagios reports Master is down

The automated Nagios check may report that the master is down. This is likely an emergency: all operations that rely on our MySQL database will fail until it is fixed. See MySQLHAMasterFailure for recovery steps.

Applications are failing to connect to mysql.cs

Production applications could fail with messages such as "Cannot connect to database mysql.cs.uwaterloo.ca". This is likely an emergency: unless the failing applications have network issues of their own, you can assume all operations that rely on our MySQL database will fail until it is fixed. See MySQLHAMasterFailure for recovery steps.
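
To help separate a database outage from a network problem on the application host, a quick TCP reachability test can be run from the affected machine. The following is a minimal sketch, not part of the documented procedure: the function name can_reach is hypothetical, and it assumes the server listens on MySQL's default port 3306, which this page does not confirm.

```python
import socket


def can_reach(host: str, port: int = 3306, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: port 3306 is MySQL's default; the cluster may use another port.
    """
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call, run from the host where the application is failing:
#   can_reach("mysql.cs.uwaterloo.ca")
```

If the check succeeds from one host but fails from the application's host, a network issue local to that application is the more likely cause. A successful TCP connection only shows that the port is open; it does not prove that MySQL is accepting queries.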

Nagios reports Master is out of sync with both slaves

The automated Nagios check may report that the master is out of sync with both slaves. This is an intermediate-level emergency; recovery can wait until the next business day. You may treat this situation as if both of the slaves are down. See MySQLHASlaveFailure for recovery steps.

Nagios reports Master is out of sync with a slave

The automated Nagios check may report that the master is out of sync with one slave. This is not a time-critical emergency, and recovery can wait until the next business day. You may treat this situation as if the slave is down. See MySQLHASlaveFailure for recovery steps.
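
To confirm an out-of-sync report by hand, the output of SHOW SLAVE STATUS on the slave can be inspected. The sketch below classifies one such row, already parsed into a dictionary. The fields Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master are standard MySQL status columns, but the function name, the 300-second threshold, and the state labels are assumptions for illustration. This page does not say what criteria the Nagios check actually uses.

```python
from typing import Mapping, Optional


def replica_state(status: Mapping[str, Optional[str]],
                  max_lag_seconds: int = 300) -> str:
    """Classify one parsed SHOW SLAVE STATUS row.

    Assumption: max_lag_seconds is an illustrative threshold, not the
    value used by the site's Nagios check.
    """
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    if not (io_ok and sql_ok):
        # Either replication thread is stopped: replication is not running.
        return "stopped"
    lag = status.get("Seconds_Behind_Master")
    if lag in (None, "NULL") or int(lag) > max_lag_seconds:
        # Unknown or large lag: the slave is behind the master.
        return "lagging"
    return "in sync"
```

Either "stopped" or "lagging" would correspond to the out-of-sync case described here, which is handled as if the slave were down.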

Nagios reports Slave is down

The automated Nagios check may report that a slave is down. This is not a time-critical emergency, and recovery can wait until the next business day. See MySQLHASlaveFailure for recovery steps.
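
The scenarios in this section can be summarized as a lookup from symptom to urgency and recovery page. The following is a sketch only: the urgency levels and page names come from the text of this section, while the symptom keys, the Triage type, and the triage function are hypothetical labels chosen for illustration.

```python
from typing import NamedTuple


class Triage(NamedTuple):
    urgency: str        # how quickly recovery must start
    recovery_page: str  # wiki topic holding the recovery steps


# Ordered from most to least urgent, matching the order of this section.
# The keys are illustrative labels, not actual Nagios alert strings.
TRIAGE = {
    "master down": Triage("emergency", "MySQLHAMasterFailure"),
    "applications cannot connect": Triage("emergency", "MySQLHAMasterFailure"),
    "master out of sync with both slaves": Triage(
        "intermediate, next business day", "MySQLHASlaveFailure"),
    "master out of sync with one slave": Triage(
        "not time-critical, next business day", "MySQLHASlaveFailure"),
    "slave down": Triage(
        "not time-critical, next business day", "MySQLHASlaveFailure"),
}


def triage(symptom: str) -> Triage:
    """Look up the urgency and recovery page for a symptom label."""
    return TRIAGE[symptom]
```

For example, triage("slave down").recovery_page gives "MySQLHASlaveFailure".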

Topic revision: r2 - 2016-04-12 - DanielAllen
 