Normal operation
Nagios should report that the cluster is operating normally. For additional verification, run the manual diagnostics below.
Each check is labeled with the systems it applies to {master, slaves}:
- master, slaves -
# systemctl status mysql
(or # service mysql status on older servers) shows that the mysql service is running.
- master, slaves -
# ps -lfp `cat /run/mysqld/mysqld.pid`
shows mysqld is running.
- master, slaves -
# lsof -i | grep LISTEN | grep mysql
shows mysqld listening on the correct interface(s).
- master, slaves -
# mysql -e 'show variables like "read_only"'
shows read_only is OFF for master and ON for slaves.
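The read_only check can be scripted. The following is a minimal sketch, not part of the original procedure: check_read_only is a hypothetical helper that reads the output of the query above on stdin and compares it with the value expected for the given role.

```shell
# check_read_only ROLE  (hypothetical helper)
# Reads the output of: mysql -e 'show variables like "read_only"'
# Returns 0 if the value matches the role (master: OFF, slave: ON),
# 1 if it does not, 2 if ROLE is not recognized.
check_read_only() {
    role="$1"
    case "$role" in
        master) want=OFF ;;
        slave)  want=ON ;;
        *)      echo "usage: check_read_only master|slave" >&2; return 2 ;;
    esac
    # The header line (Variable_name  Value) is skipped because its
    # first field is not "read_only".
    actual=$(awk '$1 == "read_only" { print $2 }')
    [ "$actual" = "$want" ]
}
```

Possible usage on a slave: mysql -e 'show variables like "read_only"' | check_read_only slave && echo OK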
- master, slaves -
# mysqladmin status
shows reasonable values for uptime, threads, and query counts. On a slave, almost all queries correspond to write operations replicated from the master.
- master, slaves -
# mysqladmin processlist | sed 's/^| //;s/ | / /g;s/[-+| ]*$//'
On the master, this shows one slave thread (Command "Binlog Dump") for each slave, plus many client threads. On each slave, it shows an I/O thread and an execution (SQL) thread.
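To compare the number of slave threads on the master with the number of slaves you expect, the rows can be counted rather than read by eye. This is a sketch under assumptions: count_binlog_dump_threads is a hypothetical helper, it reads the raw (unfiltered) mysqladmin processlist table on stdin, and it relies on Command being the fifth column with the value "Binlog Dump" for slave threads.

```shell
# count_binlog_dump_threads  (hypothetical helper)
# Reads raw `mysqladmin processlist` output on stdin and prints the
# number of rows whose Command column starts with "Binlog Dump".
# Splitting on "|" makes the empty text before the first bar field 1,
# so the columns Id, User, Host, db, Command are fields 2 through 6.
count_binlog_dump_threads() {
    awk -F'|' '$6 ~ /^[ \t]*Binlog Dump/ { n++ } END { print n + 0 }'
}
```

Possible usage on the master: mysqladmin processlist | count_binlog_dump_threads
The printed number should equal the number of slaves.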
- master -
# mysql -e 'show master status'
shows the binlog file and position. Other fields should be empty.
- slaves -
# mysql -e 'show slave status\G'
shows approximately 40 lines of output. Especially useful are:
Slave_IO_State: Waiting for master to send event
Master_Host: ...
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Error:
Seconds_Behind_Master: 0
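The fields listed above can be checked mechanically. The following is a minimal sketch, assuming the \G output format shown; check_slave_status and its lag threshold argument (default 60 seconds) are hypothetical and should be adjusted to local expectations.

```shell
# check_slave_status [MAX_LAG]  (hypothetical helper)
# Reads `mysql -e 'show slave status\G'` output on stdin.
# Returns 0 only if both replication threads report Yes, Last_Error is
# empty, and Seconds_Behind_Master is a number no greater than MAX_LAG.
# Each failing field is printed. A lag of NULL counts as a failure.
check_slave_status() {
    max_lag="${1:-60}"   # assumed default threshold, in seconds
    awk -v max_lag="$max_lag" '
        $1 == "Slave_IO_Running:"      { io  = $2 }
        $1 == "Slave_SQL_Running:"     { sql = $2 }
        $1 == "Seconds_Behind_Master:" { lag = $2 }
        $1 == "Last_Error:" {
            err = $0
            sub(/^[ \t]*Last_Error:[ \t]*/, "", err)
        }
        END {
            ok = 1
            if (io != "Yes")  { print "Slave_IO_Running: " io;   ok = 0 }
            if (sql != "Yes") { print "Slave_SQL_Running: " sql; ok = 0 }
            if (err != "")    { print "Last_Error: " err;        ok = 0 }
            if (lag !~ /^[0-9]+$/ || lag + 0 > max_lag + 0) {
                print "Seconds_Behind_Master: " lag
                ok = 0
            }
            exit (ok ? 0 : 1)
        }'
}
```

Possible usage on a slave: mysql -e 'show slave status\G' | check_slave_status 60 && echo "replication OK"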
- master, slaves -
# ls -ld `grep -hr '^[[:blank:]]*log_bin[[:blank:]]*=' /etc/mysql | sed 's/[^=]*=//'`*
lists the binary log files. binlog.index is updated when the log is rolled, typically at least once per day. The most recent file is appended to on every database write, which in our current environment is typically every second.
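The backtick expression in the command above extracts the log_bin setting from the configuration files under /etc/mysql. A sketch of the same extraction as a reusable function follows; binlog_basename is a hypothetical name, and the optional directory argument exists only so the function can be tried against a copy of the configuration. It prints one line per matching log_bin setting, with surrounding whitespace removed.

```shell
# binlog_basename [CONF_DIR]  (hypothetical helper)
# Prints the value of every uncommented "log_bin = ..." line found
# under CONF_DIR (default /etc/mysql), trimmed of blanks.
binlog_basename() {
    conf_dir="${1:-/etc/mysql}"
    grep -hr '^[[:blank:]]*log_bin[[:blank:]]*=' "$conf_dir" |
        sed 's/[^=]*=[[:blank:]]*//; s/[[:blank:]]*$//'
}
```

Possible usage, assuming exactly one log_bin setting is configured: ls -ld "$(binlog_basename)"*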
- master, slaves -
# ls -l /var/log/mysql/
lists the error log files. They usually contain only periodic status info from cron and messages about backups. When mysqld is restarted, they also contain a standard sequence of shutdown and startup messages. There should be no errors and almost no warnings.
# tail /var/log/mysql/error.log
shows the most recent status.
# zgrep -Ehv 'Status:|Vitals:|TopTableRows:|Warning.*REPLACE INTO `percona`.`checksums`' /var/log/mysql/error.log*
shows the log contents with status messages and pt-table-checksum warnings filtered out.
- master -
# pt-table-checksum --defaults-file=/etc/mysql/pt_checksum.cnf --replicate=percona.checksums --ignore-databases mysql --no-check-binlog-format
checks that the master and slaves have the same data. In our current environment this takes 2 hours to run.