Slave Failure

If a slave has failed, we perform a restore-from-backups from the master.

In this example, we use mysql-104 as the example slave to recover, and mysql-102 as the example master.

The backup-machine listed below is linux.cscf.

The ${MYSQL_DATA_DIR} is /data/mysql

0.0 On the slave to recover, stop mysql.

root@mysql-104:~# service mysql stop

0.1 Verify MySQL is actually stopped. The following should return nothing.

root@mysql-104:~# ps -ef | grep '[m]ysql'

1. Clean up default databases/tables created by MySQL package installation.

root@mysql-104:~# rm -rf ${MYSQL_DATA_DIR}; mkdir -p ${MYSQL_DATA_DIR}

2. Create local space for storing backups in /tmp.

root@mysql-104:~# mkdir -p /tmp/mysql-restore
root@mysql-104:~# chmod 700 /tmp/mysql-restore

3. Verify that last backup of the node configuration data is suitable. Simple sanity checks include checking the filesize, creation time etc.

root@backup-machine:/backup/mysql/mysql-104.cs# ls -ld | grep 'etcmysql.tar' | tail -n 1

4. Copy remote slave configuration backup into local directory.

root@backup-machine:/backup/mysql/mysql-104.cs# rsync -a $(ls | grep 'etcmysql.tar' | tail -n 1) root@mysql-104:/tmp/mysql-restore/

5. Verify that last backup of the CURRENT MASTER NODE is the backup you want. Simple sanity checks include checking the filesize, creation time etc.

root@backup-machine:/backup/mysql/mysql-master# ls -ld *-innobackupex | tail -n 1

6. Copy remote master backup into local directory.

root@backup-machine:/backup/mysql/mysql-master# rsync -a $(ls | grep 'innobackupex' | tail -n 1) root@mysql-104:/tmp/mysql-restore/

7. Create temporary holding directory for configuration data.

root@mysql-104:~# mkdir /tmp/mysql-restore/config
root@mysql-104:~# chmod 700 /tmp/mysql-restore/config

8. Unpack slave configuration data.

root@mysql-104:~# tar -xf /tmp/mysql-restore/*-etcmysql.tar -C /tmp/mysql-restore/config/

9. Remove any existing MySQL configuration on the system.

root@mysql-104:~# rm -rf /etc/mysql/{.*,*}

10. Restore slave configuration data.

root@mysql-104:~# mv /tmp/mysql-restore/config/etc/mysql/{.*,*} /etc/mysql/

11. Prepare percona innobackup. This is "destructive" and will modify the backups you've copied over.

root@mysql-104:~# innobackupex --use-memory=2G --apply-log /tmp/mysql-restore/*-innobackupex

Should look more or less as follows:

root@mysql-104:/tmp/mysql-restore/config/etc/mysql# innobackupex --use-memory=2G --apply-log /tmp/mysql-restore/*-innobackupex
160315 19:09:17 innobackupex: Starting the apply-log operation

IMPORTANT: Please check that the apply-log run completes successfully.
           At the end of a successful apply-log run innobackupex
           prints "completed OK!".

innobackupex version 2.3.3 based on MySQL server 5.6.24 Linux (x86_64) (revision id: 525ca7d)
xtrabackup: cd to /tmp/mysql-restore/mysql-102-1458034021-innobackupex
xtrabackup: This target seems to be not prepared yet.
xtrabackup: xtrabackup_logfile detected: size=2097152, start_lsn=(320319246)
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 1
xtrabackup:   innodb_log_file_size = 2097152
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 1
xtrabackup:   innodb_log_file_size = 2097152
xtrabackup: Starting InnoDB instance for recovery.
xtrabackup: Using 2147483648 bytes for buffer pool (set by --use-memory parameter)
InnoDB: Using atomics to ref count buffer pool pages
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Memory barrier is not used
InnoDB: Compressed tables use zlib 1.2.8
InnoDB: Using CPU crc32 instructions
InnoDB: Initializing buffer pool, size = 2.0G
InnoDB: Completed initialization of buffer pool
InnoDB: Highest supported file format is Barracuda.
InnoDB: The log sequence numbers 319143451 and 319143451 in ibdata files do not match the log sequence number 320319246 in the ib_logfiles!
InnoDB: Database was not shutdown normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages
InnoDB: from the doublewrite buffer...
InnoDB: 128 rollback segment(s) are active.
InnoDB: Waiting for purge to start
InnoDB: 5.6.24 started; log sequence number 320319246
xtrabackup: Last MySQL binlog file position 838665, file name /var/log/mysql/binlog.000018

xtrabackup: starting shutdown with innodb_fast_shutdown = 1
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
InnoDB: Shutdown completed; log sequence number 320320828
xtrabackup: using the following InnoDB configuration for recovery:
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 50331648
InnoDB: Using atomics to ref count buffer pool pages
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Memory barrier is not used
InnoDB: Compressed tables use zlib 1.2.8
InnoDB: Using CPU crc32 instructions
InnoDB: Initializing buffer pool, size = 2.0G
InnoDB: Completed initialization of buffer pool
InnoDB: Setting log file ./ib_logfile101 size to 48 MB
InnoDB: Setting log file ./ib_logfile1 size to 48 MB
InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
InnoDB: New log files created, LSN=320320828
InnoDB: Highest supported file format is Barracuda.
InnoDB: 128 rollback segment(s) are active.
InnoDB: Waiting for purge to start
InnoDB: 5.6.24 started; log sequence number 320321036
xtrabackup: starting shutdown with innodb_fast_shutdown = 1
InnoDB: FTS optimize thread exiting.
InnoDB: Starting shutdown...
InnoDB: Shutdown completed; log sequence number 320326703
160315 19:09:21 completed OK!
root@mysql-104:/tmp/mysql-restore/config/etc/mysql#

12. Restore the prepared backup.

root@mysql-104:~# innobackupex --copy-back /tmp/mysql-restore/*-innobackupex

Should start and end as follows:

root@mysql-104:/etc/mysql# innobackupex --copy-back /tmp/mysql-restore/*-innobackupex
160315 19:11:16 innobackupex: Starting the copy-back operation

IMPORTANT: Please check that the copy-back run completes successfully.
           At the end of a successful copy-back run innobackupex
           prints "completed OK!".

innobackupex version 2.3.3 based on MySQL server 5.6.24 Linux (x86_64) (revision id: 525ca7d)
160315 19:11:16 [01] Copying ib_logfile0 to /data/mysql/ib_logfile0
160315 19:11:16 [01]        ...done
160315 19:11:16 [01] Copying ib_logfile1 to /data/mysql/ib_logfile1
160315 19:11:16 [01]        ...done
160315 19:11:16 [01] Copying ibdata1 to /data/mysql/ibdata1
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./blah/db.opt to /data/mysql/blah/db.opt
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./blah/tags.MYD to /data/mysql/blah/tags.MYD
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./blah/tags.MYI to /data/mysql/blah/tags.MYI
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./blah/tags.frm to /data/mysql/blah/tags.frm 
[...]
160315 19:11:17 [01] Copying ./sbtest/db.opt to /data/mysql/sbtest/db.opt
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./sbtest/sbtest1.frm to /data/mysql/sbtest/sbtest1.frm
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./xtrabackup_info to /data/mysql/xtrabackup_info
160315 19:11:17 [01]        ...done
160315 19:11:17 [01] Copying ./xtrabackup_binlog_pos_innodb to /data/mysql/xtrabackup_binlog_pos_innodb
160315 19:11:17 [01]        ...done
160315 19:11:17 completed OK!
root@mysql-104:/etc/mysql# 

13. Fix database file ownership and permissions.

root@mysql-104:~# chown -R mysql:mysql ${MYSQL_DATA_DIR}
root@mysql-104:~# chmod 700 ${MYSQL_DATA_DIR}

14. Remove spurious innodb logfiles.

root@mysql-104:~# rm ${MYSQL_DATA_DIR}/ib_logfile*

15. Restart MySQL service

root@mysql-104:~# service mysql restart

16. Retrieve master log position to start slaving from.

root@mysql-104:~# cat /tmp/mysql-restore/*-innobackupex/xtrabackup_binlog_info

17. Issue CHANGE MASTER command.

root@mysql-104:~# mysql -e 'CHANGE MASTER TO MASTER_HOST="${CURRENT_MASTER_FQDN}",MASTER_USER="mysql-104_slave", MASTER_PASSWORD="the_password", MASTER_LOG_FILE="${LOG_FILE_NAME}", MASTER_LOG_POS=${LOG_FILE_POS;'

18. Begin slaving.

root@mysql-104:~# mysql -e 'SLAVE START;'

19. Verify slaving.

root@mysql-104:~# mysql -e 'SHOW SLAVE STATUS\G;'

...should return: Slave_IO_Running: Yes, Slave_SQL_Running: Yes, Last_Error: (blank)

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: mysql-102.cs.uwaterloo.ca
                  Master_User: mysql-104_slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000024
          Read_Master_Log_Pos: 49711
               Relay_Log_File: mysqld-relay-bin.000009
                Relay_Log_Pos: 49854
        Relay_Master_Log_File: binlog.000024
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 49711
              Relay_Log_Space: 50154
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 102
1 row in set (0.00 sec)

If Seconds_Behind_Master: is not 0, wait a few seconds and re-run the command until it is 0, to ensure that it catches up with the master properly.

Topic revision: r2 - 2016-04-12 - DanielAllen
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2019 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback