Skipping Errors in MySQL Asynchronous Replication
When a replica halts because the SQL thread encounters an error, you can resume replication by skipping the problematic event(s). Two common approaches are available.
Methods to Skip Errors
1) Skip a specific number of events
Use SQL_SLAVE_SKIP_COUNTER to advance the SQL thread past N events.
- Skip one event:
mysql> STOP SLAVE SQL_THREAD;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
mysql> START SLAVE SQL_THREAD;
You may also use STOP SLAVE and START SLAVE, but stopping only the SQL thread is sufficient.
2) Configure the replica to ignore certain error codes
Add slave_skip_errors to the mysqld configuration to ignore all or specific errors and keep replication runing.
# /etc/my.cnf
[mysqld]
# Ignore specific error codes
# slave-skip-errors = 1062,1053,1146
# Or ignore all errors (not recommended)
# slave-skip-errors = all
Restart MySQL to the chenge to take effect.
Example: Reproducing and Skipping a Failure
Environment with master–replica configured:
- Master: 192.168.247.128
- Replica: 192.168.247.130
- MySQL: 5.6.14
- binlog-do-db = mydb
On the master, run the following to provoke an error on the replica:
mysql> USE mysql;
mysql> CREATE TABLE t1 (id INT);
mysql> USE mydb;
mysql> INSERT INTO mysql.t1 SELECT 1;
On the replica, check status:
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.247.128
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000017
Read_Master_Log_Pos: 2341
Relay_Log_File: DBtest1-relay-bin.000011
Relay_Log_Pos: 494
Relay_Master_Log_File: mysql-bin.000017
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1146
Last_Error: Error 'Table 'mysql.t1' doesn't exist' on query. Default database: 'mydb'. Query: 'insert into mysql.t1 select 1'
Exec_Master_Log_Pos: 1919
Seconds_Behind_Master: NULL
...
Because only mydb is included in the binary log (binlog-do-db = mydb), the statement "use mydb; insert into mysql.t1 …" is logged and replicaetd, but the table mysql.t1 is not present on the replica, causing error 1146.
Inspect the master’s binary log to find how many event to skip:
mysql> SHOW BINLOG EVENTS IN 'mysql-bin.000017' FROM 1919\G
*************************** 1. row ***************************
Log_name: mysql-bin.000017
Pos: 1919
Event_type: Query
Server_id: 1
End_log_pos: 1999
Info: BEGIN
*************************** 2. row ***************************
Log_name: mysql-bin.000017
Pos: 1999
Event_type: Query
Server_id: 1
End_log_pos: 2103
Info: use `mydb`; insert into mysql.t1 select 1
*************************** 3. row ***************************
Log_name: mysql-bin.000017
Pos: 2103
Event_type: Xid
Server_id: 1
End_log_pos: 2134
Info: COMMIT /* xid=106 */
*************************** 4. row ***************************
Log_name: mysql-bin.000017
Pos: 2134
Event_type: Query
Server_id: 1
End_log_pos: 2213
Info: BEGIN
*************************** 5. row ***************************
Log_name: mysql-bin.000017
Pos: 2213
Event_type: Query
Server_id: 1
End_log_pos: 2310
Info: use `mydb`; insert into t1 select 9
*************************** 6. row ***************************
Log_name: mysql-bin.000017
Pos: 2310
Event_type: Xid
Server_id: 1
End_log_pos: 2341
Info: COMMIT /* xid=107 */
The failing transaction consists of the INSERT and its COMMIT, so skip two events.
Skip and resume replication:
mysql> STOP SLAVE SQL_THREAD;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 2;
mysql> START SLAVE SQL_THREAD;
Verify that the replica is running:
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.247.128
Master_User: repl
Master_Port: 3306
Master_Log_File: mysql-bin.000017
Read_Master_Log_Pos: 3613
Relay_Master_Log_File: mysql-bin.000017
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 3613
Seconds_Behind_Master: 0
Last_SQL_Errno: 0
Last_SQL_Error:
...