mysql版本5.7,故障原因是主库添加了一个用户,而此时从库也有这个用户,所以导致同步的事务卡在了创建用户的地方,报错信息类似这样:
[root@mysql.sock][(none)]>show slave status\G;
Last_IO_Error:
Last_SQL_Errno: 1396
Last_SQL_Error: Error 'Operation CREATE USER failed for 'opuser'@'10.10.10.181'' on query. Default database: ''. Query: 'CREATE USER 'opuser'@'10.10.10.181' IDENTIFIED WITH 'mysql_native_password' AS '*A7F30BB92681B6D632E371A2BC8E92FA1F6CDA4C''
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2403
Master_UUID: a6f2c069-1f76-11ef-8547-063e77894ca5
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 240601 03:08:34
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a6f2c069-1f76-11ef-8547-063e77894ca5:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-141888
Executed_Gtid_Set: 2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-4
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
解决办法是可以跳过所有1396错误代码的错误,或者是跳过自定的gtid,这里就跳过gtid为例,跳过gtid的方法是将gtid_purged这个值设置为需要跳过的gtid,但是gtid_purged只有为空的情况下才能设置,所以要先将gtid_purged设置为空,操作之前需要将我们要跳过的gtid全部记录下来
1、记录master上需要跳过的gtid
在master上执行
[root@mysql.sock][(none)]>show global variables like '%gtid%';
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
| binlog_gtid_simple_recovery | ON |
| enforce_gtid_consistency | ON |
| gtid_executed | a6f2c069-1f76-11ef-8547-063e77894ca5:1-2,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-143032 |
| gtid_executed_compression_period | 1000 |
| gtid_mode | ON |
| gtid_owned | |
| gtid_purged | eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2 |
| session_track_gtids | OFF |
+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------+
记录下gtid_executed和gtid_purged中的gtid值
2、记录slave上需要跳过的gtid
在slave上执行
[root@mysql.sock][(none)]>show global variables like '%gtid%';
+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------+
| binlog_gtid_simple_recovery | ON |
| enforce_gtid_consistency | ON |
| gtid_executed | 2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-4 |
| gtid_executed_compression_period | 1000 |
| gtid_mode | ON |
| gtid_owned | |
| gtid_purged | 2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-4 |
| session_track_gtids | OFF |
+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------+
同样记录下gtid_executed和gtid_purged中的gtid值
到此我们收集到的gtid值如下:
# master
a6f2c069-1f76-11ef-8547-063e77894ca5:1-2,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-143032,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2
# slave
2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-4,
2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-4
将master和slave合并起来并去重之后得到这些
a6f2c069-1f76-11ef-8547-063e77894ca5:1-2,
eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2,
fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-143032,
2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
# fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-4这个包含在了fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-143032里面,所以只保留一个即可
3、跳过指定gtid
接下来全部是在slave上操作
reset master;可以将gtid_purged的值清空
[root@mysql.sock][(none)]>reset master;
Query OK, 0 rows affected (0.01 sec)
[root@mysql.sock][(none)]>
[root@mysql.sock][(none)]>show global variables like '%gtid%';
+----------------------------------+-------+
| Variable_name | Value |
+----------------------------------+-------+
| binlog_gtid_simple_recovery | ON |
| enforce_gtid_consistency | ON |
| gtid_executed | |
| gtid_executed_compression_period | 1000 |
| gtid_mode | ON |
| gtid_owned | |
| gtid_purged | |
| session_track_gtids | OFF |
+----------------------------------+-------+
8 rows in set (0.00 sec)
[root@mysql.sock][(none)]>
设置gtid_purged需要跳过的gtid
[root@mysql.sock][(none)]>stop slave;
Query OK, 0 rows affected (0.01 sec)
[root@mysql.sock][(none)]>reset slave all;
Query OK, 0 rows affected (0.02 sec)
[root@mysql.sock][(none)]>set global gtid_purged='a6f2c069-1f76-11ef-8547-063e77894ca5:1-2,
'> fe8c6f3a-1bf2-11ef-9150-06aa72156277:1-143032,
'> 2ccb3148-1bf3-11ef-b50c-0a1976ca006f:1-889,
'> eca2ddf0-6097-11eb-8bee-0aed26cb74ee:1-2‘;
Query OK, 0 rows affected (0.00 sec)
[root@mysql.sock][(none)]>
[root@mysql.sock][(none)]>CHANGE MASTER TO MASTER_HOST='10.10.10.180',master_port=3306,MASTER_USER='repl',MASTER_PASSWORD='XXXXXXXXXXXXXXXX', master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
[root@mysql.sock][(none)]>start slave;
Query OK, 0 rows affected (0.01 sec)
[root@mysql.sock][(none)]>show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.10.10.180
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000007
Read_Master_Log_Pos: 33760917
Relay_Log_File: relay.000002
Relay_Log_Pos: 43785
Relay_Master_Log_File: binlog.000007
Slave_IO_Running: Yes
Slave_SQL_Running: Yes