Home > Tech > Content

Common Causes of Data Discrepancies in MySQL Replication

Tech May 18 2

MySQL replication relies on logical SQL statement replication. During this process, the executed SQL statements are validated, but the overall structure (such as tables and databases) is not checked. In current high availability architectures, the load and performance of a single node can affect the speed of Relay Log replay. If this exceeds capacity, delays are inevitable.

In such cases, during master-slave switching, data discrepancies may occur, potentially leading to severe data inconsistencies. For core enterprise systems, data inconsistency is a critical issue.

Causes of Data Discrepancies

Data inconsistency in master-slave replication can arise from various factors. Based on known cases, the following are common causes:

1. Write Operations on the Slave

This can happen in multiple scenarios. For example, if the VIP used by the front-end application experiences network heartbeat issues, it might switch between the master and slave nodes, causing writes on both sides—commonly referred to as a split-brain scenario.
In dual-master setups, both nodes may perform write operations for load balancing or business requirements. However, if the replication process is interrupted or delayed, both nodes may write the same data, resulting in replication failure and increasing the risk of data inconsistency.
Manual operations on the slave node, such as accidental modifications, can also cause discrepancies.

2. Non-Row Format Binary Logs

Binary logs can be in statement or mixed format, which may lead to unintended changes in data due to SQL statements being applied differently.

3. Stored Procedures or Triggers

Stored procedures or triggers may produce different results due to their local execution characteristics, especially with time-based or auto-increment fields.

auto increment, datetime, timestamp

4. Mismatched sql_mode Settings

Differences in sql_mode settings can lead to varying data handling. For example, the STRICT_TRANS_TABLES mode enforces strict validation, and mismatched settings can result in data truncation and inconsistency.

mysql> CREATE TABLE `t1` (
       `id` int NOT NULL,
       `name` varchar(10) ,
       PRIMARY KEY (`id`)
     ) ENGINE=InnoDB ;
Query OK, 0 rows affected (0.01 sec)

mysql> show variables like '%sql_mode%';
+---------------+-----------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                                                 |
+---------------+-----------------------------------------------------------------------------------------------------------------------+
| sql_mode      | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION |
+---------------+-----------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> set sql_mode="";
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1(id,name) values(1,"hello beijing nice to meet you");
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from t1;
+----+------------+
| id | name       |
+----+------------+
|  1 | hello beij |
+----+------------+
1 row in set (0.00 sec)

5. Skipping Replication Errors

Using options like slave_skip_errors, sql_slave_skip_counter, or gtid_next can skip transactions, wich means they are not applied on the slave, potentially leading to data loss.

sql_slave_skip_counter: Skips N events after the current one in position-based replication, useful for recovering from errors caused by specific SQL statements.
gtid_next: Manually skips transactions when there are GTID conflicts, ensuring precise GTID information and transaction numbers.
slave_skip_errors: Skips errors listed in the configuration, allowing replication to continue but risking data inconsistency.

Example configuration:

[mysqld]
slave_skip_errors=1062,1053,1032

6. Double 1 Configuration

The sync_binlog and innodb_flush_log_at_trx_commit parameters control how logs are written to disk. If these are not set to 1, data inconsistency can occur in case of crashes.

7. Semi-Synchronous Replication

Improper configuration of semi-synchronous replication can introduce data inconsistency risks. The paramter rpl_semi_sync_master_wait_point determines whether the master waits for an acknowledgment after committing or before committing.

8. Replication Filtering Rules

Replication filtering rules such as ignore/do/rewrite can cause issues when cross-schema operations are performed.

9. binlog_row_image Settings

The binlog_row_image parameter defines what data is recorded in the binary log. It can be set to FULL, MINIMAL, or NOBLOB.

Value	Description
full	Records all before and after images regardless of primary key constraints.
minimal	Retains only primary key columns in the before image and modified columns in the after image if a primary key exists. Otherwise, retains all before image and modified columns in the after image.
noblob	Excludes text/blob columns from before and after images if a primary key exists. Otherwise, retains all before and after images.

10. sql_log_bin Parameter

The sql_log_bin parameter controls whether changes are logged in the binary log. If disabled, changes won't be replicated to the slave.

mysql> SET @@SESSION.SQL_LOG_BIN= 0;
Query OK, 0 rows affected (0.00 sec)
mysql> update t1 set age=10 where id =4;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> SET @@SESSION.SQL_LOG_BIN= 1;

11. Schema Inconsistencies

Discrepancies in table structures, such as differences in primary keys, unique indexes, or column types, can lead to unexpected issues.

12. Foreign Key Constraints

If foreign_key_checks is disabled on the slave, foreign key validation may not occur during SQL playback.

13. Backup Consistency Loss

Inconsistent backups, such as those taken without using single-transaction or master-data options, can result in replication issues.

14. Version Mismatches

Incompatibilities between the master and slave versions can lead to functional mismatches, especially when the master uses features unsupported by the slave.

15. Internal Bugs

MySQL bugs can also cause replication inconsistencies. For example, even if the slave processes are runing, replication may appear stuck.

Mitigating Data Discrepancies

Despite widespread use of MySQL, data inconsistencies still occur. These can stem from individual conditions or combinations of multiple factors. Timely detection and resolution are essential.

Preventing Inconsistencies

- Avoid multi-node writes; ensure business logic prevents conflicts.
- Avoid using `slave_skip_errors` unless absolutely necessary.
- Use double 1 settings for data safety.
- Set the slave to read-only mode.
- Enable semi-synchronous replication with `after_sync` mode.
- Use row-based binary logs and `binlog_row_image` set to `FULL`.
- Be cautious when promoting a slave to master during replication delays.
- Avoid cross-schema operations in replication filters.
- Limit the use of stored procedures and triggers that depend on special fields.
- Enforce strict `sql_mode` settings.
- Ensure consistent table structures between master and slave.
- Guarantee consistency during backup and recovery.

Detecting Inconsistencies

Regular monitoring and periodic consistency checks are crucial for identifying discrepancies early.

Fixing Data Discrepancies

1. Rebuild the Slave

Rebuilding the slave is slow and may impact query performance. However, it's often the most reliable solution when inconsistencies occur.

2. Use Percona Toolkit

Tools like pt-table-checksum and pt-table-sync help detect and resolve inconsistencies quickly without interrupting replication.

3. Manual Fixes

Manual repairs require temporarily stopping replication and addressing inconsistencies based on error messages.

Other than these, regular backups are essential. If all other methods fail, restoring from a backup is the final option.

Tags: MySQL replication

Back to List

Prev: Implementing Spring HTTP Invoker for Remote Service Communication

Next: Java Thread Creation Methods

Fading Coder

Common Causes of Data Discrepancies in MySQL Replication

Causes of Data Discrepancies

1. Write Operations on the Slave

2. Non-Row Format Binary Logs

3. Stored Procedures or Triggers

4. Mismatched sql_mode Settings

5. Skipping Replication Errors

6. Double 1 Configuration

7. Semi-Synchronous Replication

8. Replication Filtering Rules

9. binlog_row_image Settings

10. sql_log_bin Parameter

11. Schema Inconsistencies

12. Foreign Key Constraints

13. Backup Consistency Loss

14. Version Mismatches

15. Internal Bugs

Mitigating Data Discrepancies

Preventing Inconsistencies

Detecting Inconsistencies

Fixing Data Discrepancies

1. Rebuild the Slave

2. Use Percona Toolkit

3. Manual Fixes

Related Articles

Understanding Strong and Weak References in Java

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Implement Image Upload Functionality for Django Integrated TinyMCE Editor