MySQL Data Migration Techniques in Version 8.0.25
This article focuses on logical data migration approaches in MySQL 8.0.25, particularly for small to medium-sized datasets (under 100MB).
Recently, I installed MySQL 8.0.25 64-bit on Windows.
It's evident that Oracle hasn't invested as much in MySQL as they have in their flagship Oracle database, which has nearly 40 years of development. MySQL has only seen significant growth in the past decade.
Oracle offers excellent tools like impdp and expdp for data migration. While MySQL doesn't have equivalent tools yet, mysqldump has become increasingly sophisticated and might someday match Oracle's offerings.
Using mysqldump with mysqlimport
Previously, MySQL was criticized for its limitations in logical data migration, but improvements have been made. For instance, combining mysqldump with the source command or LOAD DATA can accelerate the migration/replication process.
The -e or --extended-insert option is particularly useful:
-e, --extended-insert
Use multiple-row INSERT syntax that include several
VALUES lists.
(Defaults to on; use --skip-extended-insert to disable.)
In MySQL 8.0.25, this option is enabled by default (though the exact version when this became the default is uncertain).
This option significantly speeds up data import operations.
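The multi-row batching that --extended-insert performs can be illustrated with a small sketch. This is not mysqldump's actual implementation, just a minimal illustration of the idea, with a hypothetical table name and toy data:

```python
def build_extended_inserts(table, rows, rows_per_insert=3):
    """Group rows into multi-row INSERT statements, as --extended-insert does.

    With batching, N rows need only N / rows_per_insert statements instead
    of N single-row INSERTs, which is where the import speedup comes from.
    """
    statements = []
    for i in range(0, len(rows), rows_per_insert):
        chunk = rows[i:i + rows_per_insert]
        values = ", ".join(
            "(" + ", ".join(repr(v) for v in row) + ")" for row in chunk
        )
        statements.append(f"INSERT INTO {table} VALUES {values};")
    return statements

# Four rows collapse into two statements instead of four.
for stmt in build_extended_inserts("t", [(1, "a"), (2, "b"), (3, "c"), (4, "d")]):
    print(stmt)
```

In real dumps the batch size is bounded by net_buffer_length rather than a row count, but the principle is the same: fewer statements means less parsing and round-trip overhead on import.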
However, this isn't the fastest method available. Similar to Oracle's sqlldr, MySQL offers the LOAD DATA command or mysqlimport for faster imports.
The following mysqldump options allow exporting data to text format:
--fields-terminated-by=name
Fields in the output file are terminated by the given
string.
--fields-enclosed-by=name
Fields in the output file are enclosed by the given
character.
--fields-optionally-enclosed-by=name
Fields in the output file are optionally enclosed by the
given character.
--fields-escaped-by=name
Fields in the output file are escaped by the given
character.
To import the exported text files, refer to: https://blog.csdn.net/u012815136/article/details/88953289
Or consult the official documentation. Here's the LOAD DATA command syntax:
LOAD DATA
[LOW_PRIORITY | CONCURRENT] [LOCAL]
INFILE 'file_name'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[PARTITION (partition_name [, partition_name] ...)]
[CHARACTER SET charset_name]
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
[IGNORE number {LINES | ROWS}]
[(col_name_or_user_var
[, col_name_or_user_var] ...)]
[SET col_name={expr | DEFAULT}
[, col_name={expr | DEFAULT}] ...]
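To make the syntax concrete, here is a sketch that assembles a LOAD DATA statement matching the delimiter options used in this article's export example (| field delimiter, ` enclosure, CRLF line endings). The table and path are illustrative:

```python
def build_load_data(table, path):
    """Build a LOAD DATA statement whose FIELDS/LINES clauses mirror the
    mysqldump --tab flags used later in this article."""
    return (
        f"LOAD DATA INFILE '{path}' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY '|' ENCLOSED BY '`' "
        "LINES TERMINATED BY '\\r\\n'"
    )

print(build_load_data("table_keywordvalue", "d:/temp/mysql/dump/table_keywordvalue.txt"))
```

The FIELDS and LINES clauses must match the options used at export time exactly, otherwise rows will be split incorrectly on import.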
For bulk migration of multiple tables, batch scripting is necessary. If you're familiar with Python, you can write a reusable script; you only need to install the Python environment, much as Java development requires a JDK.
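A reusable Python script for this could look like the sketch below: build one mysqlimport invocation per exported text file, reusing the delimiter flags from this article's export example. The file paths, database name, and connection details are illustrative:

```python
def mysqlimport_commands(paths, database, host="localhost", port=7799, user="root"):
    """Build one mysqlimport command (as an argument list) per exported
    .txt file. The delimiter flags mirror the mysqldump --tab export used
    in this article; connection details are placeholders."""
    commands = []
    for path in paths:
        commands.append([
            "mysqlimport", "-h", host, "-P", str(port), "-u", user, "-p",
            "--fields-terminated-by=|",
            "--fields-enclosed-by=`",
            "--lines-terminated-by=0x0d0a",
            database, path,
        ])
    return commands

# Example: two exported tables from the dump directory (hypothetical paths).
cmds = mysqlimport_commands(
    [r"d:\temp\mysql\dump\table_keywordvalue.txt",
     r"d:\temp\mysql\dump\another_table.txt"],
    "mysqldata")
# To actually run them: for cmd in cmds: subprocess.run(cmd, check=True)
```

mysqlimport derives the target table name from each file's base name, so one loop like this covers an entire directory of --tab exports.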
mysqldump and mysqlimport Practice
Exporting and importing text files:
Export the rap10.table_keywordvalue table, with | as column delimiter, ` as field enclosure, and newline as row separator:
mysqldump -h localhost -u root -p -P 7799 --databases rap10 --tables table_keywordvalue --tab=d:\temp\mysql\dump --fields-terminated-by="|" --fields-enclosed-by="`" --lines-terminated-by=0x0d0a
This generates two files in d:\temp\mysql\dump. The SQL portion contains:
DROP TABLE IF EXISTS `table_keywordvalue`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `table_keywordvalue` (
`id` int NOT NULL AUTO_INCREMENT,
`custom_id` bigint unsigned NOT NULL,
`table_id` int NOT NULL COMMENT 'Custom table ID',
`db_table_name` varchar(100) NOT NULL COMMENT 'Custom table database name in cxxx_*** format - e.g., c101_students',
`user_id` int NOT NULL COMMENT 'User ID who added this record',
`last_optime` varchar(19) DEFAULT NULL COMMENT 'Last modification time - can be creation or modification time',
`keywordvalue` varchar(900) DEFAULT NULL COMMENT 'Keyword value',
UNIQUE KEY `idx_table_keywordvalue_id` (`id`),
KEY `idx_custom_id` (`custom_id`),
KEY `idx_tablekeyvalue_tableid` (`table_id`)
) ENGINE=InnoDB AUTO_INCREMENT=55830 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='Auto form keywords for accelerating pending task queries';
The text content looks like:
`241`|`264`|`117`|`C117_scores`|`1`|`2019/12/27 15:28:20`|`adfaf`
`243`|`267`|`117`|`C117_scores`|`1`|`2019/12/27 16:01:52`|`test`
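As a side note, this delimited format can be read directly with Python's standard csv module by treating the backtick enclosure as a quote character. A minimal sketch using one of the sample rows above:

```python
import csv
import io

# One line of the exported text file: | delimiter, ` enclosure, CRLF ending.
sample = '`241`|`264`|`117`|`C117_scores`|`1`|`2019/12/27 15:28:20`|`adfaf`\r\n'

# csv.reader strips the enclosure and splits on the delimiter for us.
reader = csv.reader(io.StringIO(sample), delimiter="|", quotechar="`")
row = next(reader)
print(row)
# ['241', '264', '117', 'C117_scores', '1', '2019/12/27 15:28:20', 'adfaf']
```

This is handy for spot-checking an export file before running a large import.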
Create a batch script (my.bat) to test:
@echo off
for /f "tokens=1-3 delims=-/ " %%1 in ("%date%") do set ddd=%%1/%%2/%%3
for /f "tokens=1-4 delims=.: " %%1 in ("%time%") do set tttt=%%1:%%2:%%3
Set DT=%ddd% %tttt%
echo Start time: %DT%
mysqlimport -h localhost -u"root" -p"123" -P 7799 --fields-terminated-by="|" --fields-enclosed-by="`" --lines-terminated-by=0x0d0a mysqldata d:\temp\mysql\dump\table_keywordvalue.txt
mysql -h localhost -u"root" -p"123" -P 7799 <check.sql
for /f "tokens=1-3 delims=-/ " %%1 in ("%date%") do set ddd=%%1/%%2/%%3
for /f "tokens=1-4 delims=.: " %%1 in ("%time%") do set tttt=%%1:%%2:%%3
Set DT=%ddd% %tttt%
echo End time: %DT%
@echo on
The check.sql content:
use mysqldata;
select count(*) from table_keywordvalue;
exit
Note: "exit" must be followed by a newline.
After executing my.bat in the d:\temp\mysql\dump directory, the results are:
D:\temp\mysql\dump>my.bat
Start time: 2021/09/15 22:26:40
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqldata.table_keywordvalue: Records: 22782 Deleted: 0 Skipped: 0 Warnings: 0
mysql: [Warning] Using a password on the command line interface can be insecure.
count(*)
22782
End time: 2021/09/15 22:26:43
The total execution time is approximately 3-4 seconds, significantly faster than executing 22,782 individual SQL statements!
mysqlimport has many parameters, and understanding them all would require a small book.
While mysqldump with mysqlimport works well for one or two large tables, the process can be cumbersome. MySQL has introduced a tool similar to Oracle's expdp/impdp: mysqlpump.
Compared to the dump+import combination, the pump tool is more convenient. While mysqlimport requires scripts for multiple files, these scripts are easier to write on Linux.
mysqlpump
This tool is similar to mysqldump but offers improvements:
- Parallel processing
- Better selection of objects to export: databases, tables, accounts, procedures, views
- When exporting accounts, it generates account management statements rather than INSERT statements into system tables
- Output compression
- Progress indicators (estimated)
- Allows loading data before creating secondary indexes to speed up the data loading process
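The last point, deferring secondary indexes, can be sketched as follows. This is only an illustration of the idea behind --defer-table-indexes, not mysqlpump's implementation; the table and DDL fragments are taken from this article's example schema:

```python
def defer_secondary_indexes(table, columns, indexes):
    """Sketch of the deferred-index idea: create the table without its
    secondary indexes, load the data, then build each index in one pass
    over the fully loaded table (cheaper than maintaining indexes row by
    row during the load). `columns` and `indexes` are plain DDL fragments."""
    create = f"CREATE TABLE {table} (\n  " + ",\n  ".join(columns) + "\n);"
    alters = [f"ALTER TABLE {table} ADD {idx};" for idx in indexes]
    return create, alters

create, alters = defer_secondary_indexes(
    "table_keywordvalue",
    ["`id` int NOT NULL AUTO_INCREMENT", "`custom_id` bigint unsigned NOT NULL"],
    ["KEY `idx_custom_id` (`custom_id`)"])
# Execution order: run `create`, then the INSERT/LOAD statements,
# then each statement in `alters`.
```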
Here's an export example:
mysqlpump -h localhost -u"root" -p"123" -P 7799 rap10 table_keywordvalue --extended-insert=103 --compress --default-parallelism=5 --no-create-db --no-create-info --defer-table-indexes --result-file=d:\temp\mysql\pump\p.sql
The export process is fast, taking about 2-3 seconds.
The output resembles mysqldump's, though a limitation is that views cannot be excluded.
-- Dump created by MySQL pump utility, version: 8.0.25, Win64 (x86_64)
-- Dump start time: Thu Sep 16 23:05:53 2021
-- Server version: 8.0.25
SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE;
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET @@SESSION.SQL_LOG_BIN= 0;
SET @OLD_TIME_ZONE=@@TIME_ZONE;
SET TIME_ZONE='+00:00';
SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT;
SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS;
SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION;
SET NAMES utf8mb4;
INSERT INTO xxxxxxxxxxx
....
...
-- View creation statements
..... (omitted for brevity)
SET TIME_ZONE=@OLD_TIME_ZONE;
SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT;
SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS;
SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
SET SQL_MODE=@OLD_SQL_MODE;
Several parameters can enhance performance, depending on your environment:
- compress
- extended-insert
- default-parallelism
An example for importing data will be provided later.
Comparing these tools, MySQL still has some catching up to do with Oracle. Oracle offers nearly perfect logical backup and recovery tools like expdp and impdp, while MySQL (at least with its built-in tools) hasn't reached that level yet.