
MySQL Data Migration Techniques in Version 8.0.25

Tech · May 16

This article focuses on logical data migration approaches in MySQL 8.0.25, particularly for small to medium-sized datasets (under 100MB).

Recently, I installed MySQL 8.0.25 64-bit on Windows.

It's evident that Oracle hasn't invested as much in MySQL as they have in their flagship Oracle database, which has nearly 40 years of development. MySQL has only seen significant growth in the past decade.

Oracle offers excellent tools like impdp and expdp for data migration. While MySQL doesn't have equivalent tools yet, mysqldump has become increasingly sophisticated and might someday match Oracle's offerings.

Using mysqldump with mysqlimport

Previously, MySQL was criticized for its limitations in logical data migration, but improvements have been made. For instance, pairing mysqldump with the client's source command or with LOAD DATA can accelerate the migration/replication process.
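
A minimal sketch of that workflow, assuming a reachable server; the paths, port, and credentials are hypothetical:

```
mysqldump -h localhost -u root -p -P 7799 --databases rap10 > d:\temp\mysql\rap10.sql

-- then replay it inside a mysql client session on the target server:
mysql> source d:\temp\mysql\rap10.sql
```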

The -e or --extended-insert option is particularly useful:

-e, --extended-insert
  Use multiple-row INSERT syntax that include several
  VALUES lists.
  (Defaults to on; use --skip-extended-insert to disable.)

In MySQL 8.0.25, this option is enabled by default (though the exact version when this became the default is uncertain).

This option significantly speeds up data import operations.
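
To see why, here is a small Python sketch (illustrative only, not MySQL code) contrasting one-statement-per-row output with the multi-row VALUES form that --extended-insert produces; the server parses and commits far fewer statements in the second case:

```python
def single_row_inserts(table, rows):
    """One INSERT statement per row (--skip-extended-insert style)."""
    return [f"INSERT INTO {table} VALUES ({', '.join(map(str, r))});" for r in rows]

def extended_insert(table, rows):
    """Multi-row VALUES list (--extended-insert style): one statement to parse."""
    values = ", ".join(f"({', '.join(map(str, r))})" for r in rows)
    return f"INSERT INTO {table} VALUES {values};"

rows = [(1, 10), (2, 20), (3, 30)]
print(len(single_row_inserts("t", rows)))  # → 3 (three statements)
print(extended_insert("t", rows))
# → INSERT INTO t VALUES (1, 10), (2, 20), (3, 30);
```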

However, this isn't the fastest method available. Similar to Oracle's sqlldr, MySQL offers the LOAD DATA command or mysqlimport for faster imports.

The following mysqldump options allow exporting data to text format:

--fields-terminated-by=name
  Fields in the output file are terminated by the given
  string.

--fields-enclosed-by=name
  Fields in the output file are enclosed by the given
  character.

--fields-optionally-enclosed-by=name
  Fields in the output file are optionally enclosed by the
  given character.

--fields-escaped-by=name
  Fields in the output file are escaped by the given
  character.

To import the exported text files, refer to: https://blog.csdn.net/u012815136/article/details/88953289

Or consult the official documentation.

Here's the LOAD DATA command syntax:

LOAD DATA
  [LOW_PRIORITY | CONCURRENT] [LOCAL]
  INFILE 'file_name'
  [REPLACE | IGNORE]
  INTO TABLE tbl_name
  [PARTITION (partition_name [, partition_name] ...)]
  [CHARACTER SET charset_name]
  [{FIELDS | COLUMNS}
    [TERMINATED BY 'string']
    [[OPTIONALLY] ENCLOSED BY 'char']
    [ESCAPED BY 'char']
  ]
  [LINES
    [STARTING BY 'string']
    [TERMINATED BY 'string']
  ]
  [IGNORE number {LINES | ROWS}]
  [(col_name_or_user_var
    [, col_name_or_user_var] ...)]
  [SET col_name={expr | DEFAULT}
    [, col_name={expr | DEFAULT}] ...]

For bulk migration of multiple tables, some batch scripting is necessary. If you're familiar with Python, you can write a reusable script; it only requires installing a Python runtime, much as Java development requires a JDK.
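
As a sketch of such a script (the helper name and connection details are hypothetical, and the delimiter flags mirror the mysqlimport examples below), one could generate a mysqlimport command per exported .txt file:

```python
from pathlib import Path

def build_commands(host, port, db, dump_dir):
    """Generate one mysqlimport invocation per exported .txt file.

    This helper is illustrative, not part of MySQL; adjust the delimiter
    flags to match whatever mysqldump --tab options produced the files.
    """
    cmds = []
    for txt in sorted(Path(dump_dir).glob("*.txt")):
        cmds.append(
            f'mysqlimport -h {host} -P {port} '
            f'--fields-terminated-by="|" --fields-enclosed-by="`" '
            f'--lines-terminated-by=0x0d0a {db} {txt}'
        )
    return cmds
```

Each returned string can then be run via a shell or subprocess call, one per table.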

mysqldump and mysqlimport Practice

Exporting and importing text files:

Export the rap10.table_keywordvalue table, with | as the column delimiter, ` as the field enclosure, and CRLF (0x0d0a) as the row separator:

mysqldump -h localhost -u root -p -P 7799 --databases rap10  --tables table_keywordvalue  --tab=d:\temp\mysql\dump --fields-terminated-by="|" --fields-enclosed-by="`" --lines-terminated-by=0x0d0a

This generates two files in d:\temp\mysql\dump: a .sql file with the table definition and a .txt file with the data. The SQL portion contains:

DROP TABLE IF EXISTS `table_keywordvalue`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `table_keywordvalue` (
  `id` int NOT NULL AUTO_INCREMENT,
  `custom_id` bigint unsigned NOT NULL,
  `table_id` int NOT NULL COMMENT 'Custom table ID',
  `db_table_name` varchar(100) NOT NULL COMMENT 'Custom table database name in cxxx_*** format - e.g., c101_students',
  `user_id` int NOT NULL COMMENT 'User ID who added this record',
  `last_optime` varchar(19) DEFAULT NULL COMMENT 'Last modification time - can be creation or modification time',
  `keywordvalue` varchar(900) DEFAULT NULL COMMENT 'Keyword value',
  UNIQUE KEY `idx_table_keywordvalue_id` (`id`),
  KEY `idx_custom_id` (`custom_id`),
  KEY `idx_tablekeyvalue_tableid` (`table_id`)
) ENGINE=InnoDB AUTO_INCREMENT=55830 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='Auto form keywords for accelerating pending task queries';

The text content looks like:

`241`|`264`|`117`|`C117_scores`|`1`|`2019/12/27 15:28:20`|`adfaf`
`243`|`267`|`117`|`C117_scores`|`1`|`2019/12/27 16:01:52`|`test`
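
To illustrate the format, a few lines of Python can split such a row back into columns. This is a naive sketch: it assumes the delimiter never appears inside a field and ignores the ESCAPED BY handling that LOAD DATA performs.

```python
def parse_row(line, sep="|", enc="`"):
    # Naive parser: split on the delimiter, then strip the enclosure character.
    # Real LOAD DATA also honors ESCAPED BY; this sketch ignores escaping.
    return [field.strip(enc) for field in line.rstrip("\r\n").split(sep)]

print(parse_row("`241`|`264`|`117`|`C117_scores`|`1`|`2019/12/27 15:28:20`|`adfaf`"))
# → ['241', '264', '117', 'C117_scores', '1', '2019/12/27 15:28:20', 'adfaf']
```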

Create a batch script (my.bat) to test:

@echo off
for /f "tokens=1-3 delims=-/ " %%1 in ("%date%") do set ddd=%%1/%%2/%%3
for /f "tokens=1-4 delims=.: " %%1 in ("%time%") do set tttt=%%1:%%2:%%3
Set DT=%ddd% %tttt%
echo Start time: %DT%
mysqlimport -h localhost -u"root" -p"123" -P 7799   --fields-terminated-by="|" --fields-enclosed-by="`" --lines-terminated-by=0x0d0a  mysqldata d:\temp\mysql\dump\table_keywordvalue.txt
mysql -h localhost -u"root" -p"123" -P 7799 <check.sql
for /f "tokens=1-3 delims=-/ " %%1 in ("%date%") do set ddd=%%1/%%2/%%3
for /f "tokens=1-4 delims=.: " %%1 in ("%time%") do set tttt=%%1:%%2:%%3
Set DT=%ddd% %tttt%
echo End time: %DT%
@echo on

The check.sql content:

use mysqldata;
select count(*) from table_keywordvalue;
exit

Note: "exit" must be followed by a newline.

After executing my.bat in the d:\temp\mysql\dump directory, the results are:

D:\temp\mysql\dump>my.bat
Start time: 2021/09/15 22:26:40
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqldata.table_keywordvalue: Records: 22782 Deleted: 0 Skipped: 0 Warnings: 0
mysql: [Warning] Using a password on the command line interface can be insecure.
count(*)
22782
End time: 2021/09/15 22:26:43

The total execution time is approximately 3-4 seconds, significantly faster than executing 22,782 individual SQL statements!

mysqlimport has many parameters, and understanding them all would require a small book.


While mysqldump with mysqlimport works well for one or two large tables, the process can be cumbersome. MySQL has introduced a tool similar to Oracle's expdp/impdp: mysqlpump.

Compared to the dump+import combination, the pump tool is more convenient. While mysqlimport requires scripts for multiple files, these scripts are easier to write on Linux.


mysqlpump

This tool is similar to mysqldump but offers improvements:

  1. Parallel processing
  2. Better selection of objects to export: databases, tables, accounts, procedures, views
  3. When exporting accounts, it generates account management statements rather than INSERT statements into system tables
  4. Output compression
  5. Progress indicators (estimated)
  6. Allows loading data before creating secondary indexes to speed up the data loading process

Here's an export example:

mysqlpump -h localhost -u"root" -p"123" -P 7799 rap10 table_keywordvalue --extended-insert=103 --compress --default-parallelism=5  --no-create-db --no-create-info --defer-table-indexes  --result-file=d:\temp\mysql\pump\p.sql

The export process is fast, taking about 2-3 seconds.

The output resembles mysqldump's, though a limitation is that views cannot be excluded.

-- Dump created by MySQL pump utility, version: 8.0.25, Win64 (x86_64)
-- Dump start time: Thu Sep 16 23:05:53 2021
-- Server version: 8.0.25

SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE;
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
SET @@SESSION.SQL_LOG_BIN= 0;
SET @OLD_TIME_ZONE=@@TIME_ZONE;
SET TIME_ZONE='+00:00';
SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT;
SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS;
SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION;
SET NAMES utf8mb4;
INSERT INTO  xxxxxxxxxxx
....
...
-- View creation statements
..... (omitted for brevity)
SET TIME_ZONE=@OLD_TIME_ZONE;
SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT;
SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS;
SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
SET SQL_MODE=@OLD_SQL_MODE;

Several parameters can enhance performance, depending on your environment:

  • compress
  • extended-insert
  • default-parallelism

An example for importing data will be provided later.
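
In the meantime, since the pump output here is a plain SQL file, a minimal sketch of the import (reusing the same hypothetical credentials and path as the export above) is simply to replay it with the mysql client:

```
mysql -h localhost -u"root" -p"123" -P 7799 rap10 < d:\temp\mysql\pump\p.sql
```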

Comparing these tools, MySQL still has some catching up to do with Oracle. Oracle offers nearly perfect logical backup and recovery tools like expdp and impdp, while MySQL (at least with its built-in tools) hasn't reached that level yet.
