OceanBase学习之---MySQL迁移数据到OceanBase

发布时间 2023-09-15 11:05:34作者: 天涯客1224

案例说明:
通过mysqldump及datax,从MySQL迁移数据到OceanBase数据库。

一、数据库环境
1、MySQL环境

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.23    |
+-----------+
1 row in set (0.00 sec)

mysql> show variables like '%chara%';
+--------------------------+----------------------------------+
| Variable_name            | Value                            |
+--------------------------+----------------------------------+
| character_set_client     | utf8                             |
| character_set_connection | utf8                             |
| character_set_database   | utf8                             |
| character_set_filesystem | binary                           |
| character_set_results    | utf8                             |
| character_set_server     | utf8                             |
| character_set_system     | utf8                             |
| character_sets_dir       | /usr/local/mysql/share/charsets/ |
+--------------------------+----------------------------------+
8 rows in set (0.01 sec)

2、OceanBase环境

[admin@node203 obproxy-4.2.0.0]$ obclient -h127.0.0.1 -P2883 -uroot@ob_mysql -p  -Doceanbase  -A
Enter password:
Welcome to the OceanBase.  Commands end with ; or \g.
Your OceanBase connection id is 15
Server version: OceanBase_CE 4.2.0.0 (r100010022023081817-0bdf1c0c5674e88c5ae9a8d0ae4f8077465d7fae) (Built Aug 18 2023 17:32:49)
Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
obclient [oceanbase]> select version();
+------------------------------+
| version()                    |
+------------------------------+
| 5.7.25-OceanBase_CE-v4.2.0.0 |
+------------------------------+
1 row in set (0.001 sec)

obclient [oceanbase]> use mysql;
Database changed
obclient [mysql]> show variables like '%chara%';
+--------------------------+---------+
| Variable_name            | Value   |
+--------------------------+---------+
| character_set_client     | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database   | utf8mb4 |
| character_set_filesystem | binary  |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
| character_set_system     | utf8mb4 |
+--------------------------+---------+
7 rows in set (0.004 sec)

# 配置root用户口令
obclient [mysql]> alter user 'root'@'%' identified by 'beijing';
Query OK, 0 rows affected (0.059 sec)

二、mysqldump迁移数据到OceanBase
通过MySQL下的mysqldump将数据导出为SQL文本格式,将数据备份文件传输到OceanBase数据库主机后,通过source命令导入到OceanBase数据库。

1、导出mysql数据

# 创建测试数据
mysql> use prod;
Database changed

mysql> create table emp( empno int primary key ,ename varchar(10),sal int , deptno int);
Query OK, 0 rows affected (0.03 sec)
mysql> create table dept (deptno int primary key ,dname varchar(10), location varchar(10));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into dept values (10,'caiwu','beijing'),(20,'shichang','shanghai'),(30,'jishu','guangzhou');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0
mysql>  insert into emp values(1001,'tom',3000,10),(1002,'jerry',5000,20),(1003,'rose',3000,30);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from emp;
+-------+-------+------+--------+
| empno | ename | sal  | deptno |
+-------+-------+------+--------+
|  1001 | tom   | 3000 |     10 |
|  1002 | jerry | 5000 |     20 |
|  1003 | rose  | 3000 |     30 |
+-------+-------+------+--------+
3 rows in set (0.00 sec)

mysql> select * from dept;
+--------+----------+-----------+
| deptno | dname    | location  |
+--------+----------+-----------+
|     10 | caiwu    | beijing   |
|     20 | shichang | shanghai  |
|     30 | jishu    | guangzhou |
+--------+----------+-----------+
3 rows in set (0.00 sec)

# 通过mysqldump导出数据
[root@node202 bin]# ./mysqldump -uroot -p --databases prod > ~/prod.sql
Enter password:

2、将数据导入到OceanBase

# 连接OceanBase数据库
obclient [test]> use prod;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
obclient [prod]> show tables;
+----------------+
| Tables_in_prod |
+----------------+
| t1             |
1 row in set (0.002 sec)

# 导入数据到数据库
obclient [prod]> source /home/admin/prod.sql
Query OK, 0 rows affected (0.000 sec)
.......
Query OK, 0 rows affected (0.000 sec)

# 查看导入数据
obclient [prod]> show tables;
+----------------+
| Tables_in_prod |
+----------------+
| dept           |
| emp            |
| t1             |
+----------------+
3 rows in set (0.002 sec)

obclient [prod]> select * from emp;
+-------+-------+------+--------+
| empno | ename | sal  | deptno |
+-------+-------+------+--------+
|  1001 | tom   | 3000 |     10 |
|  1002 | jerry | 5000 |     20 |
|  1003 | rose  | 3000 |     30 |
+-------+-------+------+--------+
3 rows in set (0.002 sec)

obclient [prod]> select * from dept;
+--------+----------+-----------+
| deptno | dname    | location  |
+--------+----------+-----------+
|     10 | caiwu    | beijing   |
|     20 | shichang | shanghai  |
|     30 | jishu    | guangzhou |
+--------+----------+-----------+
3 rows in set (0.003 sec)

---如上所示,数据从mysql导入到OceanBase数据库成功。

三、通过datax从MySQL离线导入数据到OceanBase

1、datax简介
DataX 是阿里云 DataWorks 数据集成的开源版本,是阿里巴巴集团内被广泛使用的离线数据同步工具/平台。DataX 实现了包括 MySQL、Oracle、SQLserver、Postgre、HDFS、Hive、ADS、HBase、TableStore(OTS)、MaxCompute(ODPS)、Hologres、DRDS 、OceanBase 等各种异构数据源之间高效的数据同步功能。

2、DataX 框架设计
DataX 采用 Framework + Plugin 架构,将数据源读取和写入抽象称为 Reader/Writer 插件,纳入到整个同步框架中。

3、datax部署安装
1)datax 下载地址:http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
2)解压datax
[root@node203 Oceanbase]# tar zxvf datax.tar.gz

3)datax目录结构

[admin@node203 datax]$ tree -L 1 --filelimit 30
.
├── bin
├── conf
├── job
├── lib
├── log
├── log_perf
├── plugin
├── script
└── tmp

4、测试datax是否安装成功

[root@node203 bin]# python datax.py ../job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-09-12 14:23:55.924 [main] INFO  VMInfo - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl
2023-09-12 14:23:55.929 [main] INFO  Engine - the machine info  =>

        osInfo: Oracle Corporation 20 20.0.2+9-78
        jvmInfo:        Linux amd64 3.10.0-327.el7.x86_64
        cpu num:        8

        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:     -0.00G
        maxFileDescriptorCount: -1
        currentOpenFileDescriptorCount: -1

        GC Names        [G1 Young Generation, G1 Old Generation, G1 Concurrent GC]

        MEMORY_NAME                    | allocation_size                | init_size
        CodeHeap 'profiled nmethods'   | 117.21MB                       | 2.44MB
        G1 Old Gen                     | 1,024.00MB                     | 973.00MB
        G1 Survivor Space              | -0.00MB                        | 0.00MB
        CodeHeap 'non-profiled nmethods' | 117.21MB                       | 2.44MB
        Compressed Class Space         | 1,024.00MB                     | 0.00MB
        Metaspace                      | -0.00MB                        | 0.00MB
        G1 Eden Space                  | -0.00MB                        | 51.00MB
        CodeHeap 'non-nmethods'        | 5.57MB                         | 2.44MB


2023-09-12 14:23:55.958 [main] INFO  Engine -
{
        "content":[
                {
                        "reader":{
                                "name":"streamreader",
                                "parameter":{
                                        "column":[
                                                {
                                                        "type":"string",
                                                        "value":"DataX"
                                                },
                                                {
                                                        "type":"long",
                                                        "value":19890604
                                                },
                                                {
                                                        "type":"date",
                                                        "value":"1989-06-04 00:00:00"
                                                },
                                                {
                                                        "type":"bool",
                                                        "value":true
                                                },
                                                {
                                                        "type":"bytes",
                                                        "value":"test"
                                                }
                                        ],
                                        "sliceRecordCount":100000
                                }
                        },
                        "writer":{
                                "name":"streamwriter",
                                "parameter":{
                                        "encoding":"UTF-8",
                                        "print":false
                                }
                        }
                }
        ],
        "setting":{
                "errorLimit":{
                        "percentage":0.02,
                        "record":0
                },
                "speed":{
                        "byte":10485760
                }
        }
}
.......
2023-09-12 14:24:06.078 [job-0] INFO  JobContainer -
任务启动时刻                    : 2023-09-12 14:23:55
任务结束时刻                    : 2023-09-12 14:24:06
任务总计耗时                    :                 10s
任务平均流量                    :          253.91KB/s
记录写入速度                    :          10000rec/s
读出记录总数                    :              100000
读写失败总数                    :                   0

5、创建datax配置模版

[root@node203 bin]# python ./datax.py -r mysqlreader -w oceanbasev10writer
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
.......
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": [],
                                "table": []
                            }
                        ],
                        "password": "",
                        "username": "",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": "",
                                "table": []
                            }
                        ],
                        "obWriteMode": "",
                        "password": "",
                        "username": ""
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": ""
            }
        }
    }
}

---如上所示,可以根据配置模版,修改创建job的配置文件。

6、查看job配置文件

[root@node203 job]# cat myjob.json
{
    "job":{
    "entry": {
      "jvm": "-Xms1024m -Xmx1024m"
    },

        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "beijing",
                        "column": [
                            "*"
                        ],
                        "connection": [
                            {
                                "table": [
                                    "tb1"
                                ],
                                "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/prod?useUnicode=true&characterEncoding=utf8&useSSL=false"]
                            }
                        ]
                    }
                },

                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "obWriteMode": "insert",
                        "column": [
                            "*"
                        ],
                        "preSql": [
                            "truncate table tb1"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:ob_mysql||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/prod?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                                "table": [
                                    "tb1"
                                ]
                            }
                        ],
                        "username": "root",
                        "password":"beijing",
                        "writerThreadCount":10,
                        "batchSize": 1000,
                        "memstoreThreshold": "0.9"
                    }
                }
            }
        ]
    }
}

7、执行离线数据同步

Tips:离线迁移是对静态数据的迁移,在迁移时,源端数据库不能有数据写入。

[root@node203 datax]# python bin/datax.py job/myjob.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


2023-09-15 10:23:04.587 [main] INFO  VMInfo - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl
2023-09-15 10:23:04.595 [main] INFO  Engine - the machine info  =>

        osInfo: Oracle Corporation 20 20.0.2+9-78
        jvmInfo:        Linux amd64 3.10.0-327.el7.x86_64
        cpu num:        8

        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:     -0.00G
        maxFileDescriptorCount: -1
        currentOpenFileDescriptorCount: -1

        GC Names        [G1 Young Generation, G1 Old Generation, G1 Concurrent GC]

        MEMORY_NAME                    | allocation_size                | init_size
        CodeHeap 'profiled nmethods'   | 117.21MB                       | 2.44MB
        G1 Old Gen                     | 1,024.00MB                     | 973.00MB
        G1 Survivor Space              | -0.00MB                        | 0.00MB
        CodeHeap 'non-profiled nmethods' | 117.21MB                       | 2.44MB
        Compressed Class Space         | 1,024.00MB                     | 0.00MB
        Metaspace                      | -0.00MB                        | 0.00MB
        G1 Eden Space                  | -0.00MB                        | 51.00MB
        CodeHeap 'non-nmethods'        | 5.57MB                         | 2.44MB


2023-09-15 10:23:04.617 [main] INFO  Engine -
.......

执行完成:

8、查看目标库oceanbase迁移数据

源端数据:

目标端数据:

四、迁移故障问题

1、执行迁移出现以下故障

[admin@node203 bin]$ python /home/soft/Oceanbase/datax/bin/datax.py /home/soft/Oceanbase/datax/job/myjob.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

2023-09-11 19:26:34.926 [main] WARN  ConfigParser - 插件[mysqlreader,oceanbasev10writer]加载失败,1s后重试... Exception:Code:[Common-00], Describe:[您提供的配置文件存在错误信息,请检查您的作业配置 .] - 配置信息错误,您提供的配置文件[/home/soft/Oceanbase/datax/plugin/reader/._drdsreader/plugin.json]不存在. 请检查您的配置文件.
2023-09-11 19:26:35.934 [main] ERROR Engine -

经DataX智能分析,该任务最可能的错误原因是:
com.alibaba.datax.common.exception.DataXException: Code:[Common-00], Describe:[您提供的配置文件存在错误信息,请检查您的作业配置 .] - 配置信息错误,您提供的配置文件[/home/soft/Oceanbase/datax/plugin/reader/._drdsreader/plugin.json]不存在. 请检查您的配置文件.
        at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)
        at com.alibaba.datax.common.util.Configuration.from(Configuration.java:95)
        .......

2、解决方案
如下图所示,删除reader和writer目录下的隐藏文件: