KingbaseES 使用sys_bulkload远程导入

发布时间 2023-05-09 19:41:51作者: KINGBASE研究院

前言

sys_bulkload 常见场景是本地导入数据,也可以在远程运行 sys_bulkload ,对数据库上的CSV 文件进行导入。远程导入数据时候需要注意,csv文件和ctl文件所在服务器。以下举例展示整个远程导入的过程。

测试环境V8R6C7

演示目的将数据从 IP2 所在服务器导入到IP3远程服务器上。
IP3服务器上操作:
测试前创建扩展插件
create extention sys_bulkload;

创建测试表
create table testu(id int primary key, info text, crt_time timestamp);


在IP3服务器上创建ctl文件,/home/kingbase7/ 目录为IP3服务器kingbase用户家目录。
vi test.ctl

TABLE = testu
INPUT = /home/kingbase7/test.csv
TYPE = CSV
SKIP = 2
LIMIT = 5
WRITER = BUFFERED
PROCESSOR_COUNT = 3


创建数据源
vi test.csv

1,29b35ff06c949e7e442c929e1df86396,2017-10-08 10:52:47.746062
2,06fde814525395de5ab85f6d92b04e87,2017-10-08 10:52:47.746573
3,c93f02e8677c9cd7c906c6ad5dbd450e,2017-10-08 10:52:47.746627
4,6541700070ae3d051f965fcef43baf45,2017-10-08 10:52:47.746835
5,3d7e7246016acaa842526b6614d0edf5,2017-10-08 10:52:47.746869

将test.ctl 移动到IP2远程服务器上,注意这里不需要将csv文件移动到IP2服务器上,因为input记录的csv文件位置在IP3服务器上的kingbase7目录下
scp -r /home/kingbase7/test.ctl kingbase@192.168.56.2:/home/kingbase/


**在 IP2远程服务器上执行sys_bulkload导入命令。注意ctl文件一定确保存在IP2服务器上,csv文件一定确保在IP3 服务器上。**否则报错找不到文件或目录
[kingbase@localhost ~]$ sys_bulkload -h 192.168.56.3 -d TEST /home/kingbase/test.ctl -U SYSTEM -W 123456 -p 2907
NOTICE: BULK LOAD START
NOTICE: BULK LOAD START
NOTICE: BULK LOAD START
NOTICE: BULK LOAD END
        3 Rows skipped.
        1 Rows successfully loaded.
        0 Rows not loaded due to parse errors.
        0 Rows not loaded due to duplicate errors.
        0 Rows replaced with new rows.
        log path: testu.log
        parse error path: /home/kingbase7/test.bad
        duplicate error path: /home/kingbase7/test.dupbad
        ctrl file path: /home/kingbase/test.ctl
        data file path: /home/kingbase7/test.csv
        Run began on 2023-03-10 14:21:48.033174+08
        Run ended on 2023-03-10 14:21:48.033661+08
NOTICE: BULK LOAD END
        2 Rows skipped.
        1 Rows successfully loaded.
        0 Rows not loaded due to parse errors.
        0 Rows not loaded due to duplicate errors.
        0 Rows replaced with new rows.
        log path: testu.log
        parse error path: /home/kingbase7/test.bad
        duplicate error path: /home/kingbase7/test.dupbad
        ctrl file path: /home/kingbase/test.ctl
        data file path: /home/kingbase7/test.csv
        Run began on 2023-03-10 14:21:48.035869+08
        Run ended on 2023-03-10 14:21:48.036169+08
NOTICE: BULK LOAD END
        4 Rows skipped.
        1 Rows successfully loaded.
        0 Rows not loaded due to parse errors.
        0 Rows not loaded due to duplicate errors.
        0 Rows replaced with new rows.
        log path: testu.log
        parse error path: /home/kingbase7/test.bad
        duplicate error path: /home/kingbase7/test.dupbad
        ctrl file path: /home/kingbase/test.ctl
        data file path: /home/kingbase7/test.csv
        Run began on 2023-03-10 14:21:48.034591+08
        Run ended on 2023-03-10 14:21:48.035150+08

在.3服务器上查询,数据导入完成。之前设置的skip表示跳过前两行数据。
TEST=# select * from testu;
 id |               info               |          crt_time
----+----------------------------------+----------------------------
  4 | 6541700070ae3d051f965fcef43baf45 | 2017-10-08 10:52:47.746835
  5 | 3d7e7246016acaa842526b6614d0edf5 | 2017-10-08 10:52:47.746869
  3 | c93f02e8677c9cd7c906c6ad5dbd450e | 2017-10-08 10:52:47.746627
(3 rows)

总结

使用sys_bulkload远程导入时候,需要将ctl文件放在远程服务器上,csv数据文件放在需要导入的目标数据库所在服务器上。