KingbaseES V8R3 备份恢复案例 -- sys_rman物理备份异机恢复

发布时间 2023-09-19 18:46:05作者: KINGBASE研究院

案例说明:
在生产环境通过sys_rman执行了物理备份后,需要在异机构建测试环境,本案例描述了通过物理备份异机恢复的详细过程及操作。

适用版本:
KingbaseES V8R3

节点信息:

[kingbase@node102 bin]$ cat /etc/hosts
......
192.168.1.101   node101    # 生产节点
192.168.1.102   node102    # 测试节点

一、生产库执行sys_rman物理备份

1、生产环境相关配置参数

# 开启归档
test=# show archive_mode ;
 archive_mode
--------------
 on
(1 row)

# 归档文件存储路径
test=# show archive_dest ;
       archive_dest
--------------------------
 /data/kingbase/arch/c290
(1 row)

# 归档配置
test=# show archive_command ;
                              archive_command
----------------------------------------------------------------------------
 test ! -f /data/kingbase/arch/c290/%f && cp %p /data/kingbase/arch/c290/%f
(1 row)

# wal日志配置
test=# show wal_level ;
 wal_level
-----------
 replica
(1 row)

2、执行sys_rman物理备份

1)备份初始化

[kingbase@node101 ~]$ mkdir -p /data/kingbase/bk/c290
[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ init

2)执行数据库全备

[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ -b full backup
INFO: validate: RTATKP backup and archive log files by CRC

[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ validate
INFO: validate: RTATKP backup and archive log files by CRC
INFO: backup validation completed successfully

3)执行增量备份

# 事务操作
prod=# create table t2 as select * from t1;
SELECT 10000

prod=# select count(*) from t2;
 count
-------
 10000
(1 row)

# 生成检查点(在恢复时,缩短recovery时间)。
prod=# select sys_switch_xlog();
 sys_switch_xlog
-----------------
 0/70000A0
(1 row)

prod=# checkpoint;
CHECKPOINT

# 执行正增量备份
[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ -b page backup
INFO: validate: RTATU1 backup and archive log files by CRC

4) 查看备份信息

[kingbase@node101 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ show
==========================================================================================================
ID       Recovery time        Mode          Current/Parent TLI  Time            Data  start_lsn  stop_lsn Status
==========================================================================================================
RTATU1   2023-04-18 14:54:03  PAGE           1 / 0              2s           628kB  0/9000028  0/A000078  OK
RTATKP   2023-04-18 14:48:27  FULL           1 / 0              2s            80MB  0/3000028  0/3000130  OK

5)查看备份文件信息

[kingbase@node101 c290]$ ls -lh
total 4.0K
drwx------ 4 kingbase kingbase 32 Apr 18 14:54 backups
-rw-r--r-- 1 kingbase kingbase 41 Apr 18 14:47 sys_rman.conf
lrwxrwxrwx 1 kingbase kingbase 25 Apr 18 15:40 wal -> /data/kingbase/arch/c290/

二、sys_rman执行异机恢复

Tips:
物理备份的恢复一般分为两个步骤

restore: 还原备份数据文件到data目录下
reocovery: 启动实例后从最近的检查点开始应用xlog日志到一致性状态后,开启数据库。

1、准备数据库环境

  1)在测试主机安装和生产主机相同的数据库版本
  2)创建相同的备份存储路径和xlog日志归档路径
  3)归档及wal日志配置和生产库相同

2、复制生产库备份到测试主机
[kingbase@node101 c290]$ scp -r * node102:/data/kingbase/bk/c290/

3、执行sys_rman恢复

1)restore备份到data目录下

# 备份测试库data目录
[kingbase@node102 c290]$ cd /opt/Kingbase/ES/C290/
[kingbase@node102 C290]$ mv data data.bk

# 创建data目录并授权
[kingbase@node102 bin]$ mkdir -p /opt/Kingbase/ES/C290/data
[kingbase@node102 bin]$ chmod 700 /opt/Kingbase/ES/C290/data

# 在测试库上查看备份信息
[kingbase@node102 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ show
==========================================================================================================
ID       Recovery time        Mode          Current/Parent TLI  Time            Data  start_lsn  stop_lsn Status
==========================================================================================================
RTATU1   2023-04-18 14:54:03  PAGE           1 / 0              2s           628kB  0/9000028  0/A000078  OK
RTATKP   2023-04-18 14:48:27  FULL           1 / 0              2s            80MB  0/3000028  0/3000130  OK

# 执行sys_rman restore
[kingbase@node102 bin]$ ./sys_rman -U system -W 123456 -d test -D /opt/Kingbase/ES/C290/data -B /data/kingbase/bk/c290/ restore
INFO: validate: RTATKP backup and archive log files by SIZE
INFO: validate: RTATU1 backup and archive log files by SIZE
INFO: restore complete. Recovery starts automatically when the Kingbase server is started.

如下图所示,执行restore:

2)启动测试库实例执行recovery

# 启动数据库实例
[kingbase@node102 bin]$ ./sys_ctl start -D ../../data
server starting
.......

# 查看sys_log日志
[kingbase@node102 sys_log]$ tail -1000 kingbase-2023-04-18_154713.log
2023-04-18 15:47:13 CST LOG:  database system was interrupted; last known up at 2023-04-18 14:54:01 CST
2023-04-18 15:47:13 CST LOG:  creating missing WAL directory "sys_xlog/archive_status"
2023-04-18 15:47:13 CST LOG:  starting archive recovery
2023-04-18 15:47:13 CST LOG:  restored log file "000000010000000000000009" from archive
2023-04-18 15:47:13 CST LOG:  redo starts at 0/9000028
2023-04-18 15:47:13 CST LOG:  redo wal segment count 1
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000A" from archive
2023-04-18 15:47:13 CST LOG:  consistent recovery state reached at 0/A000078
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000B" from archive
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000C" from archive
cp: cannot stat ‘/data/kingbase/bk/c290//wal/00000001000000000000000D’: No such file or directory
2023-04-18 15:47:13 CST LOG:  complete: 1/1
2023-04-18 15:47:13 CST LOG:  redo done at 0/C0000D0
2023-04-18 15:47:13 CST LOG:  last completed transaction was at log time 2023-04-18 14:54:03.704661+08
2023-04-18 15:47:13 CST LOG:  restored log file "00000001000000000000000C" from archive
cp: cannot stat ‘/data/kingbase/bk/c290//wal/00000002.history’: No such file or directory
2023-04-18 15:47:13 CST LOG:  selected new timeline ID: 2
2023-04-18 15:47:13 CST LOG:  archive recovery complete
cp: cannot stat ‘/data/kingbase/bk/c290//wal/00000001.history’: No such file or directory
2023-04-18 15:47:13 CST LOG:  MultiXact member wraparound protections are now enabled
2023-04-18 15:47:13 CST LOG:  autovacuum launcher started
2023-04-18 15:47:13 CST LOG:  database system is ready to accept connections
2023-04-18 15:47:13 CST LOG:  starting syslogical supervisor
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database TEST
2023-04-18 15:47:13 CST LOG:  manager worker [11755] at slot 0 generation 1 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database TEMPLATE1
2023-04-18 15:47:13 CST LOG:  manager worker [11757] at slot 0 generation 2 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database TEMPLATE2
2023-04-18 15:47:13 CST LOG:  manager worker [11758] at slot 0 generation 3 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database SAMPLES
2023-04-18 15:47:13 CST LOG:  manager worker [11759] at slot 0 generation 4 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database SECURITY
2023-04-18 15:47:13 CST LOG:  manager worker [11760] at slot 0 generation 5 detaching cleanly
2023-04-18 15:47:13 CST LOG:  starting syslogical database manager for database prod
2023-04-18 15:47:13 CST LOG:  manager worker [11761] at slot 0 generation 6 detaching cleanly

如下图所示,数据库执行reocvery操作:

三、测试库连接访问

[kingbase@node102 bin]$ ./ksql -U system -W 123456 test
ksql (V008R003C002B0290)
Type "help" for help.

test=# \l
                               List of databases
   Name    | Owner  | Encoding |   Collate   |    Ctype    | Access privileges
-----------+--------+----------+-------------+-------------+--------------------
 prod      | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 SAMPLES   | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 SECURITY  | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 TEMPLATE0 | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/SYSTEM         +
           |        |          |             |             | SYSTEM=CTcb/SYSTEM
 TEMPLATE1 | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/SYSTEM         +
           |        |          |             |             | SYSTEM=CTcb/SYSTEM
 TEMPLATE2 | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =Tc/SYSTEM        +
           |        |          |             |             | SYSTEM=CTcb/SYSTEM
 TEST      | SYSTEM | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
(7 rows)

test=# \c prod
You are now connected to database "prod" as user "system".
prod=# \d
                    List of relations
 Schema |             Name              | Type  | Owner
--------+-------------------------------+-------+--------
 PUBLIC | pathman_cache_stats           | view  | SYSTEM
 PUBLIC | pathman_concurrent_part_tasks | view  | SYSTEM
 PUBLIC | pathman_config                | table | SYSTEM
 PUBLIC | pathman_config_params         | table | SYSTEM
 PUBLIC | pathman_partition_list        | view  | SYSTEM
 PUBLIC | t1                            | table | SYSTEM
 PUBLIC | t2                            | table | SYSTEM
(7 rows)

prod=# select count(*) from t1;
 count
-------
 10000
(1 row)

prod=# select count(*) from t2;
 count
-------
 10000
(1 row)

---如上所示,测试库数据恢复到了最近的备份点。

三、总结
sys_rman物理备份支持异机恢复,操作过程相对比较简单;可以将生产库的备份目录建立nfs共享,然后在测试环境mount共享文件系统,不用再从生产主机将备份拷贝到测试主机。