Using sys_squeeze in KingbaseES V8R6

Published: 2023-05-09 19:41:51 | Author: KINGBASE Research Institute (KINGBASE研究院)

Introduction to sys_squeeze

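sys_squeeze is a KingbaseES extension that provides a manually invoked command for cleaning up the dead tuples of a table. While it reclaims the table's space it does not hold an exclusive lock for the whole operation, so access to the target table is affected as little as possible while the business workload is running. VACUUM FULL can also release the space occupied by dead tuples, but its drawback is that it locks the table and blocks business operations. The extension is implemented on top of logical decoding, so the database's wal_level must be set to 'logical' before it can be used.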

Test

Prerequisites for using sys_squeeze:
wal_level = logical
max_replication_slots = 10 # minimum 1
shared_preload_libraries = 'sys_squeeze'
create extension sys_squeeze;
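
Before creating the extension, it can help to confirm that the settings above are actually in effect. The following is a minimal sketch, assuming KingbaseES accepts the PostgreSQL-style SHOW command; the expected values in the comments are taken from the prerequisites listed above.

```sql
-- Each SHOW returns the value currently in effect for this session.
SHOW wal_level;                 -- the prerequisites above require 'logical'
SHOW max_replication_slots;     -- 10 in this article; at least 1 is needed
SHOW shared_preload_libraries;  -- should include sys_squeeze
```

If a value differs from what was written to the configuration file, the server has most likely not been restarted since the change; in PostgreSQL-derived systems these three parameters normally take effect only after a restart.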

Check the extension

test=# \dx sys_squeeze;
                           List of installed extensions
    Name     | Version | Schema  |                  Description
-------------+---------+---------+------------------------------------------------
 sys_squeeze | 1.4     | squeeze | A tool to remove unused space from a relation.
(1 row)
Create the test table
test=# create table test(id int,c_name varchar(200),primary key(id));
CREATE TABLE

Load the initial data
test=# insert into test select generate_series(1,4000000),'tom';
INSERT 0 4000000
Check the table size: 138 MB
TEST=# \dt+ test
                   List of relations
 Schema | Name | Type  | Owner  |  Size  | Description
--------+------+-------+--------+--------+-------------
 public | test | table | system | 138 MB |
(1 row)
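Syntax for a manual squeeze:
The function interface squeeze.squeeze_table(opt1,opt2,opt3,opt4,opt5) takes 5 configurable parameters:

opt1: schema name of the target table to be cleaned up. Required.

opt2: name of the target table to be cleaned up. Required.

opt3: name of an existing index on the table. If specified, tuples in the newly created table are physically ordered by this index. Optional; pass null to leave it unspecified.

opt4: tablespace in which to place the newly created table. Optional; pass null to keep using the original tablespace.

opt5: tablespace in which to place the corresponding indexes. Optional; pass null to keep using the original tablespace.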
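
As an illustration of the optional third parameter, the call below is a sketch that passes an index name so that the rebuilt table is physically ordered by that index. It assumes the test table created above; test_pkey is the primary-key index name that appears in the squeeze output later in this article. The two tablespace parameters are left as null so that the original tablespaces are kept.

```sql
-- opt1 = schema, opt2 = table, opt3 = index used for physical ordering,
-- opt4 / opt5 = null, meaning table and indexes stay in their current tablespaces.
SELECT squeeze.squeeze_table('public', 'test', 'test_pkey', null, null);
```
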
Update the data
test=# update test set c_name = 'TRE-6' where id <2000000;
UPDATE 1999999

Table size after the update: the table has bloated
TEST=#  \dt+ test
                   List of relations
 Schema | Name | Type  | Owner  |  Size  | Description
--------+------+-------+--------+--------+-------------
 public | test | table | system | 223 MB |
(1 row)


After the update, the number of dead tuples is 1999999:
TEST=# select * from sys_stat_user_tables where relname='test';
 relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze |          last_vacuum          |        last_autovacuum        | last_analyze |       last_autoanalyze        | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
-------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------------------------+-------------------------------+--------------+-------------------------------+--------------+------------------+---------------+-------------------
 25746 | public     | test    |        6 |     20000000 |        1 |       1999999 |   4000000 |   5999997 |         0 |             0 |    4000668 |    1999999 |             1999999 | 2023-04-23 16:22:54.344454+08 | 2023-04-23 16:26:50.718373+08 |              | 2023-04-23 16:26:53.339778+08 |            1 |                2 |             0 |                 3
(1 row)
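
Because the full row is very wide, a narrower query can make the dead-tuple count easier to read. This sketch selects only a few of the columns shown in the output above from the same sys_stat_user_tables view; the choice of columns is just a suggestion.

```sql
-- Show only the tuple counters relevant to bloat for the test table.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM sys_stat_user_tables
WHERE relname = 'test';
```

Running the same query before and after the squeeze gives a compact before/after comparison of n_dead_tup.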


Example squeeze run: the output shows that a new table and a new index are created during the process, and that the relfilenode changes

TEST=# SELECT squeeze.squeeze_table('public', 'test', null, null, null);
NOTICE:  Now begin to squeeze the table.
NOTICE:  Trying to setup logical decoding.
NOTICE:  This step needs acquire lock and may be block if there is long time not ending transaction,if this step is not done in a long time(e.g. 1min) please cancel the session and try again when the transaction is end.
NOTICE:  Setup logical decoding done.
NOTICE:  Now create transient table.
NOTICE:  New table 'test' locates at [base/12145/25751]
NOTICE:  Now cp the data to new created table.
NOTICE:  Now create index on the new created table.
NOTICE:  New index 'test_pkey' locates at [base/12145/25754].
NOTICE:  Now process the concurrent changes via logical decoding.
NOTICE:  The data has been moved to new table, now release the replication slot.
NOTICE:  Now swap the filenode.
NOTICE:  Delete the old table.
NOTICE:  The squeeze process is done.
 squeeze_table
---------------

(1 row)
The space taken up by table bloat has been released
TEST=# \dt+ test
                   List of relations
 Schema | Name | Type  | Owner  |  Size  | Description
--------+------+-------+--------+--------+-------------
 public | test | table | system | 154 MB |
(1 row)
After the squeeze, the dead tuples have been cleared
TEST=# select * from sys_stat_user_tables where relname='test';
 relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze |          last_vacuum          |        last_autovacuum        | last_analyze |       last_autoanalyze        | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
-------+------------+---------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------------------------+-------------------------------+--------------+-------------------------------+--------------+------------------+---------------+-------------------
 25746 | public     | test    |        5 |     16000000 |        1 |       1999999 |   4000000 |   3999998 |         0 |             0 |    4000668 |          0 |                   0 | 2023-04-23 16:22:54.344454+08 | 2023-04-23 16:26:50.718373+08 |              | 2023-04-23 16:26:53.339778+08 |            1 |                2 |             0 |                 3
(1 row)

Summary

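sys_squeeze relies on logical replication, so enough replication slots must be configured. Keep in mind that it may compete for slots with standby servers or with other features that use logical replication, so make sure there are always enough slots available.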
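
One way to see how many slots are already in use is sketched below. It assumes that KingbaseES exposes a sys_replication_slots view as the counterpart of PostgreSQL's pg_replication_slots, with the same columns; this view name is an assumption and is not confirmed by this article.

```sql
-- Configured upper limit on replication slots.
SHOW max_replication_slots;

-- Slots that currently exist (assumed view name: sys_replication_slots).
SELECT slot_name, slot_type, active
FROM sys_replication_slots;
```

If the number of existing slots is already close to the limit, a squeeze may be unable to obtain the slot it needs for logical decoding.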

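Although sys_squeeze can shrink tables automatically, on a busy database it is recommended not to run it during peak business hours, to avoid the risk of degrading business performance.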

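Pay attention to the prerequisites of the sys_squeeze extension; in particular, make sure the table has a primary key or a unique constraint. This is required for sys_squeeze to process a table.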
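
The primary-key or unique-constraint requirement can be checked before calling squeeze_table. The query below is a sketch that assumes KingbaseES provides a sys_constraint catalog equivalent to PostgreSQL's pg_constraint, where contype 'p' marks a primary key and 'u' a unique constraint; the catalog name is an assumption, and public.test is the test table from this article.

```sql
-- List primary-key ('p') and unique ('u') constraints on public.test.
-- An empty result suggests the table does not meet the requirement.
SELECT conname, contype
FROM sys_constraint
WHERE conrelid = 'public.test'::regclass
  AND contype IN ('p', 'u');
```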