KingbaseES 语句like前匹配如何使用索引

发布时间 2023-05-09 19:41:51作者: KINGBASE研究院

前言

有现场同事反馈 sql语句 like 使用后缀通配符 % 不走索引。
至于执行计划没走索引的原因与KingbaseES数据库中的排序规则相关。

测试

测试环境:

KingbaseESV8R6C7

test=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    |           |          |
Indexes:
    "test_id" btree (id)
    "test_name" btree (name)

insert into test select generate_series(1,1000000), generate_series(1,1000000)||'tt';

test=# select count(*) from test;
  count
----------
 1000000
(1 row)


执行计划没走索引
test=# explain analyze select * from test where name like '99999%';
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..11653.33 rows=100 width=12) (actual time=0.261..75.993 rows=11 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on test  (cost=0.00..10643.33 rows=42 width=12) (actual time=30.369..55.302 rows=4 loops=3)
         Filter: (name ~~ '99999%'::text)
         Rows Removed by Filter: 333330
 Planning Time: 0.051 ms
 Execution Time: 76.046 ms
(8 rows)


--关闭全表扫后,执行计划仍不走索引
test=# set enable_seqscan = off;
SET
test=# explain analyze select * from test where name like '99999%';
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=10000000000.00..10000017935.00 rows=100 width=12) (actual time=0.133..67.511 rows=11 loops=1)
   Filter: (name ~~ '99999%'::text)
   Rows Removed by Filter: 999989
 Planning Time: 0.077 ms
 Execution Time: 67.552 ms
(5 rows)




在KingbaseES数据库中的特性:只有在数据库 Collate 为 C 时,like 使用后缀通配符(%)时,才会用到索引。

如下所示,test库,上述为 Collate 为 zh_CN.UTF-8 时的示例,
test=# \l
                               List of databases
   Name    | Owner  | Encoding |   Collate   |    Ctype    | Access privileges
-----------+--------+----------+-------------+-------------+-------------------
 security  | system | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |
 template0 | system | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/system        +
           |        |          |             |             | system=CTc/system
 template1 | system | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/system        +
           |        |          |             |             | system=CTc/system
 test      | system | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =Tc/system       +
           |        |          |             |             | system=CTc/system+
           |        |          |             |             | u1=c/system      +
           |        |          |             |             | u2=c/system

(6 rows)


下面看下Collation为 C 时的测试示例

alter table test alter name type text COLLATE "c";

test=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    | c         |          |
Indexes:
    "test_id" btree (id)
    "test_name" btree (name)


执行计划走索引
test=# explain analyze select * from test where name like '99999%';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=127.67..5733.52 rows=5000 width=36) (actual time=0.018..0.022 rows=11 loops=1)
   Filter: (name ~~ '99999%'::text)
   Heap Blocks: exact=2
   ->  Bitmap Index Scan on test_name  (cost=0.00..126.42 rows=5000 width=0) (actual time=0.011..0.011 rows=11 loops=1)
         Index Cond: ((name >= '99999'::text) AND (name < '9999:'::text))
 Planning Time: 0.183 ms
 Execution Time: 0.037 ms
(7 rows)
以上是通过修改表列的排序规则实现该列的索引的使用,还可以通过创建数据库时指定排序规则,或者创建表时指定排序规则。
参考博客园文档:https://www.cnblogs.com/kingbase/p/16743713.html



测试另一种方式:
1) 创建索引时,加上对应字段类型的操作符
--重建索引
test=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    | en_US     |          |
Indexes:
    "test_id" btree (id)
    "test_name" btree (name)


test=# drop index test_name;
DROP INDEX
test=# create index test_name on test(name varchar_pattern_ops);
CREATE INDEX
test=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    | en_US     |          |
Indexes:
    "test_id" btree (id)
    "test_name" btree (name varchar_pattern_ops)



执行计划可以走索引
test=#  explain analyze select * from test where name like '99999%';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=127.67..5733.52 rows=5000 width=36) (actual time=0.018..0.022 rows=11 loops=1)
   Filter: (name ~~ '99999%'::text)
   Heap Blocks: exact=2
   ->  Bitmap Index Scan on test_name  (cost=0.00..126.42 rows=5000 width=0) (actual time=0.011..0.011 rows=11 loops=1)
         Index Cond: ((name ~>=~ '99999'::text) AND (name ~<~ '9999:'::text))
 Planning Time: 0.218 ms
 Execution Time: 0.037 ms
(7 rows)

操作符text_pattern_ops、varchar_pattern_ops、bpchar_pattern_ops 分别支持类型 text、varchar 和 char 上的B树索引。
它们与默认操作符类的区别是值的比较是严格按照字符进行而不是根据其他相关的排序规则。
这使得这些操作符类适合于当一个数据库没有使用标准“C”区域时被使用在涉及模式匹配表达式(LIKE 或 POSIX 正则表达式)的查询中。




2)另一种创建索引方式:创建索引时指定排序规则
test=# drop index test_name;
DROP INDEX
test=# create index test_name on test(name collate "c");
CREATE INDEX
test=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    | en_US     |          |
Indexes:
    "test_id" btree (id)
    "test_name" btree (name COLLATE c)

执行计划也可以走索引
test=#  explain analyze select * from test where name like '99999%';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=127.67..5733.52 rows=5000 width=36) (actual time=0.020..0.024 rows=11 loops=1)
   Filter: (name ~~ '99999%'::text)
   Heap Blocks: exact=2
   ->  Bitmap Index Scan on test_name  (cost=0.00..126.42 rows=5000 width=0) (actual time=0.012..0.013 rows=11 loops=1)
         Index Cond: ((name >= '99999'::text) AND (name < '9999:'::text))
 Planning Time: 0.153 ms
 Execution Time: 0.051 ms
(7 rows)


3)正则表达式写法走索引
test=# \d test
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 name   | text    | en_US     |          |
Indexes:
    "test_id" btree (id)
    "test_name" btree (name)


test=# explain analyze select * from test where name like '^99999';
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=115.17..5721.02 rows=5000 width=36) (actual time=0.033..0.033 rows=0 loops=1)
   Filter: (name ~~ '^99999'::text)
   ->  Bitmap Index Scan on test_name  (cost=0.00..113.92 rows=5000 width=0) (actual time=0.031..0.031 rows=0 loops=1)
         Index Cond: (name = '^99999'::text)
 Planning Time: 0.088 ms
 Execution Time: 0.048 ms
(6 rows)

总结

遇到like通配符,执行计划显示不走索引的情况,首先查看索引列,表,数据库对应的排序规则Collation,然后根据以上测试示例进行现场实测并变更排序规则。

相关文档:https://www.cnblogs.com/kingbase/p/16743713.html