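Purpose
This article shows how to sort Chinese text by the first letter of its pinyin.
The relationship between collations and character sets (encodings) can be queried as follows.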
select sys_encoding_to_char(collencoding) as encoding,collname,collcollate,collctype from sys_collation ;
The Chinese collations available for the UTF8 encoding can be listed as follows.
select collcollate from sys_collation where sys_encoding_to_char(collencoding)='UTF8' and collcollate like '%zh%' group by collcollate;
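The query above hard-codes UTF8. A variant that looks up the encoding of the database you are connected to might look like the sketch below. It assumes that the sys_database catalog mirrors PostgreSQL's pg_database (columns datname and encoding); that is an assumption, not something shown in this article.

```sql
-- Sketch: list Chinese collations usable in the current database.
-- Assumption: sys_database has the same layout as PostgreSQL's pg_database.
select c.collname, c.collcollate, c.collctype
from sys_collation c
where c.collcollate like 'zh%'
  and c.collencoding in (
        -1,  -- -1 marks a collation that works with any encoding
        (select d.encoding
           from sys_database d
          where d.datname = current_database())
      )
order by c.collname;
```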
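Listing the databases shows that the test database uses the default encoding UTF8 with Collate set to en_US.UTF-8. en_US denotes the US English locale, whereas our goal is to sort in Chinese order.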
TEST=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+--------+----------+-------------+-------------+-------------------
security | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/system +
| | | | | system=CTc/system
template1 | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/system +
| | | | | system=CTc/system
test | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(7 rows)
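Test
In the test database, neither the default collation nor an explicit "c" collation sorts Chinese characters by pinyin first letter. Both queries return the rows in the same order, which is character-code order.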
TEST=# \d t3
Table "public.t3"
Column | Type | Collation | Nullable | Default
--------+----------------------------+-----------+----------+---------
id | integer | | |
name | character varying(20 char) | | |
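To reproduce this test, a table shaped like t3 could be created and filled roughly as follows. This is only a sketch: the column types are taken from the \d output above, but the id values are made up here, because the original ids are not shown.

```sql
-- Sketch of the test table; column types follow the \d t3 output.
create table t3 (
    id   integer,
    name character varying(20 char)
);

-- The id values are hypothetical; only the name values come from the test output.
insert into t3 (id, name) values
    (1, '不同】'),
    (2, '不好'),
    (3, '偶尔'),
    (4, '啊'),
    (5, '地平'),
    (6, '地方');
```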
TEST=# select name from t3 order by name;
name
--------
不同】
不好
偶尔
啊
地平
地方
(6 rows)
TEST=# select name from t3 order by name collate "c";
name
--------
不同】
不好
偶尔
啊
地平
地方
(6 rows)
Adding collate "zh_CN" to the ORDER BY expression is enough to sort the Chinese strings by pinyin, and therefore by first letter.
TEST=# select name from t3 order by name collate "zh_CN";
name
--------
啊
不好
不同】
地方
地平
偶尔
(6 rows)
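If this ordering is used often, an index declared with the same collation may let the planner skip the sort step. The sketch below uses PostgreSQL-style syntax, which is assumed to carry over to KingbaseES. The index name is hypothetical, and whether the index is actually chosen depends on table size and statistics.

```sql
-- Hypothetical index name; the collation must match the one used in ORDER BY.
create index t3_name_zh_cn_idx on t3 (name collate "zh_CN");

-- Same query as above; the planner may now read the rows in index order.
select name from t3 order by name collate "zh_CN";
```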
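You can also change the collate of the column itself.
However, this causes a table rewrite, which can take a long time, so be careful with large tables.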
test=# alter table t3 alter name type character varying(20 char) collate "zh_CN";
ALTER TABLE
test=#
test=# select name from t3 order by name;
name
--------
啊
不好
不同】
地方
地平
偶尔
(6 rows)
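One way to confirm which collation a column now carries is to read it from the standard information_schema. This is a sketch that assumes the table lives in the public schema, as the \d output earlier suggests.

```sql
-- Sketch: show the per-column collation of t3.
-- collation_name is NULL when the column uses the database default.
select column_name, data_type, collation_name
from information_schema.columns
where table_schema = 'public'
  and table_name = 't3'
order by ordinal_position;
```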
Testing in a GBK-encoded database
The following test is run in the test1 database.
TEST=# \l+
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges | Size | Tablespace | Description
-----------+--------+----------+-------------+-------------+-------------------+--------+-------------+--------------------------------------------
security | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 13 MB | sys_default |
template0 | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/system +| 13 MB | sys_default | unmodifiable empty database
| | | | | system=CTc/system | | |
template1 | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/system +| 13 MB | sys_default | default template for new database
| | | | | system=CTc/system | | |
test | system | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 824 MB | sys_default | default administrative connection
test1 | system | GBK | zh_CN.GBK | zh_CN.GBK | | 13 MB | sys_default |
(7 rows)
TEST=# \c test1 system
You are now connected to database "test1" as user "system".
test1=# \d t1
Table "public.t1"
Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
name | text | | |
Because the default collation of the test1 database is zh_CN.GBK, Chinese text is sorted by first letter without specifying collate in the SQL.
test1=# select * from t1 order by name;
name
------
北大
的
个有
满五
模拟
哦平
(6 rows)
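A database like test1 could be created roughly as follows. This is a sketch with assumptions: the statement follows PostgreSQL-style CREATE DATABASE syntax, the zh_CN.GBK locale must already be installed on the server's operating system, and template0 is used because the encoding and locale differ from those of template1.

```sql
-- Sketch: create a GBK database whose default collation is Chinese.
-- Assumption: the OS provides the zh_CN.GBK locale.
create database test1
    with template   = template0
         encoding   = 'GBK'
         lc_collate = 'zh_CN.GBK'
         lc_ctype   = 'zh_CN.GBK';
```

The collate and ctype of a database are fixed when it is created, so this choice is best made before any data is loaded.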
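Summary
If the database's default collate does not match the desired sort order (for example, en_US.UTF-8 when Chinese ordering is wanted), use collate "zh_CN" in the SQL statement, or change the column's collate with alter table, to sort Chinese characters by first letter.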