mycat分库分表的分片方案

发布时间 2023-06-21 15:01:22作者: 数据库小白(专注)

1. mycat分库分表的分片方案

1.1. 一. 配置规则

1.1.1. 分片方式

1.1.2. 配置规则

TableRule :

  • name:分片规则标识名称(唯一)。
  • columns:分片表的列名。
  • algorithm:算法的名称。
<tableRule name="auto-sharding-long">
    <rule>
        <columns>id</columns>
        <algorithm>rang-long</algorithm>
    </rule>
</tableRule>

Function :

  • name:算法名称(唯一)。
  • class:具体的分片算法(需要指定算法的具体类)。
  • property:属性根据要求指定。
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
    <property name="mapFile">autopartition-long.txt</property>
</function>

1.2. 二. 连续分片

1.2.1. 自定义数字范围分片

适用于明确知道分片字段的某个范围属于哪个分片,其字段为数字类型

schema.xml 配置

<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">
    <!--逻辑数据库设置-->
    <schema name="luckcat" checkSQLschema="false" sqlMaxLimit="100">
        <!--指定切分表和分表规则-->
        <table name="auto_sharding_long" primaryKey="id" dataNode="dn0,dn1,dn2" rule="auto-sharding-long"/>
    </schema>

    <!--逻辑节点与物理主机映射-->
    <dataNode name="dn0" dataHost="dh-1" database="db0"/>
    <dataNode name="dn1" dataHost="dh-1" database="db1"/>
    <dataNode name="dn2" dataHost="dh-1" database="db2"/>
    <dataNode name="dn3" dataHost="dh-1" database="db3"/>
    <dataNode name="dn4" dataHost="dh-1" database="db4"/>
    <dataNode name="dn5" dataHost="dh-1" database="db5"/>
    <dataNode name="dn6" dataHost="dh-1" database="db6"/>
    <dataNode name="dn7" dataHost="dh-1" database="db7"/>
    <dataNode name="dn8" dataHost="dh-1" database="db8"/>
    <dataNode name="dn9" dataHost="dh-1" database="db9"/>
    <dataNode name="dn10" dataHost="dh-1" database="db10"/>

    <!--物理主机-->
    <dataHost name="dh-1" maxCon="1000" minCon="10" balance="0" writeType="0" dbType="mysql" dbDriver="native"
              switchType="1" slaveThreshold="100">
        <heartbeat>select user()</heartbeat>
        <writeHost host="hostM1" url="localhost:3306" user="root" password="123456"/>
    </dataHost>
</mycat:schema>

rule.xml 配置

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mycat:rule SYSTEM "rule.dtd">
<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="auto-sharding-long">
        <rule>
            <columns>age</columns> <!--分表字段-->
            <algorithm>rang-long</algorithm> <!--算法-->
        </rule>
    </tableRule>
    <function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
        <property name="mapFile">autopartition-long.txt</property>
        <property name="defaultNode">0</property>
    </function>
</mycat:rule>

autopartition-long.txt 配置

# range start-end ,data node index
# K=1000,M=10000.
0-2M=0
2M-3M=1
3M-6M=2

test.sql

CREATE TABLE  auto_sharding_long  (`age`  int NOT NULL ,`db_nm`  varchar(20) NULL);
INSERT INTO auto_sharding_long (age,db_nm) VALUES (20000, database());
INSERT INTO auto_sharding_long (age,db_nm) VALUES (25000, database());
INSERT INTO auto_sharding_long (age,db_nm) VALUES (35000, database());
select * from auto_sharding_long;

1.2.2. 按日期分片

schema.xml

<table name="sharding_by_day" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-day" />

rule.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mycat:rule SYSTEM "rule.dtd">
<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-day">
        <rule>
            <columns>create_time</columns>
            <algorithm>part-by-day</algorithm>
        </rule>
    </tableRule>
    <function name="part-by-day"
              class="io.mycat.route.function.PartitionByDate">
        <property name="dateFormat">yyyy-MM-dd</property>
        <property name="sBeginDate">2017-10-01</property>
        <!--默认从开始日期算起,每隔10天一个分区。-->
        <property name="sPartionDay">10</property>
    </function>
</mycat:rule>

test.sql

CREATE TABLE  sharding_by_day  (create_time   timestamp NULL ON UPDATE CURRENT_TIMESTAMP  ,`db_nm`  varchar(20) NULL);
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-01', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-10', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-11', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-21', database());
INSERT INTO sharding_by_day (create_time,db_nm) VALUES ('2017-10-31', database());
select * from sharding_by_day;

-- 以上测试结果中有个错误,因为配置的dataNode节点数为3,而2017-10-31为2017-10-01后的第四个10天中的第一天,因此需要至少4个dataNode,节点不够就报如上错误了。

1.2.3. 按单月小时分片

schema.xml

<table name="sharding_by_hour" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-hour" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-hour">
        <rule>
            <columns>sharding_col</columns>
            <algorithm>sharding-by-hour</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-hour" class="io.mycat.route.function.LatestMonthPartion">
        <property name="splitOneDay">3</property>
    </function>
</mycat:rule>

test.sql

-- 单月内按照小时拆分,最小粒度是小时,一天最多可以有24个分片,最少1个分片,下个月从头开始循环,每个月末需要手工清理数据。
-- 字段为字符串类型,yyyymmddHH 10位。

CREATE TABLE  sharding_by_hour  (create_time   timestamp NULL ON UPDATE CURRENT_TIMESTAMP  ,`db_nm`  varchar(20) NULL,sharding_col varchar(10) null);
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100101','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100108','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100109','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100116','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100117','2017-10-01', database());
INSERT INTO sharding_by_hour (sharding_col,create_time,db_nm) VALUES ('2017100123','2017-10-01', database());
select * from sharding_by_hour;

1.2.4. 按自然月分片

schema.xml

<table name="sharding_by_month" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-month" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-month">
        <rule>
            <columns>create_time</columns>
            <algorithm>sharding-by-month</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-month" class="io.mycat.route.function.PartitionByMonth">
        <property name="dateFormat">yyyy-MM-dd</property>
        <property name="sBeginDate">2017-10-01</property>
        <property name="sEndDate">2017-12-30</property>
    </function>
</mycat:rule>

test.sql

CREATE TABLE  sharding_by_month  (create_time   timestamp NULL ON UPDATE CURRENT_TIMESTAMP  ,`db_nm`  varchar(20) NULL);  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-10-01', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-10-30', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-11-11', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-11-21', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-12-01', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2017-12-31', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2018-01-01', database());  
INSERT INTO sharding_by_month (create_time,db_nm) VALUES ('2018-01-31', database());  
select * from sharding_by_month

-- 注意
-- * schema里的table的dataNode节点数必须:大于rule的开始时间按照分片数计算到现在的个数
-- * 按照自然月计算(无论是28、30、31天都是一个月的)
-- * 分片节点个数可以后增加,但是必须符合第一点说明。

1.3. 三. 离散分片

1.3.1. 枚举分片

schema.xml

<table name="sharding_by_intfile" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-intfile" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-intfile">
        <rule>
            <columns>age</columns>
            <algorithm>hash-int</algorithm>
        </rule>
    </tableRule>
    <function name="hash-int" class="io.mycat.route.function.PartitionByFileMap">
        <property name="mapFile">partition-hash-int.txt</property>
        <property name="type">0</property>
        <property name="defaultNode">0</property>
    </function>
</mycat:rule>

partition-hash-int.txt

# 枚举项=节点标识,枚举项可以是数字和字符(非汉字)
11=1
12=2

test.sql

-- 通过在配置文件中配置可能的枚举id,指定数据分布到不同的物理节点上,本规则适用于按照省份或区县来拆分数据类业务。

CREATE TABLE sharding_by_intfile  (`age`  int NOT NULL ,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_intfile` (age,db_nm) VALUES (10, database());
INSERT INTO `sharding_by_intfile` (age,db_nm) VALUES (11, database());
INSERT INTO `sharding_by_intfile` (age,db_nm) VALUES (12, database());
select * from `sharding_by_intfile`;

1.3.2. 程序指定分区的分片

schema.xml

<table name="sharding_by_substring" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-substring" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-substring">
        <rule>
            <columns>user_id</columns>
            <algorithm>sharding-by-substring</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-substring" class="io.mycat.route.function.PartitionDirectBySubString">
        <property name="startIndex">0</property>
        <property name="size">2</property>
        <property name="partitionCount">3</property>
        <property name="defaultPartition">0</property>
    </function>
</mycat:rule>

test.sql

-- 在程序运行阶段,由程序自主决定路由到哪个分片。

CREATE TABLE sharding_by_substring  (`user_id`  varchar(20) NOT NULL ,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_substring` (user_id,db_nm) VALUES ('05-10000', database());
INSERT INTO `sharding_by_substring` (user_id,db_nm) VALUES ('02-10001', database());
INSERT INTO `sharding_by_substring` (user_id,db_nm) VALUES ('03-10002', database());
select * from `sharding_by_substring`;

1.3.3. 十进制求模分片

schema.xml

<table name="sharding_by_mod" primaryKey="id" dataNode="dn$0-2" rule="sharding-mod-long" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-mod-long">
        <rule>
            <columns>id</columns>
            <algorithm>mod-long</algorithm>
        </rule>
    </tableRule>
    <function name="mod-long" class="io.mycat.route.function.PartitionByMod">
        <property name="count">3</property>
    </function>
</mycat:rule>

test.sql

CREATE TABLE `sharding_by_mod` (id int(10) null,`db_nm` varchar(20) NULL);
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (1, database());
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (2, database());
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (3, database());
INSERT INTO `sharding_by_mod` (id,db_nm) VALUES (10, database());
select * from sharding_by_mod;

1.3.4. 字符串hash解析分片

schema.xml

<table name="sharding_by_stringhash" primaryKey="ord_no" dataNode="dn$0-2" rule="sharding-by-stringhash" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-stringhash">
        <rule>
            <columns>ord_no</columns>
            <algorithm>sharding-by-stringhash</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-stringhash" class="io.mycat.route.function.PartitionByString">
        <property name="partitionLength">512</property>
        <property name="partitionCount">2</property>
        <property name="hashSlice">-6:0</property>
    </function>
</mycat:rule>

        <!-- 配置说明:-->
        <!-- * columns标识将要分片的表字段。-->
        <!-- * algorithm为分片函数。-->
        <!-- * partitionLength为字符串hash的求模基数。-->
        <!-- * partitionCount为分区数。其中partitionLength*partitionCount=1024-->
        <!-- * hashSlice为预算位,即根据子字符串中的int值进行hash运算。-->
        <!-- 0 代表 str.length(), -1 代表 str.length()-1,大于0只代表数字自身-->
        <!-- 可以理解为substring(start,end),start为0则只表示0-->
        <!-- 例1:值“45abc”,hash预算位0:2 ,取其中45进行计算-->
        <!-- 例2:值“aaaabbb2345”,hash预算位-4:0 ,取其中2345进行计算-->

test.sql

-- 此规则是截取字符串中的int数值的hash分片。

CREATE TABLE sharding_by_stringhash (ord_no varchar(20) NULL,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_stringhash` (ord_no,db_nm) VALUES (171022237582, database());  
INSERT INTO `sharding_by_stringhash` (ord_no,db_nm) VALUES (171022553756, database());  
select * from sharding_by_stringhash;  

1.3.5. 一致性哈希分片

schema.xml

<table name="sharding_by_murmurhash" primaryKey="id" dataNode="dn$0-2" rule="sharding-by-murmurhash" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-murmurhash">
        <rule>
            <columns>id</columns>
            <algorithm>sharding-by-murmurhash</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-murmurhash" class="io.mycat.route.function.PartitionByMurmurHash">
        <property name="seed">0</property><!-- 默认是0 -->
        <property name="count">2</property><!-- 要分片的数据库节点数量,必须指定,否则没法分片 -->
        <property name="virtualBucketTimes">160</property>
        <!-- 一个实际的数据库节点被映射为这么多虚拟节点,默认是160倍,也就是虚拟节点数是物理节点数的160倍-->
        <!-- <property name="weightMapFile">weightMapFile</property> 节点的权重,没有指定权重的节点默认是1。
        以properties文件的格式填写,以从0开始到count-1的整数值也就是节点索引为key,以节点权重值为值。所有权重值必须是正整数,否则以1代替 -->
        <!-- <property name="bucketMapPath">/etc/mycat/bucketMapPath</property>
        用于测试时观察各物理节点与虚拟节点的分布情况,如果指定了这个属性,会把虚拟节点的murmur hash值与物理节点的映射按行输出到
        这个文件,没有默认值>,如果不指定,就不会输出任何东西 -->
    </function>
</mycat:rule>

test.sql

-- 一致性hash算法有效解决了分布式数据的扩容问题。因为此规则优点在于扩容时迁移数据量比较少,前提是分片节点比较多,虚拟节点分配多些。
-- 虚拟节点分配的少就会造成数据分布不够均匀。但如果实际分片数据比较少,迁移量也会比较多。

CREATE TABLE sharding_by_murmurhash (id int(10) null,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (1, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (2, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (3, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (4, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (5, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (6, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (7, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (8, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (9, database());
INSERT INTO `sharding_by_murmurhash` (id,db_nm) VALUES (10, database());
select * from sharding_by_murmurhash;

1.4. 四. 综合分片

1.4.1. 范围求模分片

schema.xml

<table name="sharding_by_rang_mod" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-rang-mod" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-rang-mod">
        <rule>
            <columns>id</columns>
            <algorithm>sharding-by-rang-mod</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-rang-mod" class="io.mycat.route.function.PartitionByRangeMod">
        <!--指定分片规则的配置文件-->
        <property name="mapFile">partition-range-mod.txt</property>
        <!--未包含以上规则的数据存储节点,节点从0开始。-->
        <property name="defaultNode">0</property>
    </function>
</mycat:rule>

partition-range-mod.txt

# range start-end ,data node group size
0-200M=3
200M1-400M=1
400M1-600M=4
600M1-800M=2

test.sql

-- 该算法先进行范围分片,计算出分片组,组内再求模,综合了范围分片和求模分片的优点。分片组内使用求模可以保证组内的数据分布比较均匀,
-- 分片组之间采用范围分片可以兼顾范围分片的特点。事先规定好分片的数量,数据扩容时按分片组扩容,则原有分片组的数据不需要迁移。由于
-- 分片组内的数据分布比较均匀,所以分片组内可以避免热点数据问题。

CREATE TABLE sharding_by_rang_mod (id bigint null,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (1000, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (1002, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (30000, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (30004, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (40000, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (40005, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (60005, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (60006, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (80006, database());
INSERT INTO `sharding_by_rang_mod` (id,db_nm) VALUES (60008, database());
select * from sharding_by_rang_mod;

1.4.2. 日期范围hash分片

schema.xml

<table name="sharding_by_date_hash" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-range-date-hash" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-range-date-hash">
        <rule>
            <columns>create_time</columns>
            <algorithm>sharding-by-range-date-hash</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-range-date-hash" class="io.mycat.route.function.PartitionByRangeDateHash">
        <property name="dateFormat">yyyy-MM-dd HH:mm:ss</property>
        <property name="sBeginDate">2018-01-31 00:00:00</property>
        <property name="sPartionDay">30</property>
        <property name="groupPartionSize">10</property>
    </function>
</mycat:rule>

<!--        sPartionDay:代表多少天分一个分片;-->
<!--        groupPartionSize:代表分片组的大小。-->

test.sql

-- 其思想与范围求模一致,由于日期取模方法会出现数据热点问题,所以先根据日期分组,再根据时间hash使得短期内数据分布得更均匀。
-- 其优点是可以避免扩容时的数据迁移,又可以在一定程度上避免范围分片的热点问题,要求日期格式尽量精确,不然达不到局部均匀的目的。

CREATE TABLE `sharding_by_date_hash` (`create_time` datetime NOT NULL,`db_nm` varchar(20) NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-12 00:01:02', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-21 01:02:09', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-28 12:00:12', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-01-02 11:00:00', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-26 10:00:09', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-03-01 22:01:02', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-02 17:09:08', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-23 11:00:04', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-24 18:12:09', database());
INSERT INTO `sharding_by_date_hash` (create_time,db_nm) VALUES ('2018-02-21 07:12:00', database());
select * from sharding_by_date_hash;

1.4.3. 取模范围约束分片

schema.xml

<table name="sharding_by_pattern" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-pattern" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-pattern">
        <rule>
            <columns>id</columns>
            <algorithm>sharding-by-pattern</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-pattern" class="io.mycat.route.function.PartitionByPattern">
        <property name="mapFile">partition-pattern.txt</property>
        <property name="defaultNode">0</property>
        <property name="patternValue">256</property>
    </function>
</mycat:rule>

<!--        配置说明:-->
<!--        * 1-32即代表id%256后分布的范围,如果在1-32则分区在0,其他类推。-->
<!--        * 如果id非数字,则会分配在defaultNode默认节点。-->

partition-pattern.txt

# id patition range start-end, data node index
## first host configuration
1-32=0
33-64=1
65-96=2
97-128=3
## second host configuration
129-160=4
161-192=5
193-224=6
225-256=7
0-0=7

test.sql

CREATE TABLE sharding_by_pattern (id varchar(20) null,`db_nm`  varchar(20) NULL);  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (1000, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (1002, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (30000, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (30004, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (40000, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (40005, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (60005, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (60006, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES (80006, database());  
INSERT INTO `sharding_by_pattern` (id,db_nm) VALUES ("abcd0", database());  
select * from sharding_by_pattern;  

1.4.4. ASCII码求模范围约束(字符串)

schema.xml

<table name="sharding_by_ascii" primaryKey="id" dataNode="dn$0-10" rule="partition-by-ascii" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="partition-by-ascii">
        <rule>
            <columns>id</columns>
            <algorithm>sharding-by-pattern-ascii</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-pattern-ascii" class="io.mycat.route.function.PartitionByPrefixPattern">
        <property name="mapFile">partition-pattern-ascii.txt</property>
        <property name="patternValue">256</property>
        <property name="prefixLength">5</property>
    </function>
</mycat:rule>

partition-pattern-ascii.txt

# id patition range start-end, data node index
## first host configuration
1-32=0
33-64=1
65-96=2
97-128=3
## second host configuration
129-160=4
161-192=5
193-224=6
225-256=7
0-0=7

test.sql

-- 此种规则类似于取模范围约束,此规则支持数据符号字母取模。

CREATE TABLE sharding_by_ascii (id varchar(20) null,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("1000a", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("1002A", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (30000, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (30004, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("4000B", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("4000b", database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (60007, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (60006, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES (80006, database());
INSERT INTO `sharding_by_ascii` (id,db_nm) VALUES ("abcd0", database());
select * from sharding_by_ascii;

1.4.5. 固定分片hash(二进制)

schema.xml

<table name="sharding_by_long" primaryKey="id" dataNode="dn$0-10" rule="sharding-by-long" />

rule.xml

<mycat:rule xmlns:mycat="http://io.mycat/">
    <tableRule name="sharding-by-long">
        <rule>
            <columns>id</columns>
            <algorithm>sharding-by-long</algorithm>
        </rule>
    </tableRule>
    <function name="sharding-by-long" class="io.mycat.route.function.PartitionByLong">
        <property name="partitionCount">2,1</property>
        <property name="partitionLength">256,512</property>
    </function>
</mycat:rule>

<!--        配置说明:-->
<!--        * partitionCount为指定分片个数列表。-->
<!--        * partitionLength为分片范围列表。-->
<!--        分区长度:默认为最大2^n=1024 ,即最大支持1024分区-->

<!--        约束 :-->
<!--        count,length两个数组的长度必须是一致的。-->
<!--        1024 = sum((count[i]*length[i])). count和length两个向量的点积恒等于1024-->
<!--        本例的分区策略:希望将数据水平分成3份,前两份各占25%,第三份占50%。(故本例非均匀分区)-->
<!--        // |<———————1024———————————>|-->
<!--        // |<—-256—>|<—-256—>|<———-512————->|-->
<!--        // | partition0 | partition1 | partition2 |-->
<!--        // | 共2份,故count[0]=2 | 共1份,故count[1]=1 |-->
<!--        int[] count = new int[] { 2, 1 };-->
<!--        int[] length = new int[] { 256, 512 };-->

test.sql

-- 本条规则类似于十进制的求模运算,区别在于是二进制的操作,是取id的二进制低10位,即id二进制&1111111111。此算法的优点在
-- 于如果按照十进制取模运算,则在连续插入1~10时,1~10会被分到1~10个分片,增大了插入事务的控制难度。而此算法根据二进制则
-- 可能会分到连接的分片,降低了插入事务的控制难度。

CREATE TABLE sharding_by_long (id int(10) null,`db_nm`  varchar(20) NULL);
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (1000, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (1002, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (30000, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (30004, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (4000, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (4000, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (60007, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (60006, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (80006, database());  
INSERT INTO `sharding_by_long` (id,db_nm) VALUES (0, database());  
select * from sharding_by_long;  

1.5. 六. 分片策略

  • 根据表数据量判断是否需要切分,确保切分后单分片表数据量为1000W左右。
  • 根据业务的情况选择合适的分片字段: 最频繁的或者最重要的查询条件。
  • 需要考虑扩容数据迁移问题(范围类,范围取模类不需要迁移,哈希类需要迁移)。
  • 有关联关系的表配置相同分片规则(ER思想,为了应用join等复杂sql),一对多对应关系一般按多的那一方切分。
  • 如果配置类数据, 更新频率比较少,考虑全局表。

源码链接: https://download.csdn.net/download/Gusand/12102557
参考:
推荐博客:https://blog.csdn.net/ygqygq2/article/details/78390985
在线教程:http://www.mycat.io/document/mycat-definitive-guide.pdf