上篇主要介绍了kingshard的工作流程、搭建及部分功能测试,今天来测试一下kingshard对sharding的支持
一、kingshard的sharding介绍
1.1 kingshard的sharding优势
现在开源的MySQL Proxy已经有几款了,并且有的已经在生产环境上广泛应用。但这些proxy在sharding方面,都是不能分子表的。也就是说一个node节点只能分一张表。但我们的线上需求通常是这样的:
我有一张非常大的表,行数超过十亿,需要进行拆分处理。假设拆分因子是512。 如果采用单node单数据库的分表方式,那其实这512个子表还是存在一个物理节点上,意义不大。 如果采用他们的sharding功能,就需要512个物理节点,也不现实。 面对这种需求,现有的proxy就不能很好地满足要求了。通常我们希望将512张子表均分在几个MySQL节点上,从而达到系统的横向扩展。
然而kingshard较好地实现了这种典型的需求。简单来说,kingshard的分表方案采用两级映射的方式:
1.kingshard将该表分成512张子表,例如:test_0000,test_0001,...test_511。2.将shardKey通过hash或range方式定位到其要操作的记录在哪张子表上。3.子表落在哪个node上通过配置文件设置。
1.2 sharding支持的操作
目前kingshard sharding支持insert, delete, select, update和replace语句, 所有这五类操作都支持跨子表。但写操作仅支持单node上的跨子表,select操作则可以跨node,跨子表。
1.3 sharding方式
1.3.1 range方式
基于整数范围划分来得到子表下标。该方式的优点:基于范围的查询或更新速度快,因为查询(或更新)的范围有可能落在同一张子表中。这样可以避免全部子表的查询(更新)。缺点:数据热点问题。因为在一段时间内整个集群的写压力都会落在一张子表上。此时整个mysql集群的写能力受限于单台mysql server的性能。并且,当正在集中写的mysql 节点如果宕机的话,整个mysql集群处于不可写状态。基于range方式的分表字段类型受限。
1.3.2 hash方式
kingshard采用(shardKey%子表个数)的方式得到子表下标。优点:数据分布均匀,写压力会比较平均地落在后端的每个MySQL节点上,整个集群的写性能不会受限于单个MySQL节点。并且当某个分片节点宕机,只会影响到写入该节点的请求,其他节点的写入请求不受影响。分表字段类型不受限。因为任何一个类型的分表字段,都可以通过一个hash函数计算得到一个整数。缺点:基于范围的查询或更新,都需要将请求发送到全部子表,对性能有一定影响。但如果不是基于范围的查询或更新,则性能不会受到影响。
1.3.3 按时间分表
kingshard中的分表字段支持MySQL中三种类型的时间格式
- date类型,格式:YYYY-MM-DD,例如:2016-03-04,注意:2016-3-04,2016-03-4,2016-3-4等格式kingshard都是不支持的。
- datetime类型,格式:YYYY-MM-DD HH:MM:SS,例如:2016-03-04 13:23:43,注意:2016-3-04 13:23:43,2016-03-4 13:23:43,2016-3-4 13:23:43等格式kingshard都是不支持的,必须严格按照规定的格式,kingshard才支持。
- timestamp类型,整数类型,例如:1457165568,对应的是:2016-3-5 16:12:48
1.4 sharding相关的配置介绍
在配置文件中,有关sharding设置是通过schema设置:
schema :- nodes: [node1,node2] rules: default: node1 shard: - #分表所在的DB db : kingshard #分表名字 table: test_shard_hash #sharding key key: id #子表分布的节点名字 nodes: [node1, node2] #sharding类型 type: hash #子表个数分布,表示[test_shard_hash_0000, test_shard_hash_0001, test_shard_hash_0002, test_shard_hash_003]在node1上。 #[test_shard_hash_0004, test_shard_hash_0005, test_shard_hash_0006, test_shard_hash_007]在node2上 locations: [4,4] - #分表所在的DB db : kingshard #分表名字 table: test_shard_range #sharding key key: id #sharding类型 type: range #子表分布的节点名字 nodes: [node1, node2] #子表个数分布,表示[test_shard_range_0000, test_shard_range_0001, test_shard_range_0002, test_shard_range_003]在node1上。 #[test_shard_range_0004, test_shard_range_0005, test_shard_range_0006, test_shard_range_007]在node2上 locations: [4,4] #每张子表的记录数。[0,10000)在test_shard_range_0000上,[10000,20000)在test_shard_range_0001上。.... table_row_limit: 10000
一个kingshard实例只能有一个schemas,从上面的配置可以看出,schema可以分为三个部分:
1.db,表示这个schemas使用的数据库。2.nodes,表示子表分布的节点名字。3.rules,sharding规则。其中rules又可以分为两个部分: - default,默认分表规则。所有操作不在shard(default规则下面的规则)中的表的SQL语句都会发向该node。 - hash,hash分表方式。 - range,range分表方式
1.5 分表架构图
二、sharding方式演示
2.1 range分表方式演示
2.1.1 基础环境准备
分别在三个节点创建测试表:
建表语句:CREATE TABLE `test_shard_range_0000` ( `id` bigint(64) unsigned NOT NULL, `str` varchar(256) DEFAULT NULL, `f` double DEFAULT NULL, `e` enum('test1','test2') DEFAULT NULL, `u` tinyint(3) unsigned DEFAULT NULL, `i` tinyint(4) DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8node1:mysql> show tables;+-----------------------+| Tables_in_test |+-----------------------+| test_shard_range_0000 || test_shard_range_0001 |+-----------------------+2 rows in set (0.00 sec)node2:mysql> show tables;+-----------------------+| Tables_in_test |+-----------------------+| test_shard_range_0002 || test_shard_range_0003 |+-----------------------+2 rows in set (0.00 sec)node3:mysql> show tables;+-----------------------+| Tables_in_test |+-----------------------+| test_shard_range_0004 || test_shard_range_0005 |+-----------------------+2 rows in set (0.00 sec)
kingshard配置及启动:
配置文件
addr : 0.0.0.0:7676user : tompassword : 123456web_addr : 0.0.0.0:9797web_user : adminweb_password : adminlog_path : /home/mysql/3309_kingshard/log/ks.loglog_level : debuglog_sql: onallow_ips : 127.0.0.1nodes :- name : node1 max_conns_limit : 32 user : tom password : 123456 master : xxxxxx:3309 down_after_noalive : 32- name : node2 max_conns_limit : 32 user : tom password : 123456 master : xxxxxxx:3309 down_after_noalive: 32- name : node3 max_conns_limit : 32 user : tom password : 123456 master : xxxxxxx:3309 down_after_noalive : 32schema : nodes: [node1,node2,node3] default: node1 shard:#按hash分表# -# db : test# table: test_shard_hasg# key: id# nodes: [node1, node2, node3]# type: hash# locations: [2,2,2]##按range分表# -# db : test# table: test_shard_range# key: id# type: range# nodes: [node1, node2, node3]# locations: [2,2,2]# table_row_limit: 10#按year分表 - db : test table: test_shard_year key: ctime nodes: [node1, node2, node3] type: date_year date_range: [2016,2017,2018]# -# db : kingshard# table: test_shard_month# key: dtime# type: date_month# nodes: [node1,node2]# date_range: [201603-201605,201609-201612]# -# db : kingshard# table: test_shard_day# key: mtime# type: date_day# nodes: [node1,node2]# date_range: [20160306-20160307,20160308-20160309]
注意:nodes的信息要和表信息还有schema里面的nodes信息对应,不然分表写入会有问题
启动
./bin/kingshard -config=./conf/ks.yaml
2.1.2 测试演示
在kingshard下面操作:mysql> insert into test_shard_range(id,str,f,e,u,i) values(1,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(11,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(21,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(31,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(41,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(51,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)跨节点查询:mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 2 | 3 || 11 | flike | 3.14 | test2 | 2 | 3 || 21 | flike | 3.14 | test2 | 2 | 3 || 31 | flike | 3.14 | test2 | 2 | 3 || 41 | flike | 3.14 | test2 | 2 | 3 || 51 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+10 rows in set (0.00 sec)跨节点更新:mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 2 | 3 || 11 | flike | 3.14 | test2 | 2 | 3 || 21 | flike | 3.14 | test2 | 2 | 3 || 22 | flike | 3.14 | test2 | 2 | 3 || 23 | flike | 3.14 | test2 | 2 | 3 || 31 | flike | 3.14 | test2 | 2 | 3 || 32 | flike | 3.14 | test2 | 2 | 3 || 33 | flike | 3.14 | test2 | 2 | 3 || 41 | flike | 3.14 | test2 | 2 | 3 || 51 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+10 rows in set (0.00 sec)mysql> update test_shard_range set i=100;Query OK, 10 rows affected (0.00 sec)mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 2 | 100 || 11 | flike | 3.14 | test2 | 2 | 100 || 21 | flike | 3.14 | test2 | 2 | 100 || 22 | flike | 3.14 | test2 | 2 | 100 || 23 | flike | 3.14 | test2 | 2 | 100 || 31 | flike | 3.14 | test2 | 2 | 100 || 32 | flike | 3.14 | test2 | 2 | 100 || 33 | flike | 3.14 | test2 | 2 | 100 || 41 | flike | 3.14 | test2 | 2 | 100 || 51 | flike | 3.14 | test2 | 2 | 100 |+----+-------+------+-------+------+------+10 rows in set (0.00 sec)跨节点删除:mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 2 | 100 || 11 | flike | 3.14 | test2 | 2 | 100 || 21 | flike | 3.14 | test2 | 2 | 100 || 22 | flike | 3.14 | test2 | 2 | 100 || 23 | flike | 3.14 | test2 | 2 | 100 || 31 | flike | 3.14 | test2 | 2 | 100 || 32 | flike | 3.14 | test2 | 2 | 100 || 33 | flike | 3.14 | test2 | 2 | 100 || 41 | flike | 3.14 | test2 | 2 | 100 || 51 | flike | 3.14 | test2 | 2 | 100 |+----+-------+------+-------+------+------+10 rows in set (0.00 sec)mysql> delete from test_shard_range where id >11 and id <51;Query OK, 7 rows affected (0.00 sec)mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 2 | 100 || 11 | flike | 3.14 | test2 | 2 | 100 || 51 | flike | 3.14 | test2 | 2 | 100 |+----+-------+------+-------+------+------+3 rows in set (0.01 sec)对子查询的支持:(不支持)mysql> select * from test_shard_range where id in (select u from test_shard_range where id=1);ERROR 1146 (42S02): Table 'test.test_shard_range' doesn't exist在node1下查看:mysql> select * from test_shard_range_0000;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)mysql> select * from test_shard_range_0001;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 11 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)在node2下查看:mysql> select * from test_shard_range_0002;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 21 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+3 rows in set (0.00 sec)mysql> select * from test_shard_range_0003;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 31 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+3 rows in set (0.00 sec)在node3下查看:mysql> select * from test_shard_range_0004;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 41 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)mysql> select * from test_shard_range_0005;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 51 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)
结论:range分表更能比较好用,不过不支持子查询,如果支持子查询就好了
2.2 hash分表方式演示
插入操作:mysql> insert into test_shard_range(id,str,f,e,u,i) values(1,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(2,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(3,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(4,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(5,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(6,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(7,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(8,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_range(id,str,f,e,u,i) values(9,"flike",3.14,'test2',2,3);Query OK, 1 row affected (0.00 sec)查询操作:mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 6 | flike | 3.14 | test2 | 2 | 3 || 1 | flike | 3.14 | test2 | 2 | 3 || 7 | flike | 3.14 | test2 | 2 | 3 || 2 | flike | 3.14 | test2 | 2 | 3 || 8 | flike | 3.14 | test2 | 2 | 3 || 3 | flike | 3.14 | test2 | 2 | 3 || 9 | flike | 3.14 | test2 | 2 | 3 || 4 | flike | 3.14 | test2 | 2 | 3 || 5 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+9 rows in set (0.01 sec)更新操作:mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 6 | flike | 3.14 | test2 | 2 | 3 || 1 | flike | 3.14 | test2 | 2 | 3 || 7 | flike | 3.14 | test2 | 2 | 3 || 2 | flike | 3.14 | test2 | 2 | 3 || 8 | flike | 3.14 | test2 | 2 | 3 || 3 | flike | 3.14 | test2 | 2 | 3 || 9 | flike | 3.14 | test2 | 2 | 3 || 4 | flike | 3.14 | test2 | 2 | 3 || 5 | flike | 3.14 | test2 | 2 | 3 |+----+-------+------+-------+------+------+9 rows in set (0.01 sec)mysql> update test_shard_range set u=10;Query OK, 9 rows affected (0.00 sec)mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 6 | flike | 3.14 | test2 | 10 | 3 || 1 | flike | 3.14 | test2 | 10 | 3 || 7 | flike | 3.14 | test2 | 10 | 3 || 2 | flike | 3.14 | test2 | 10 | 3 || 8 | flike | 3.14 | test2 | 10 | 3 || 3 | flike | 3.14 | test2 | 10 | 3 || 9 | flike | 3.14 | test2 | 10 | 3 || 4 | flike | 3.14 | test2 | 10 | 3 || 5 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+9 rows in set (0.00 sec)distinct函数的支持:mysql> select * from test_shard_range;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 6 | flike | 3.14 | test2 | 10 | 3 || 1 | flike | 3.14 | test2 | 10 | 3 || 7 | flike | 3.14 | test2 | 10 | 3 || 2 | flike | 3.14 | test2 | 10 | 3 || 8 | flike | 3.14 | test2 | 10 | 3 || 3 | flike | 3.14 | test2 | 10 | 3 || 9 | flike | 3.14 | test2 | 10 | 3 || 4 | flike | 3.14 | test2 | 10 | 3 || 5 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+9 rows in set (0.00 sec)mysql> select count(distinct id) as tom from test_shard_range;+-----+| tom |+-----+| 9 |+-----+1 row in set (0.00 sec)mysql> select count(distinct u) as tom from test_shard_range;+-----+| tom |+-----+| 6 |+-----+1 row in set (0.00 sec)mysql> select count(distinct i) as tom from test_shard_range;+-----+| tom |+-----+| 6 |+-----+1 row in set (0.00 sec)group by的支持:mysql> select * from test_shard_range where id > 0 group by u limit 1;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 5 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)mysql> select * from test_shard_range where id > 0 group by u limit 1;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 3 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)mysql> select * from test_shard_range where id > 0 group by u limit 1;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+1 row in set (0.00 sec)mysql> select * from test_shard_range where id > 0 group by u limit 1;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 5 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+1 row in set (0.01 sec)order by的支持:mysql> select * from test_shard_range where id > 0 order by id;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 10 | 3 || 2 | flike | 3.14 | test2 | 10 | 3 || 3 | flike | 3.14 | test2 | 10 | 3 || 4 | flike | 3.14 | test2 | 10 | 3 || 5 | flike | 3.14 | test2 | 10 | 3 || 6 | flike | 3.14 | test2 | 10 | 3 || 7 | flike | 3.14 | test2 | 10 | 3 || 8 | flike | 3.14 | test2 | 10 | 3 || 9 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+9 rows in set (0.00 sec)mysql> select * from test_shard_range where id > 0 order by id limit 5;+----+-------+------+-------+------+------+| id | str | f | e | u | i |+----+-------+------+-------+------+------+| 1 | flike | 3.14 | test2 | 10 | 3 || 2 | flike | 3.14 | test2 | 10 | 3 || 3 | flike | 3.14 | test2 | 10 | 3 || 4 | flike | 3.14 | test2 | 10 | 3 || 5 | flike | 3.14 | test2 | 10 | 3 |+----+-------+------+-------+------+------+5 rows in set (0.00 sec)
注意:
1、当更新的记录落在不同的子表,kingshard会以非事务的方式将更新操作发送到多个node上。
2、不支持distinct函数。
3、跨节点的group by可能每次出来的结果都不一样。
2.3 date分表演示
创建测试表:CREATE TABLE `test_shard_year_2018` ( `id` int(10) NOT NULL, `name` varchar(40) DEFAULT NULL, `ctime` datetime DEFAULT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8插入数据:mysql> insert into test_shard_year(id,name,ctime) values(12,"hello","2016-02-22 13:23:45");Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_year(id,name,ctime) values(12,"hello","2017-02-22 13:23:45");Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_year(id,name,ctime) values(12,"hello","2018-02-22 13:23:45");Query OK, 1 row affected (0.00 sec)mysql> insert into test_shard_year(id,name,ctime) values(12,"hello","2019-02-22 13:23:45");ERROR 1146 (42S02): Table 'test.test_shard_year_2019' doesn't exist查看数据:mysql> select * from test_shard_year;+----+-------+---------------------+| id | name | ctime |+----+-------+---------------------+| 12 | hello | 2016-02-22 13:23:45 || 12 | hello | 2017-02-22 13:23:45 || 12 | hello | 2018-02-22 13:23:45 |+----+-------+---------------------+3 rows in set (0.01 sec)mysql> /*node1*/select * from test_shard_year_2016;+----+-------+---------------------+| id | name | ctime |+----+-------+---------------------+| 12 | hello | 2016-02-22 13:23:45 |+----+-------+---------------------+1 row in set (0.00 sec)mysql> /*node1*/select * from test_shard_year_2017;ERROR 1146 (42S02): Table 'test.test_shard_year_2017' doesn't existmysql> /*node2*/select * from test_shard_year_2017;+----+-------+---------------------+| id | name | ctime |+----+-------+---------------------+| 12 | hello | 2017-02-22 13:23:45 |+----+-------+---------------------+1 row in set (0.00 sec)mysql> /*node3*/select * from test_shard_year_2018;+----+-------+---------------------+| id | name | ctime |+----+-------+---------------------+| 12 | hello | 2018-02-22 13:23:45 |+----+-------+---------------------+1 row in set (0.00 sec)mysql> /*node3*/select * from test_shard_year;ERROR 1146 (42S02): Table 'test.test_shard_year' doesn't exist
结论:date分表功能可用。
参考地址:https://github.com/flike/kingshard/
三、kingshard性能测试、优化
https://github.com/flike/kingshard/blob/master/doc/KingDoc/kingshard_performance_test.md
https://github.com/flike/kingshard/blob/master/doc/KingDoc/kingshard_performance_profiling.md