一、前置索引的好处
使用前置索引的好处
Normally, the entire value of a column is used to build the index - this is fine for short data types (integers and the like) but can result in a lot of data in the index for longer data types (CHAR and VARCHAR, for example). Using an index prefix allows you to make a trade off between the space required for the index and the cardinality of the index.
二、前置索引的坑
第一种情况
- 索引的长度和对应字段中的长度一致(name 与 idx_of_name)
| areas | CREATE TABLE `areas` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`areaid` varchar(20) NOT NULL,
`name` varchar(50) NOT NULL,
`cityid` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_of_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> explain select id, name from areas order by name;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| 1 | SIMPLE | areas | index | NULL | idx_of_name | 152 | NULL | 3144 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)
mysql> explain select * from areas where name = '龙潭区';
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
| 1 | SIMPLE | areas | ref | idx_of_name | idx_of_name | 152 | const | 1 | Using index condition |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)
mysql> explain select id, name from areas group by name;
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
| 1 | SIMPLE | areas | index | idx_of_name | idx_of_name | 152 | NULL | 3144 | Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.01 sec)
第二种情况:
- 索引的长度和对应字段中的长度不一致(name 与 idx_of_name)
| areas | CREATE TABLE `areas` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`areaid` varchar(20) NOT NULL,
`name` varchar(50) NOT NULL,
`cityid` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_of_name` (`name`(6))
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> explain select id, name from areas order by name;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | areas | ALL | NULL | NULL | NULL | NULL | 3144 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.01 sec)
mysql> explain select * from areas where name = '龙潭区';
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
| 1 | SIMPLE | areas | ref | idx_of_name | idx_of_name | 20 | const | 1 | Using where |
+----+-------------+-------+------+---------------+-------------+---------+-------+------+-------------+
1 row in set (0.01 sec)
mysql> explain select id, name from areas group by name;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| 1 | SIMPLE | areas | ALL | NULL | NULL | NULL | NULL | 3144 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
1 row in set (0.01 sec)
虽然在查询上用到了索引,但是在排序时,索引不生效。
使用索引就一定能加快查询效率么?不一定。上面的例子就告诉我们,前置索引
并不是一个万能药,它的确可以帮助我们在一个过长的字段中建立索引。但同时也会导致排序(order by, group by)查询都无法使用前置索引。
三、如何计算出前置索引的最佳长度
关于如何计算出,最恰当的设置索引的长度,总体思想:
# 全列选择性
SELECT COUNT(DISTINCT column_name) / COUNT(*) FROM table_name;
# 测试某一长度前缀的选择性
SELECT COUNT(DISTINCT LEFT(column_name, prefix_length)) / COUNT(*) FROM table_name;
当前置的选择性越接近全列的选择性的时候,索引效果越好。
通常可以索引开始的几个字符,而不是全部值,以节约空间并取得好的性能。这使索引需要的空间变小,但是也会降低选择性。索引选择性不重复的索引值和表中所有行的比值。高选择性的索引有好处,因为它使MySQL在查找匹配的时候可以过滤更多的行。唯一索引的选择率为1,为最佳值
。
具体操作如下:
CREATE TABLE `areas` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`areaid` varchar(20) NOT NULL,
`name` varchar(50) NOT NULL,
`cityid` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select count(distinct(name))/count(id) from areas;
+---------------------------------+
| count(distinct(name))/count(id) |
+---------------------------------+
| 0.8954 |
+---------------------------------+
1 row in set (0.01 sec)
mysql> select count(distinct left(name, 2))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 2))/count(id) |
+-----------------------------------------+
| 0.8648 |
+-----------------------------------------+
1 row in set (0.00 sec)
mysql> select count(distinct left(name, 3))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 3))/count(id) |
+-----------------------------------------+
| 0.8909 |
+-----------------------------------------+
1 row in set (0.01 sec)
mysql> select count(distinct left(name, 5))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 5))/count(id) |
+-----------------------------------------+
| 0.8941 |
+-----------------------------------------+
1 row in set (0.01 sec)
mysql> select count(distinct left(name, 6))/count(id) from areas;
+-----------------------------------------+
| count(distinct left(name, 6))/count(id) |
+-----------------------------------------+
| 0.8954 |
+-----------------------------------------+
1 row in set (0.00 sec)
前置索引为6时,选择性和列值选择性相同。那么就设定该索引的长度为6位。
mysql> select max(length(name)) from areas;
+-------------------+
| max(length(name)) |
+-------------------+
| 45 |
+-------------------+
1 row in set (0.11 sec)
mysql> select * from areas where length(name) = 45;
+------+--------+-----------------------------------------------+--------+
| id | areaid | name | cityid |
+------+--------+-----------------------------------------------+--------+
| 2624 | 530925 | 双江拉祜族佤族布朗族傣族自治县 | 530900 |
| 2965 | 622927 | 积石山保安族东乡族撒拉族自治县 | 622900 |
+------+--------+-----------------------------------------------+--------+
2 rows in set (0.01 sec)
四、索引的最大长度
mysql> select 255 * 3 from dual;
+---------+
| 255 * 3 |
+---------+
| 765 |
+---------+
1 row in set (0.00 sec)
mysql> show create table areas;
CREATE TABLE `areas` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`areaid` varchar(20) NOT NULL,
`name` varchar(255) NOT NULL,
`cityid` varchar(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_of_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8 |
mysql> alter table areas change `name` `name` varchar(256) not null;
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
mysql> alter table areas drop index `idx_of_name`;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table areas change `name` `name` varchar(256) not null;
Query OK, 3144 rows affected (0.06 sec)
Records: 3144 Duplicates: 0 Warnings: 0
mysql> show create table areas;
CREATE TABLE `areas` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`areaid` varchar(20) NOT NULL,
`name` varchar(256) NOT NULL,
`cityid` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3145 DEFAULT CHARSET=utf8
mysql> alter table areas add index `idx_of_name` (name);
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
添加索引时,如果不指定索引的长度,MySQL默认会选择该字段的长度作为索引长度。其实这里
alter table areas add index `idx_of_name` (name)
等同于
alter table areas add index `idx_of_name` (name(256))
MySQL 索引最大 bytes 为 767, 255 * 3 < 767
但是 256 * 3 > 767
所以,varchar(255)时候,创建默认索引可以成功,但是varchar(256)时,创建默认索引就不成功。
并且由上述列子可知,索引和字段相互影响,当索引设置为 idx_of_name
(name(255))
时,要改变字段的长度,也会提示 Specified key was too long
错误。
总结
根据具体的业务需求来选择索引,并不是索引都会加快查询速度。
参考
错误使用MySQL前缀索引导致的慢查询
MySQL 前缀索引
mysql前缀索引
mysql省市区邮政编码和区号级联sql文件