2.Index

  1. Data structure
    1. Hash Index
    2. B-Tree
  2. Query
    1. Composite Index
    2. Index Merge
    3. composite index vs index merge
  3. External Link🔗

Data structure

索引有两种存储方式/引擎:Hash Index、B-Tree。前者只有当表格是以内存形式运行时,才能调用,所以我们常常探讨的索引是B-Tree为基础的索引。

Hash Index

MySQL官方是这样描述Hash index的。[mysql]

A type of index intended for queries that use equality operators, rather than range operators such as greater-than or BETWEEN. It is available for MEMORY tables. Although hash indexes are the default for MEMORY tables for historic reasons, that storage engine also supports B-tree indexes, which are often a better choice for general-purpose queries.

MySQL includes a variant of this index type, the adaptive hash index, that is constructed automatically for InnoDB tables if needed based on runtime conditions.

B-Tree

MySQL官方是这样描述B-Tree的。[mysql]

A tree data structure that is popular for use in database indexes. The structure is kept sorted at all times, enabling fast lookup for exact matches (equals operator) and ranges (for example, greater than, less than, and BETWEEN operators). This type of index is available for most storage engines, such as InnoDB and MyISAM.

Because B-tree nodes can have many children, a B-tree is not the same as a binary tree, which is limited to 2 children per node.

Contrast with hash index, which is only available in the MEMORY storage engine. The MEMORY storage engine can also use B-tree indexes, and you should choose B-tree indexes for MEMORY tables if some queries use range operators.

The use of the term B-tree is intended as a reference to the general class of index design. B-tree structures used by MySQL storage engines may be regarded as variants due to sophistications not present in a classic B-tree design. For related information, refer to the InnoDB Page Structure Fil Header section of the MySQL Internals Manual.

See Also hash index.

B-Tree 的特点是根据磁盘结构而量身制作的数据结构。适合范围查找。当随机插入数量过大时,树的平衡将会被打破,此时树将会重新调整结构,调整结构将会带来消耗。

Query

B-Tree 有索引有两种:Composite Index、Single index。MySQL 在使用若干个 Single index 进行查询时,会通过 Index Merge 进行优化查询。

Composite Index

CREATE TABLE test (
    id         INT NOT NULL,
    last_name  CHAR(30) NOT NULL,
    first_name CHAR(30) NOT NULL,
    PRIMARY KEY (id),
    INDEX name (last_name,first_name)  # << Composite Index
);

A multiple-column index can be considered a sorted array, the rows of which contain values that are created by concatenating the values of the indexed columns.

由于多列索引(联合索引)是一个排列好的数组,多列的值则为数组的元素。因为该数组是已经排序好的,因此,只能是遵循最左优先原则(leftmost principle) [mysql]

1
2
3
4
5
6
7
# 有效检索
A
A,B
A,B,C

# 无效
A,C

Index Merge

索引融合(Index Merge)是将若干个索引的结果集进行组合。Index Merge 有三种算法:[mysql]

  • Intersection 交集

  • Union 并集
  • Sort-Union 排序后的并集

composite index vs index merge

如果 Index Merge 和 Composite Index 使用的字段相同,那么两者的磁盘空间占用几乎一致,差异为:

  • Composite Index 里的字段是按一定顺序排序的。如果查询语句的不遵循 左优先原则 ,那么 Composite Index 将无法生效。
  • Index Merge 是将若干个索引查询结果进行结合,那么 Index Merge 调用索引数个线程,而 Composite Index 的物理索引只有一个,因此仅仅会调用一个线程。

因此:

  • Index Merge 的查询时间取决于耗时最长的线程。Composite Index 是可以看作是循环查询,会根据第一个索引字段查询,查询后再使用第二个字段进行查询,这样循环下去,而且第一个查询的结果集将会影响到第二个查询的速度。Composite Index 可以说耗时取决于位于前面字段的效率。