程序员人生 网站导航

看Sybase官方手册学索引工作原理

栏目:sybase时间:2014-01-16 14:14:41

 

Sybase数据库简介

 

Sybase公司成立于1984年,5自学网,公司名称“Sybase”取自“system”和“database”相结合的含义。Sybase公司的第一个关系数据库产品是1987年5月推出的Sybase SQLServer1.0。Sybase首先提出Client/Server 数据库体系结构的思想,并率先在Sybase SQLServer 中实现。Sybase觉得单靠一家力量,难以把SQLServer(那时不叫ASE)做到老大,于是联合微软,共同开发。后来1994年,两家公司合作终止。截止此时,应该是都拥有一套完全相同的SQLServer代码。Sybase SQLServer后来为了与微软的MS SQL Server相区分,改名叫:Sybase ASE(Adaptive Server Enterprise),其实,应该改名字的是微软。Sybase ASE仍然保持着大型数据库厂商的地位。在电信、交通、市政、银行等领域,拥有强大的市场。


不过似乎多是大公司的遗留系统。正是上面的历史原因,Sybase中不少语法跟MS SQLServer的T-SQL很像。现在网上的Sybase资料和文档比较少,多是很多年以前的了。这个Sybase的在线帮助手册算是比较完整的了,地址是。

 

以下是手册里第十二章索引如何工作的,对Sybase索引的工作原理讲解的比较易懂。并且大部分理论应该同样适用于其它数据库,所以还是比较有参考价值的。

 

Chapter 12: How indexes work

 

Indexes are the most important physical design element in improving database performance:

  • Indexes help prevent table scans. Instead of reading hundreds of data pages, a few index pages and data pages can satisfy many queries.

  • For some queries, data can be retrieved from a nonclustered index without ever accessing the data rows.

  • Clustered indexes can randomize data inserts, avoiding insert “hot spots” on the last page of a table.

  • Indexes can help avoid sorts, if the index order matches the order of columns in an order by clause.

  • In addition to their performance benefits, indexes can enforce the uniqueness of data.

    Indexes are database objects that can be created for a table to speed direct access to specific data rows. Indexes store the values of the key(s) that were named when the index was created, and logical pointers to the data pages or to other index pages.

    Although indexes speed data retrieval, they can slow down data modifications, since most changes to the data also require updating the indexes.

     

    索引可以防止全表扫描,对某些查询无需访问数据页(复合索引),聚集索引避免频繁插入新数据到最后一页,避免排序。

     

    Types of indexes

    Adaptive Server provides two types of indexes:

  • Clustered indexes, where the table data is physically stored in the order of the keys on the index:

  • For allpages-locked tables, rows are stored in key order on pages, and pages are linked in key order.

  • For data-only-locked tables, indexes are used to direct the storage of data on rows and pages, but strict key ordering is not maintained.

  • Nonclustered indexes, where the storage order of data in the table is not related to index keys

  • You can create only one clustered index on a table because there is only one possible physical ordering of the data rows. You can create up to 249 nonclustered indexes per table.

    A table that has no clustered index is called a heap. The rows in the table are in no particular order, and all new rows are added to the end of the table. Chapter 8, “Data Storage,” discusses heaps and SQL operations on heaps.

     

    聚集索引的数据页上的数据是根据索引键排好序的,因此一张表只能有一个聚集索引。没有聚集索引的表也叫堆。

     

    Index pages

    Index entries are stored as rows on index pages in a format similar to the format used for data rows on data pages. Index entries store the key values and pointers to lower levels of the index, to the data pages, or to individual data rows.

    Adaptive Server uses B-tree indexing, so each node in the index structure can have multiple children.

    Index entries are usually much smaller than a data row in a data page, and index pages are much more densely populated than data pages. If a data row has 200 bytes (including row overhead), there are 10 rows per page.

    An index on a 15-byte field has about 100 rows per index page (the pointers require 4–9 bytes per row, depending on the type of index and the index level).

    Indexes can have multiple levels:

  • Root level

  • Leaf level

  • Intermediate level

  •  

    B-tree平衡树,即父节点可以有多个子节点(不像二叉树只有两个)。

     

    Root level

    The root level is the highest level of the index. There is only one root page. If an allpages-locked table is very small, so that the entire index fits on a single page, there are no intermediate or leaf levels, and the root page stores pointers to the data pages.

    Data-only-locked tables always have a leaf level between the root page and the data pages.

    For larger tables, the root page stores pointers to the intermediate level index pages or to leaf-level pages.

     

    对于很小的表,只需一个根索引页即可。大表可能会有很多中间页。

     

    Leaf level

    The lowest level of the index is the leaf level. At the leaf level, the index contains a key value for each row in the table, and the rows are stored in sorted order by the index key:

  • For clustered indexes on allpages-locked tables, the leaf level is the data. No other level of the index contains one index row for each data row.

  • For nonclustered indexes and clustered indexes on data-only-locked tables, the leaf level contains the index key value for each row, a pointer to the page where the row is stored, and a pointer to the rows on the data page.

    The leaf level is the level just above the data; it contains one index row for each data row. Index rows on the index page are stored in key value order.

  •  

    页级别的索引页包含每行数据的名值对,并且索引页上的索引项是按索引键排好序的。

    对于聚集索引来说,页级别索引页就是数据页。对于非聚集索引,页级别包含所有数据行的索引项。(具体原因继续向下看)

     

    Intermediate level

    All levels between the root and leaf levels are intermediate levels. An index on a large table or an index using long keys may have many intermediate levels. A very small allpages-locked table may not have an intermediate level at all; the root pages point directly to the leaf level.

     

    Index Size

     describes the new limits for index size for APL and DOL tables:


    Table 12-1: Index row-size limit

    Page size

    User-visible index row-size limit

    Internal index row-size limit

    2K (2048 bytes)

    600

    650

    4K (4096bytes)

    1250

    1310

    8K (8192 bytes)

    2600

    2670

    16K (16384 bytes)

    5300

    5390

    Because you can create tables with columns wider than the limit for the index key, these columns become non-indexable. For example, if you perform the following on a 2K page server, then try to create an index on c3, the command fails and Adaptive Server issues an error message because column c3 is larger than the index row-size limit (600 bytes).

    create table t1 (c1 int,c2 int,c3 char(700))

    “Non-indexable” does not mean that you cannot use these columns in search clauses. Even though a column is non-indexable (as in c3, above), you can still create statistics for it. Also, if you include the column in a where clause, it will be evaluated during optimization.

     

    列的长度不能大于索引项的最大长度,否则会报错。

     

    Clustered indexes on allpages-locked tables

    In clustered indexes on allpages-locked tables, leaf-level pages are also the data pages, and all rows are kept in physical order by the keys.

    Physical ordering means that:

  • All entries on a data page are in index key order.

  • By following the “next page” pointers on the data pages, Adaptive Server reads the entire table in index key order.

  • On the root and intermediate pages, each entry points to a page on the next level.

     

    Clustered indexes and select operations

    ------分隔线----------------------------
    ------分隔线----------------------------

    最新技术推荐