What is fragmentation?

It’s common knowledge that SQL indexes become fragmented over time and should be rebuilt to remove that fragmentation. But what exactly is fragmentation?

First, a little bit on index structure.

The index key defines the logical order of the index leaf pages. Each page has two pointers, one to the previous page and one to the next page in the index.

A hypothetical index on an integer column

In an ideal situation, the logical order of the pages would correspond to the physical order, and a page with higher index key values  would have a higher page number (corresponding to physical position in the file) than a page with lower index key values.

If that is the case for all pages in the index leaf level, then the index has no fragmentation. Unfortunatly, that is not normally the case. Inserts and updates to the index can cause page splits. A page split moves half the rows on one of the index leaf pages to a new page and then linkes that page into the chain. It’s unlikely that a free page will be in the correct physical logation in the file and hence after the page split, the physical and logical ordering of the pages will no longer be the same.

The logical fragmentation is a measure of how different the logical and physical orderings are. It’s a ratio of the number of pages found out of order (lower in the file than the previous page in the index) to the total number of pages in the index.

As an example, consider an index with 1000 pages in it. While traversing the index leaf level, 400 of those pages are found at a physical location lower than the page before them in the index. Hence the ratio of out of order pages to total pages is 400:1000 and we can say that the index is 40% fragmented.

Because fragmentation is purely a feature of where the pages are on disk, it only affects SQL when the table/index in question is not in memory and has to be read from disk. For this reason, it’s not usually necessary to worry about the fragmentation of tables with very few pages (<100) as they will probably remain in memory if they are heavily used.

On larger tables the presence of fragmentation at the leaf level will make scans of the index less efficient, requiring more IOs and disk time to get the data off disk. If the system already has an IO bottleneck, this can cause severe problems.

Detecting fragmentation

There are two ways to see the fragmentation of an index. The first is the older SQL 2000 way. It still works in SQL 2005 and 2008, but is deprecated in both versions and will be removed from a future version

DBCC SHOWCONTIG

Showcontig can display the fragmentation results for a single index, for all the indexes on a table or for all of the indexes in the database, depending on the options passed to it. It also can return results as a recordset, instead of text, which is very usweful for automated index maintenance

[source:SQL]
DBCC SHOWCONTIG (<Table Name>) — just for this table
DBCC SHOWCONTIG (<Table Name>, <Index Name>) — just for this index
DBCC SHOWCONTIG WITH ALL_INDEXES — everything in the DB
DBCC SHOWCONTIG WITH TABLERESULTS — results in a recordset
[/source]

sys.dm_db_index_physical_stats

This is one of the DMVs introduced with SQL 2005 and it’s the replacement for showcontig. It takes 5 parameters, the database ID, the table, the index, the partition and a setting for how detailed the results should be.

[source:SQL]SELECT * FROM sys.dm_db_index_physical_stats (NULL, NULL, NULL, NULL, NULL) — all indexes on all tables in all databases
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL); — all indexes in this DB
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(<Table Name>), NULL, NULL, NULL); — all indexes on that table
SELECT * FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(<Table Name>), <Index ID>, NULL, NULL) — stats for just that index[/source]

8 Comments

  1. Jack Corbett

    Wow what a great blog! I’ll have to add this one to my favorites! 😉

    Reply
  2. Gail

    LOL. No need to overdo it now… 😉

    Reply
  3. Mark

    Hi Gail

    With the results returned from the DMV dm_db_index_physical_stats, at which values should i start considering recreating the indexes. I have just run the query on one of our fact tables and I see the avg_fragmentation_in_percent column is quite high for all indexes on this table.

    Reply
  4. Gail

    I would say 25-30% is a good place to start. Rebuild any indexes with a fragmentation above that.

    Also check the sizes of the tables. There’s no point and often no gain in rebuilding indexes on very small tables. Don’t worry about tables with less than 100 pages

    Reply
  5. Mark

    Thanks! This is a huge table, 40 mil plus records, the clustered index fragmentation is 9% but most other non clustered indexes on this table are above 50%. Would you say this is a problem?

    Reply
  6. Gail

    It could be, especially on a table that large. When you get some downtime, rebuild the nonclustered indexes, one at a time. It shouldn’t take too long. The nonclusters (unless they are very wide) will be much smaller than the table itself

    Reply
  7. Henrico

    Hi Gail
    Thank you, this article made things a bit clearer.
    Keep up the excellent work.
    Hope to join some of your sessions in JHB in the future.

    Henrico

    Reply
  8. Tobin

    Very Good Article!!!

    Reply

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.