{"id":1623,"date":"2015-10-20T16:30:21","date_gmt":"2015-10-20T14:30:21","guid":{"rendered":"https:\/\/www.sqlinthewild.co.za\/?p=1623"},"modified":"2015-09-20T14:27:03","modified_gmt":"2015-09-20T12:27:03","slug":"of-clustered-indexes-and-ordering","status":"publish","type":"post","link":"https:\/\/www.sqlinthewild.co.za\/index.php\/2015\/10\/20\/of-clustered-indexes-and-ordering\/","title":{"rendered":"Of clustered indexes and ordering"},"content":{"rendered":"<p>There is a particularly irritating and persistent belief that indexes (usually it\u2019s the clustered that gets picked on) are always physically ordered within the data file by the key columns. That is, that the data within the database file is always ordered by the key column.<\/p>\n<p>It doesn\u2019t help that official documentation states this \u2018fact\u2019.<\/p>\n<p>I\u2019m going to diverge from my usual methodology of first proving (or disproving) a statement and then explaining it in this case.<\/p>\n<p>Do indexes (clustered or non-clustered) define the physical storage order of the rows?<\/p>\n<p>No, absolutely not.<\/p>\n<p>What indexes do is provide a logical ordering, a collection of pointers, that allow the storage engine to retrieve data from an index ordered by the index key, but that\u2019s logical ordering, it specified nothing regarding the physical ordering.<\/p>\n<p>The index structure is such that the page with key values 4, 5 and 6 will appear earlier in the index\u2019s logical ordering than the page with key values 10,11 and 12. Where these pages are in the file is not defined at all. The page with key values 10,11 and 12 could be page 240 in the database file while the page with key values 4, 5 and 6 could be page 655.<\/p>\n<p>On the data pages themselves there\u2019s no guarantee that the row with the key value 4 will appear earlier on the page than the row with the key value of 6. 6 could be the first row on the page and 4 last and that would be just fine.<\/p>\n<p>Let\u2019s prove this. Time for DBCC page and some undocumented commands.<\/p>\n<p>First up, the order of rows on the page. I\u2019m going to create a table in a nice new database (so that there are no other tables around messing things up) and populate it with some data.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">CREATE TABLE OddandEven (\r\nSomeNumber INT,\r\nFiller CHAR(500) DEFAULT ' '\r\n) ;\r\nGO\r\n\r\nCREATE UNIQUE CLUSTERED INDEX idx_SomeNumber ON OddandEven (SomeNumber);\r\nGO\r\n\r\nINSERT INTO OddandEven (SomeNumber)\r\nSELECT TOP (50) (ROW_NUMBER() OVER (ORDER BY object_id))*2 - 1 FROM sys.objects;\r\n\r\nINSERT INTO OddandEven (SomeNumber)\r\nSELECT TOP (50) (ROW_NUMBER() OVER (ORDER BY object_id))*2 FROM sys.objects;<\/pre>\n<p>So what I\u2019m doing there is simply inserting 50 odd numbers first and 50 even numbers second<\/p>\n<p>A quick check with DBCC IND shows me that page 89 of this database is a data page for this table. I\u2019m going to use dump style 2 for DBCC Page, because I want a raw binary dump with no interpretation (I\u2019m removing the portions that are just the filler, as that\u2019s just intentionally wasted space)<\/p>\n<p><span style=\"font-family: Lucida Console; font-size: x-small;\">000000000EB6AC50:\u00a0\u00a0 20020000 1000fb01 37332020 20202020 \u2020 &#8230;..\u00fb.<strong>73<\/strong><br \/>\n000000000EB6AE40:\u00a0\u00a0 20202020 20202020 20202020 20202002 \u2020\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 .<br \/>\n000000000EB6AE50:\u00a0\u00a0 00001000 fb013735 20202020 20202020 \u2020&#8230;.\u00fb.<strong>75 <\/strong><br \/>\n000000000EB6AE60:\u00a0\u00a0 20202020 20202020 20202020 20202020 \u2020<br \/>\n000000000EB6B040:\u00a0\u00a0 20202020 20202020 20202020 20020000 \u2020\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &#8230;<br \/>\n000000000EB6B050:\u00a0\u00a0 1000fb01 36342020 20202020 20202020 \u2020..\u00fb.<strong>64\u00a0 <\/strong><br \/>\n000000000EB6B060:\u00a0\u00a0 20202020 20202020 20202020 20202020 \u2020<br \/>\n000000000EB6B240:\u00a0\u00a0 20202020 20202020 20202002 00001000 \u2020\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &#8230;..<br \/>\n000000000EB6B250:\u00a0\u00a0 fb013636 20202020 20202020 20202020 \u2020\u00fb.<strong>66<\/strong><br \/>\n000000000EB6B260:\u00a0\u00a0 20202020 20202020 20202020 20202020 \u2020<\/span><\/p>\n<p>Hmm\u2026 73, 75, 64, 66. That\u2019s not the correct physical ordering\u2026 What happened here is that I inserted the odd values first, they were written to the pages then when I wrote the even numbers the pages had to split (firstly) leaving them probably around 50% full, then the even numbers were added in the empty space. SQL doesn\u2019t reorder the rows on the page (that would be expensive).<\/p>\n<p>What keeps track of the logical ordering, what rows should be read first, second, etc. to get the results back in logical ordering, is the slot array at the end of the page<\/p>\n<pre>OFFSET TABLE:<\/pre>\n<pre>Row - Offset\r\n 14 (0xe) - 7236 (0x1c44)\r\n 13 (0xd) - 3666 (0xe52)\r\n 12 (0xc) - 6726 (0x1a46)\r\n 11 (0xb) - 3156 (0xc54)\r\n 10 (0xa) - 6216 (0x1848)\r\n 9 (0x9) - 2646 (0xa56)\r\n 8 (0x8) - 5706 (0x164a)\r\n 7 (0x7) - 2136 (0x858)\r\n 6 (0x6) - 1626 (0x65a)\r\n 5 (0x5) - 5196 (0x144c)\r\n 4 (0x4) - 1116 (0x45c)\r\n 3 (0x3) - 4686 (0x124e)\r\n 2 (0x2) - 606 (0x25e)\r\n 1 (0x1) - 4176 (0x1050)\r\n 0 (0x0) - 96 (0x60)<\/pre>\n<p>That tells me that the row with the lowest key value is found at offset 0x60, the next lowest at offset 0x1050, then 0x25e, etc. The rows are not stored on this page in physical order, the slot array defines the logical order so that anything needing the rows in logical order of the index, can read them off the page that way.<\/p>\n<p>That answers the question about rows on a page. Let\u2019s now look at whether pages are always stored in physical order within the data file.<\/p>\n<p>I\u2019m going to drop the OddandEven table and create a new table with the rows sized so that only a few rows fit onto a page.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">CREATE TABLE PagePhysicalOrder (\r\n  SomeNumber INT,\r\n  Filler CHAR(800) DEFAULT ' '\r\n);\r\n\r\nCREATE UNIQUE CLUSTERED INDEX idx_TestingPhysicalOrder ON PagePhysicalOrder (SomeNumber)\r\n\r\nDECLARE @i INT = 9;\r\nWHILE @i &amp;gt;= 0\r\n  BEGIN\r\n    INSERT INTO dbo.PagePhysicalOrder (SomeNumber, Filler)\r\n    SELECT TOP (10)\r\n      ROW_NUMBER() OVER (ORDER BY (SELECT 1)) +@i*10,''\r\n      FROM sys.objects;\r\n\r\n    SET @i = @i - 1;\r\n  END<\/pre>\n<p>That gets me 100 rows in the table, written in groups of 10, with the higher values for SomeNumber being inserted first. Now, to find where the rows are stored, I\u2019m going to use the sys.fn_PhysLocFormatter function and the %%physloc%% virtual column. See <a title=\"http:\/\/www.sqlskills.com\/blogs\/paul\/sql-server-2008-new-undocumented-physical-row-locator-function\/\" href=\"http:\/\/www.sqlskills.com\/blogs\/paul\/sql-server-2008-new-undocumented-physical-row-locator-function\/\">http:\/\/www.sqlskills.com\/blogs\/paul\/sql-server-2008-new-undocumented-physical-row-locator-function\/<\/a> for more details on these.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">SELECT SomeNumber,\r\nsys.fn_PhysLocFormatter(%%physloc%%) AS RowLocation\r\nFROM dbo.PagePhysicalOrder<\/pre>\n<p><a href=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2015\/09\/RowPhysicalLocations.png\"><img loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-top: 0px; padding-left: 0px; display: inline; padding-right: 0px; border-width: 0px;\" title=\"RowPhysicalLocations\" src=\"https:\/\/www.sqlinthewild.co.za\/wp-content\/uploads\/2015\/09\/RowPhysicalLocations_thumb.png\" alt=\"RowPhysicalLocations\" width=\"241\" height=\"368\" border=\"0\" \/><\/a><\/p>\n<p>The output of the PhysLocFormatter is FileID : Page Number : Slot Index. The output shows the rows with SomeNumber 75, 76, 77 and a few others are on page 197 while rows with a lower SomeNumber (65-70) are on page 248, further into the data file than the page containing the larger values of SomeNumber.<\/p>\n<p>Hence we can say that the clustered index doesn\u2019t enforce the physical order of the pages in the data file either.<\/p>\n<p>The only thing that the clustered index (or nonclustered indexes) enforce is what values belong on a page together. If we have a table with an index on an integer column, we cannot have a situation where rows with a key value of 1, 2, 4, 8, 9 are on one page and rows with a key value of 3, 5, 6, 7 and 10 are on another. If only 5 rows fit onto a page, one page will have 1, 2, 3, 4 and 5 and another page will have 6, 7, 8, 9 and 10.\u00a0 The physical order of the rows on those pages is irrelevant, as is the physical order of those two pages in the data file.<\/p>\n<p>I suspect this myth came about because, when SQL creates or rebuilds an index, it will try as best as possible to put the pages of the index down in physical order of the index key. Doing so reduces logical fragmentation and allows the read-ahead reads to work as efficiently as possible. This applies only when the index is created, rebuilt or reorganised, not during regular operations.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There is a particularly irritating and persistent belief that indexes (usually it\u2019s the clustered that gets picked on) are always physically ordered within the data file by the key columns. That is, that the data within the database file is&#8230; <a class=\"read-more-button\" href=\"https:\/\/www.sqlinthewild.co.za\/index.php\/2015\/10\/20\/of-clustered-indexes-and-ordering\/\">(Read more)<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"Blog post: Of clustered indexes and ordering","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[29,15,16],"tags":[],"class_list":["post-1623","post","type-post","status-publish","format-standard","hentry","category-internals","category-sql-server","category-syndication"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p7h6n-qb","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/1623","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/comments?post=1623"}],"version-history":[{"count":5,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/1623\/revisions"}],"predecessor-version":[{"id":1628,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/1623\/revisions\/1628"}],"wp:attachment":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/media?parent=1623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/categories?post=1623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/tags?post=1623"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}