{"id":1831,"date":"2016-02-23T16:30:00","date_gmt":"2016-02-23T14:30:00","guid":{"rendered":"https:\/\/www.sqlinthewild.co.za\/?p=1831"},"modified":"2016-02-29T23:26:27","modified_gmt":"2016-02-29T21:26:27","slug":"monitoring-wait-stats","status":"publish","type":"post","link":"https:\/\/www.sqlinthewild.co.za\/index.php\/2016\/02\/23\/monitoring-wait-stats\/","title":{"rendered":"Monitoring wait stats"},"content":{"rendered":"<p>This post, like last week\u2019s, is based off the presentation I did to the DBA Fundamentals virtual chapter.<\/p>\n<p>The request was for more details on the method I use to capture wait and file stats on servers, The methods are pretty similar, so I\u2019ll show waits.<\/p>\n<p>This is by no means the only way of doing it, it\u2019s the way I do it.<\/p>\n<h3>Part the First: Capture job<\/h3>\n<p>This is the easy part. Into a job step goes the following:<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">INSERT\u00a0 INTO Performance.dbo.WaitStats\r\nSELECT\u00a0 wait_type as WaitType,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 waiting_tasks_count AS NumberOfWaits,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 signal_wait_time_ms AS SignalWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 wait_time_ms - signal_wait_time_ms AS ResourceWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 GETDATE() AS SampleTime\r\nFROM\u00a0\u00a0\u00a0 sys.dm_os_wait_stats\r\nWHERE\u00a0\u00a0 wait_time_ms &gt; 0\r\n\u00a0\u00a0 \u00a0AND wait_type NOT IN (&lt;list of waits to ignore&gt;);<\/pre>\n<p>Schedule the job to run on an interval for a couple of days. I like to run it every 15 min, maybe every half an hour. I\u2019m trying to get overall behaviour, not identify queries. If I need later to see what queries incur a particular wait, I can use an extended event session.<\/p>\n<p>For the list of waits to ignore, I use Glenn\u2019s list, the latest version found at <a title=\"http:\/\/www.sqlskills.com\/blogs\/glenn\/sql-server-diagnostic-information-queries-detailed-day-14\/\" href=\"http:\/\/www.sqlskills.com\/blogs\/glenn\/sql-server-diagnostic-information-queries-detailed-day-14\/\">http:\/\/www.sqlskills.com\/blogs\/glenn\/sql-server-diagnostic-information-queries-detailed-day-14\/<\/a><\/p>\n<p>I run this no less than a day, preferably a week if I can. 2-3 days is normally what I get.<\/p>\n<h3>Part the Second: Analysis script<\/h3>\n<p>The analysis script does two things:<\/p>\n<ul>\n<li>Get the wait times within an interval<\/li>\n<li>Pivot them so that I can easily graph in excel<\/li>\n<\/ul>\n<p>To see which waits I want to include in the pivot, I look at the 20 waits with the highest increase in the interval monitored (this requires that the server wasn\u2019t restarted during it).<\/p>\n<p>I\u2019m not necessarily going to graph and analyse all of them, but it does help ensure I don\u2019t miss something interesting (like, for example, high LCK_M_Sch_S locks every day between 08:00 and 08:45)<\/p>\n<p>For the purposes of this post, let\u2019s say the ones I\u2019m interested in for a particular analysis are LCK_M_IX, PAGELATCH_EX, LATCH_EX and IO_COMPLETION.<\/p>\n<p>To be clear, those are for this example only. Do Not copy the below code and run without specifying the waits you\u2019re interested in looking at, or the results are going to be less than useless.<\/p>\n<p>The first thing I want to do is add a Row_Number based on the times the wait stats were recorded, so that I can join and take the difference between one interval and the next. In theory it should be possible to do this with times, but the insert doesn\u2019t occur at exactly the same time, to the millisecond, each interval, hence this would require fancy date manipulation. Easier to use a ROW_NUMBER<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">SELECT\u00a0 WaitType,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 NumberOfWaits,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SignalWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ResourceWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SampleTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ROW_NUMBER() OVER (PARTITION BY WaitType ORDER BY SampleTime) AS Interval\r\nFROM\u00a0\u00a0\u00a0 dbo.WaitStats\r\nWHERE\u00a0\u00a0 WaitType IN ('LCK_M_IX', \u2018PAGELATCH_EX\u2019, 'LATCH_EX', 'IO_COMPLETION');<\/pre>\n<p>Next step, turn that into a CTE, join the CTE to itself with an offset and take the difference of the waiting tasks, the signal wait time and the resource wait time.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">WITH\u00a0\u00a0\u00a0 RawWaits\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 AS (SELECT\u00a0\u00a0\u00a0 WaitType,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 NumberOfWaits,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SignalWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ResourceWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SampleTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ROW_NUMBER() OVER (PARTITION BY WaitType ORDER BY SampleTime) AS Interval\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 FROM\u00a0\u00a0\u00a0\u00a0\u00a0 dbo.WaitStats\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 WHERE\u00a0\u00a0\u00a0\u00a0 WaitType IN ('LCK_M_IX', \u2018PAGELATCH_EX\u2019, 'LATCH_EX', 'IO_COMPLETION')\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 )\r\n\u00a0\u00a0\u00a0 SELECT\u00a0 w1.SampleTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w1.WaitType AS WaitType,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w2.NumberOfWaits - w1.NumberOfWaits AS NumerOfWaitsInInterval,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w2.ResourceWaitTime - w1.ResourceWaitTime AS WaitTimeInInterval,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w2.SignalWaitTime - w1.SignalWaitTime AS SignalWaitTimeInInterval\r\n\u00a0\u00a0\u00a0 FROM\u00a0\u00a0\u00a0 RawWaits w1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 LEFT OUTER JOIN RawWaits w2 ON w2.WaitType = w1.WaitType\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 AND w2.Interval= w1.Interval + 1;<\/pre>\n<p>Last step, pivot the results. This will pivot and show the resource wait. Change the column that\u2019s in the select and the pivot to show the others. It doesn\u2019t matter what aggregation function is used because there\u2019s only one value in each interval, so sum, avg, min and max will all give the same result (just, don\u2019t use count)<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">WITH\u00a0\u00a0\u00a0 RawWaits\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 AS (SELECT\u00a0\u00a0\u00a0 WaitType,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 NumberOfWaits,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SignalWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ResourceWaitTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 SampleTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ROW_NUMBER() OVER (PARTITION BY WaitType ORDER BY SampleTime) AS Interval\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 FROM\u00a0\u00a0\u00a0\u00a0\u00a0 dbo.WaitStats\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 WHERE\u00a0\u00a0\u00a0\u00a0 WaitType IN ('LCK_M_IX', 'PAGELATCH_EX', 'LATCH_EX', 'IO_COMPLETION')\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ),\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 WaitIntervals\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 AS (SELECT\u00a0\u00a0\u00a0 w1.SampleTime,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w1.WaitType AS WaitType,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w2.NumberOfWaits - w1.NumberOfWaits AS NumerOfWaitsInInterval,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w2.ResourceWaitTime - w1.ResourceWaitTime AS WaitTimeInInterval,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 w2.SignalWaitTime - w1.SignalWaitTime AS SignalWaitTimeInInterval\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 FROM\u00a0\u00a0\u00a0\u00a0\u00a0 RawWaits w1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 LEFT OUTER JOIN RawWaits w2 ON w2.WaitType = w1.WaitType\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 AND w2.Interval = w1.Interval + 1\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 )\r\n\u00a0\u00a0\u00a0 SELECT\u00a0 *\r\n\u00a0\u00a0\u00a0 FROM\u00a0\u00a0\u00a0 (SELECT SampleTime, WaitType, WaitTimeInInterval FROM WaitIntervals\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ) p PIVOT ( AVG(WaitTimeInInterval) FOR WaitType IN (&#x5B;LCK_M_IX], &#x5B;PAGELATCH_EX], &#x5B;LATCH_EX], &#x5B;IO_COMPLETION]) ) AS pvt\r\n\u00a0\u00a0\u00a0 ORDER BY SampleTime;<\/pre>\n<p>And there we have a result that can easily be imported into excel (or R) and graphed or analysed further.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post, like last week\u2019s, is based off the presentation I did to the DBA Fundamentals virtual chapter. The request was for more details on the method I use to capture wait and file stats on servers, The methods are&#8230; <a class=\"read-more-button\" href=\"https:\/\/www.sqlinthewild.co.za\/index.php\/2016\/02\/23\/monitoring-wait-stats\/\">(Read more)<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"New blog post: Monitoring wait stats","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[25,15,16],"tags":[],"class_list":["post-1831","post","type-post","status-publish","format-standard","hentry","category-performance","category-sql-server","category-syndication"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p7h6n-tx","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/1831","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/comments?post=1831"}],"version-history":[{"count":8,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/1831\/revisions"}],"predecessor-version":[{"id":1853,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/posts\/1831\/revisions\/1853"}],"wp:attachment":[{"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/media?parent=1831"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/categories?post=1831"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlinthewild.co.za\/index.php\/wp-json\/wp\/v2\/tags?post=1831"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}