I want to spend some time over the next few months looking at query compilation and the plan cache, and there’s a couple concepts that I want to get cleared up beforehand. The first of those is around two terms that are often used interchangeably, compile and recompile.
A compile occurs when a query is given to the query optimiser and, when it does a lookup into the plan cache, no matching plan is found. The optimism must then compile the query, generating an execution plan, must add that plan to the plan cache (in most cases) and then must pass that plan onto the query execution engine so that the query can be executed. (http://technet.microsoft.com/en-us/library/Cc966425)
A recompile is something slightly different. For a recompile, the optimiser must find a matching plan when it queries the plan cache, must hand that cached plan over to the query execution engine and then while doing validation checks the execution engine must determine that then query plan is no longer valid and request the optimiser to partially or completely recompile the query. (http://technet.microsoft.com/en-us/library/Cc966425)
Subtle difference. Both cases result in the optimiser generating an execution plan, but the reasons can be different. Also worth noting is that a compile results in a new plan in the cache, a recompile simply replaces an existing plan.
Another difference since SQL 2005 – a compile is for the entire batch, but a recompile can be for just a single statement within the batch.
Now the theory’s dealt with, let’s look at some examples and see how we can track these two events and try and get a better understanding of which occurs when and how they look.
The tools I’m going to use to track these are performance monitor with the compiles/sec and recompiles/sec counters and SQL Profiler with the event SP:StmtRecompile event (there’s no profiler event for compilation). I’ll also check what’s in the plan cache after each test.
The first one’s going to be very simplistic, a query run against an empty plan cache.
DBCC FREEPROCCACHE GO EXEC dbo.OutStandingTotalByStatus GO
What we get from that is a non-zero value for SQL Compilations/sec (perfmon) and the following from profiler (The SQL Recompiles/sec remains 0)
and the plan cache now contains one plan with one use. (for more info on how the CacheMiss and CacheInsert events work, see http://www.sqlinthewild.co.za/index.php/2010/07/27/hit-and-miss/ and http://www.sqlinthewild.co.za/index.php/2010/08/31/come-and-gone/)
In this case, I hope it was clear, we had a compile occur (empty plan cache before, new plan added to cache).
Now what happens if, with no clearing of the cache nor anything else being done, I mark that procedure for recompile and run it again?
EXEC sp_recompile OutStandingTotalByStatus GO EXEC dbo.OutStandingTotalByStatus GO
This is interesting. We don’t see a non-zero value for SQLRecompiles/sec (as might have been expected), instead, as with the previous case we get a spike for SQLCompiles/sec. The profiler output shows why
Note the SP:CacheRemove being run for the stored procedure’s plan. So sp_recompile does not mark a plan as invalid and needing recompilation. It removes it from the cache entirely. The next time that procedure runs SQL does a cache lookup, doesn’t find a matching plan and compiles a new one (compile, not recompile)
Onwards… What about if I alter the procedure?
I’m not going to clear the cache first, the procedure has a plan in there from the last test, we can use that to see the effects.
ALTER PROCEDURE OutStandingTotalByStatus AS -- This has changed! SELECT 1; SELECT o.OrderStatus, SUM(UnitPrice*Quantity) AS TotalOutstanding FROM dbo.Orders o INNER JOIN dbo.OrderDetails od ON o.OrderID = od.OrderID WHERE ShippingDate IS NULL GROUP BY OrderStatus; GO EXEC dbo.OutStandingTotalByStatus GO
Again perfmon shows a bump in Compiles/sec and the recompiles/sec is still at 0. Again, profiler shows why.
Same as when I ran the sp_recompile, the ALTER PROCEDURE triggered a SP:CacheRemove. The plan was gone from cache at the point the stored procedure ran again, so this was considered a compile, not a recompile.
btw, for those curious, if I query the plan cache between altering the procedure and running it again, the procedure’s plan really has gone.
Let’s try something more complex, an alteration of one of the base tables. I’m going to clear the cache first, for reasons that will later be clear
DBCC FREEPROCCACHE GO EXEC dbo.OutStandingTotalByStatus GO ALTER TABLE dbo.OrderDetails ADD Filler CHAR(10) GO EXEC dbo.OutStandingTotalByStatus GO
Ha! This time perfmon shows a non-zero value for SQLRecompiles/sec. The profiler trace has a different form too
There’s no SP:CacheRemove. No SP:CacheInsert either.
The alteration of the table did not result in SQL going through the plan cache and removing plans for dependant objects. That makes sense if you think about it. There could be millions of plans in cache, for SQL to check each and every one to see if it has a dependency on the table just changed would be terribly time consuming, so it’s not done. What happens instead is that the next time the procedure runs SQL looks in the cache, finds the procedure’s plan (the SP:CacheHit), does the pre-execution validations and finds that one of the base objects has changed since that plan was compiled. The plan is then sent to the optimiser to recompile that plan (the SP:Recompile event)
Here’s the other interesting thing, from a query of the plan cache
The usecounts is 2, not 1. I ran that procedure once before altering the table, once after. The alter of the table forced SQL to recompile the plan on the second execution, but the usecount for the plan reflects both executions, the one from before the recompile and the one from after.
In all the previous cases, when the plan was actually removed from cache and then compiled and inserted the usecount was only 1 because when the plan was removed all related information was removed with it. With a recompile it’s not removed, so the usecounts aren’t reset to 0.
This is important to note. There are lots of articles (and books even) that say to check plan usecounts as a low usecount is a sign of frequent recompiles. As we can now see, it’s not. It’s a sign of frequent compiles, of the plans being removed from cache and later re-added, recompiles don’t affect the usecount (well, at least not in this case, there may be cases where it does)
Now try a couple more common cases, see what they show, starting with rebuilding an index.
DBCC FREEPROCCACHE GO EXEC dbo.OutStandingTotalByStatus GO ALTER INDEX PK_OrderDetails ON dbo.OrderDetails REBUILD GO EXEC dbo.OutStandingTotalByStatus GO
It’s a recompile (and the perfmon counters confirm that). As with the previous case, the usecount shown in sys.dm_exec_cached_plans is 2 for the procedure’s plan, not 1
Lastly, a statistics update. Can anyone guess what’s going to happen here? Anyone?
DBCC FREEPROCCACHE GO EXEC dbo.OutStandingTotalByStatus GO -- Insert a few more rows into OrderDetails at this point UPDATE STATISTICS dbo.OrderDetails WITH FULLSCAN GO EXEC dbo.OutStandingTotalByStatus GO
Yup, it’s a recompile.
It is worth nothing that (in my tests anyway), if at least one row isn’t changed, the stats update has no effect. Perhaps an optimisation that checks whether there have been modifications or not before updating? I’m not sure what’s causing that.
There is a lot more I could test, SET option changes, memory pressure, configuration changes, etc, but I think I’ve done enough that anyone interested can play themselves for further info.
Ok, but what’s the use? Was this just a dive into internals with no practical usage? I don’t think so.
One thing that’s important when investigating SQL problems is to understand what the various counters, events and columns are saying and what they mean. If one is faced with a server with very high SQLCompiles/sec, it’s a waste of time to go looking at whether automatic or manual stats updates are happening too often. Similarly if the SQLRecompiles/sec is high, investigating the usage of WITH RECOMPILE and OPTION(RECOMPILE) is completely off track and is going to waste time and not help with solving the problem.
That’s why I believe it’s important to understand what the various counters, events and columns actually means and what occurrences within SQL affects them. None of us have time to spend a couple days busy investigating something that’s not related to what’s actually wrong with the server based on an incorrect understanding of what SQL is showing us.