Parameter sniffing

This seems to come up again and again on the forums.

At its heart, parameter sniffing is the ability of the SQL Server optimiser to know the values of parameters passed to a stored proc at the point that it compiles the procedure. The idea is that if the parameter values are known, then the appropriate column statistics can be used and the optimiser can estimate the number of rows that the various query operators will have to process for various different possible execution plans.

Since the approximate number of rows is known, the cost of each possible plan can be more accurately calculated and a more accurate execution plan can be selected.

So, why is parameter sniffing so often a problem? Well, mainly, because parameter values do change.

Parameter sniffing is most commonly a problem when data distribution is skewed, when there’s some parameter values that return a small number of rows and some that return a large number of rows. Take this as an example. (full code at the end as always)

select * from largetable where somestring = 'zzz'
-- there are 9 rows matching this  in a 10000 row table.

The optimiser picks an index seek with a lookup back to the cluster.

select * from largetable where somestring = 'abc'
--  there are around 1660 rows matching this in a 10000 row table.

The optimiser picks a clustered index scan.

So far, all’s well. For each query the more optimal plan has been selected. Now, let’s move those into a stored proc.

CREATE PROCEDURE TestSniffing @StringVar VARCHAR(10) AS
SELECT *  from largetable where somestring = @StringVar

Now, try the first again
EXEC TestSniffing ‘zzz’
As expected, we get an index seek.

And the second
EXEC TestSniffing ‘abc’
We also get an index seek, even though it is not the optimal plan. Why? Well, a close examination of the xml plan shows the following in it.

<ColumnReference Column="@StringVar" ParameterCompiledValue="'zzz'" />

When the plan was compiled, the parameter value was ‘zzz’. For all subsequent calls to that proc until the plan gets removed from cache, the plan that was optimal for the value ‘zzz’ is used.

The main symptom of a parameter sniffing problem is a proc that runs well with one parameter and badly with another, however the query within the proc runs well in QA/management studio, regardless of what parameter value is used.

So, what can be done if there’s a parameter sniffing problem?

One option is to use local variables instead of parameters in the query. The earlier proc would then look like this

CREATE PROCEDURE TestSniffing @StringVar VARCHAR(10) AS
DECLARE @InnerString VARCHAR(10)
SET @InnerString = @StringVar
SELECT *  from largetable where somestring = @InnerString

This works because the optimiser can’t see the value of local variables when it compiles the query, so it has to make a guess as to the number or rows affected. The guess that it makes is based on the average density of values in the column for an equality match, and 1/3 of the table for an inequality match. This way we’re guaranteed never to have a bad plan, but we also may not ever get an excellent plan.

A second option is to mark the proc or the query for recompilation each time it runs. Essentially telling SQL that there’s no single optimal plan for the proc/query, so please compile a new one at each execution.

The third option, available on SQL 2005, is to the the OPTIMISE FOR query hint to specify exactly what value the optimiser should use to estimate the affected rows. Once optimise for is used, the actual value of the parameter that the query compiled with is quite meaningless. This means that the query will run well for values of the parameter that affect similar number of rows to the value specified in the optimise for, and may run poorly for values that have very different number of affected rows.

Code:

CREATE TABLE LargeTable (
ID INT,
SomeString CHAR(6),
RandomDate DATETIME
)
GO

;WITH DataPopulate (RowNo, Strng,ADate) AS (
SELECT 1 AS RowNo, 'abc' as Strng,
DATEADD(dd, FLOOR(RAND()*75454),'1753/01/01')
UNION ALL
SELECT rowNo+1, CASE when rowno%1000=0 then 'zzz' else char(96+floor(rand(RowNo*96322)*6)) + 'bc' end as Strng,
DATEADD(dd, FLOOR(RAND(RowNo*96322)*85454),'1753/01/01')
FROM DataPopulate WHERE RowNo<10000
)
INSERT INTO LargeTable
SELECT * FROM DataPopulate
OPTION (MAXRECURSION 10000)
GO

CREATE CLUSTERED INDEX idx_Large1 on LargeTable (ID)
GO

CREATE INDEX idx_Test ON LargeTable(SomeString)
GO

There are two other aspects to parameter sniffing, and those are detailed in the following posts.

4 Comments

  1. Sandesh nagaraj

    This really helped me a lot. Thanks

    Reply
  2. Pingback: Beware of the sniff « did i say that?!

  3. Andre Guerreiro Neto

    I just saw your posts about this subject now and I must say that they’re probably the best I’ve read about parameter sniffing. You have the ability to explain things in a clear way without too much text and unnecessary theory. Thank you and keep up the good work.

    Reply
  4. Derek

    I’ve been struggling to understand this concept. Thanks for this post.

    Reply

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.