Parameter sniffing, pt 3

Last, but not least, here’s one final look at parameter sniffing.

In part 1 of this mini-series I wrote about data skew and the problems it can cause with parameters. In part 2, I looked at what happens when the optimiser can’t sniff values. In this post, I’m going to look at a third thing that can cause parameter sniffing problems. It’s fairly obvious once you know about it, but the first time it was pointed out to me I was a little surprised.

So, to recap. When the query optimiser gets a stored procedure to compile, it knows the values of any parameters passed to that procedure. It will then compile the procedure and optimise the queries based upon the values of those parameters. The optimiser cannot sniff the values of variables, because the variables have not been set at the time that the procedure is optimised.
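As a quick sketch of that last point (the procedure name here is mine, and largetable is the sample table from part 1, recapped below), copying a parameter into a local variable is enough to defeat sniffing:

CREATE PROCEDURE TestVariableRecap @StringVar VARCHAR(10) AS
DECLARE @LocalVar VARCHAR(10)
-- The assignment only happens at execution time, so at compile time
-- the optimiser has no value for @LocalVar and falls back to an
-- average estimate derived from the column's statistics.
SET @LocalVar = @StringVar
SELECT * FROM largetable WHERE somestring = @LocalVar
GO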

I’m going to use the same sample code and data as in the first article, to keep things consistent.
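For anyone who doesn’t have part 1 open, the test objects looked something like this. Treat it as a sketch from memory rather than the exact script; the precise definitions, index names and row counts are in the first article. What matters here is the nonclustered index on somestring and the heavily skewed data.

CREATE TABLE largetable (
id INT IDENTITY(1,1) NOT NULL,
somestring VARCHAR(10) NOT NULL,
CONSTRAINT pk_largetable PRIMARY KEY CLUSTERED (id)
)
-- The nonclustered index that makes the seek possible
CREATE INDEX idx_largetable_somestring ON largetable (somestring)
-- Data is heavily skewed: a handful of rows for 'zzz', many for 'abc'
GO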

From the tests that were done before, I know that the query

select * from largetable where somestring = 'zzz'

executes optimally with an index seek and returns 9 rows. Likewise, I know that the query

select * from largetable where somestring = 'abc'

executes optimally with a clustered index scan and returns 1667 rows.

Now, let’s see if I can get the optimiser to make the wrong choice.

CREATE PROCEDURE TestSniffing3 @StringVar VARCHAR(10) AS
IF @StringVar not like 'a%'
SET @StringVar = 'abc'

SELECT *  from largetable where somestring = @StringVar

GO

Looks simple enough. If the parameter isn’t within a certain range, set it to some default value. This kind of stored proc construction isn’t unusual. I often see it in search procs. Pass in a NULL if you want all rows to match. Let’s see what the optimiser makes of it.

EXEC TestSniffing3 'zzz'

It returns 1667 rows, as expected. Now take a look at the execution plan. It’s doing an index seek, which is neither what was expected nor optimal. On my machine the key lookup comes in at 87% of the query’s cost.

So, what went wrong?

A look at the properties of the index seek gives a clue: estimated rows 9, actual rows 1667. The XML form of the execution plan gives another clue.

<ColumnReference Column="@StringVar"
ParameterCompiledValue="'zzz'" ParameterRuntimeValue="'abc'" />

When this procedure was compiled, the value of the parameter was ‘zzz’. The optimiser compiled a plan optimal for that value. It didn’t and couldn’t know that the parameter was going to change within the stored procedure before the query was executed.
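As a side note, the compiled value can also be dug out of the plan cache without rerunning the query. Something along these lines (the LIKE filter is just a quick way of finding the proc) returns the cached XML plan, which carries the ParameterCompiledValue attribute. Bear in mind that the cached plan only has the compiled value; the runtime value only appears in an actual execution plan.

SELECT st.text, qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
WHERE st.text LIKE '%TestSniffing3%'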

This is something that can really bite badly, because there often doesn’t appear to be a good reason for the plan to be wrong, and most people don’t go reading the raw XML of the execution plans, for good reason. It especially needs to be kept in mind on procs that have multiple optional parameters, like the following:

CREATE PROCEDURE DoSearch @var1 varchar(10) = NULL, @var2 varchar(10) = NULL
AS
IF @var1 IS NULL
SET @var1 = '%'

IF @var2 IS NULL
SET @var2 = '%'

SELECT * FROM SomeBigTable WHERE Col1 LIKE @var1 AND Col2 LIKE @var2
GO

If, when the proc is first compiled, either of the parameters is NULL, SQL will compile a plan optimal for no rows, because a LIKE NULL will not match anything. If the query then goes on to match the entire table, the plan is not going to be optimal at all.
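One way out of this (not the only one) is a statement-level OPTION (RECOMPILE): at the point of the statement recompile the optimiser can sniff the values that the variables actually hold, at the price of a compile on every execution. A sketch of the same proc with the hint added (the proc name is just for illustration):

CREATE PROCEDURE DoSearchRecompile @var1 varchar(10) = NULL, @var2 varchar(10) = NULL
AS
IF @var1 IS NULL
SET @var1 = '%'

IF @var2 IS NULL
SET @var2 = '%'

-- The statement recompiles on each execution, so the optimiser
-- sees the modified values of @var1 and @var2, not the NULLs.
SELECT * FROM SomeBigTable WHERE Col1 LIKE @var1 AND Col2 LIKE @var2
OPTION (RECOMPILE)
GO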

That, I think, concludes the short series on parameter sniffing. As always, I’m interested in hearing other people’s takes on these problems and their favourite solutions.

13 Comments

  1. Rob

    Hi Gail,

    Great articles you’ve written here.

    I’ve been reading these posts on sniffing with interest as I’ve had some problems with this in the past.

    It’s probably me but it seems that you’re stuffed either way.

    Would it be better to have a plan that’s optimised for a small rowset but returns a large one or vice versa?

    I realise that modifying the params within the proc will cause problems but is there a way of covering ‘abc’ and ‘xyz’ optimally?

  2. Gail

    I haven’t tested, but gut feel is that you’ll have fewer problems with a plan that’s optimised for more rows than it gets.

    It’s still not ideal: memory allocations will be larger than necessary and expensive operators will be used where cheap ones would have sufficed, but it shouldn’t be as bad as a case where an operator that’s only optimal for small row counts was selected but lots of rows were affected.

    I think I’d use something like that as a last resort when all other fixes have not worked.

  3. Rob

    Thank you for the reply,

    I guess it’s a case of getting stats on the mean row count over, say, a month and forcing the plan to optimise based on that if there is a strong trend, rather than hoping the first execution will suffice.

    What I can gather is that in the case of a query that has a wide spread of row counts and no strong trend, it would be better to have it scan.

    If it were seeking and returned a large rowset, it would be more inefficient compared to a scan only picking out one row, if that’s all that was returned?

    On that note, off the top of your head do you know the overhead of forcing a recompile every execution? My gut feeling is that would just be daft.

    I appreciate that this isn’t a support forum, I should probably ask over on SQLServerCentral.com 🙂 but it would be nice to know that I’m thinking along the right lines.

  4. Gail

    Please do ask this over at Central. It’s easier to go into detail in a post reply there than it is in blog comments. Post the link here and I’ll pick it up.

  5. joe Tigeleiro

    Hi Gail,
    A well written series. Looking at the XML plans, the parameters appear as two different values: the ParameterCompiledValue and the ParameterRuntimeValue. What is the difference between the two?

  6. Gail

    The compiled value is the value that was passed as a parameter for the first execution of the query, the one that got optimised. The runtime value is the value of the parameter for this particular execution.

  7. joe Tigeleiro

    So just to make sure I understand this. If I read the plan and there is only the compiled version, then this is what was used for this execution. If the runtime was listed, then that was used. I have seen plans where both were listed with the same parameter values. What should I count on?

  8. Gail

    No. If you look at a plan and there’s only the compiled, you’ve got an estimated execution plan.
    If the plan you are looking at has both compiled and runtime, then it’s an actual execution plan.

    https://www.sqlinthewild.co.za/index.php/2009/02/19/estimated-and-actual-execution-plan-revisited/

    If they’re both the same value then it means that the plan was compiled with the same values as it was run with.

  9. joe Tigeleiro

    Thank you for your responses and the link that you provided. The article was very informative and helpful.

  10. azad

    hi gail…great article..thanks a ton.

  11. Lee Crain

    Gail, great series of articles. Thanks for taking the time to write them.

    A professional anecdote on this subject:

    In my last job, I noticed that the DBCC FREEPROCCACHE command was being executed as a SQL Agent job at frequent intervals on 2 production servers. I talked to the VP of Software Engineering and found that he’d created the job to reduce the size of the SQL plan cache because he’d had a problem with it becoming huge.

    To make a long story short, to solve a perceived problem with parameter sniffing involving searches on date ranges, he’d implemented a ubiquitous “solution” in hundreds of stored procedures by setting local variables to the values of passed-in parameters, to force query plan recompilations. This had the undesirable side effect of causing plan cache bloat.

    None of the old query plans were being used. New ones for the same stored procedures were being generated every time the stored procedures were called.

    To eliminate the cache plan bloat, I implemented “WITH RECOMPILE” in the hundreds of affected stored procedures. This stopped cache bloat on the servers.

    Interestingly, the Books Online documentation on “WITH RECOMPILE” states that it forces parameter sniffing, which in this case remained nullified by the setting of local variables.

    Ultimately, testing showed that this combination of WITH RECOMPILE and the setting of local variables to thwart parameter sniffing produced the best performance on the 2 servers for the affected stored procedures. Why, I’m not certain. But testing did confirm it.

  12. Gail (Post author)

    Hi Lee

    Can’t tell from your post what the problem you had was caused by, but using variables as you describe in procedures neither forces recompiles nor prevents plan reuse. It just causes the optimiser to come up with an ‘average’ plan, not one targeted to the parameter values. I would guess there was something more going on to cause the plan cache bloat.

    Option(Recompile) ‘forces’ parameter sniffing, not WITH RECOMPILE, because at the point of the statement recompile the optimiser can sniff the value of both variables and parameters.

  13. Martin Smith

    Hello,

    Regarding your final point about passing in NULL as a parameter, AFAIK this is not sniffed and it just uses the same cardinality estimates as it would for variables or OPTIMIZE FOR UNKNOWN.

    On SQL Server 2012 the estimated number of rows for the following proc is 134.126 rather than 0.

    CREATE PROC P
    @name nvarchar(35) = NULL
    AS

    SET @name = ISNULL(@name, '%')

    SELECT *
    FROM master..spt_values
    WHERE name LIKE @name

    GO

    EXEC P

