Dynamic SQL and SQL injection

When I wrote about the catch-all queries, someone asked why the dynamic SQL that I offered wasn’t vulnerable to SQL injection. I thought I’d go into the whys and the wherefores of that in a little bit more detail.

I’m just going to look at SQL injection from the aspect of dynamic SQL. The front-end code aspect has been dealt with hundreds of times, most recently here – http://www.simple-talk.com/community/blogs/philfactor/archive/2009/03/30/72651.aspx

The most important thing to realise with SQL Injection (and with all other forms of command injection) is that it requires that a user-inputted string be incorperated as part of a command that’s going to be executed. Not as part of a paramter value, but as part of the command itself.

Let me show you want I mean.

DECLARE @sSQL varchar(500)
SET @sSQL = 'SELECT * FROM sys.objects'


In this exeedingly simple example, there’s no possibility for SQL injection. There’s no user-inputted string that can become part of the command. Let’s look at two slightly more complex examples

Example 1:

DECLARE @inputParam VARCHAR(100) -- Assume this comes from user input
DECLARE @sSQL varchar(500)


IF @inputParam = 'Table1'
SET @sSQL = @sSQL + 'Table1'
IF @inputParam = 'Table2'
SET @sSQL = @sSQL + 'Table2'
IF @inputParam = 'Table3'
SET @sSQL = @sSQL + 'Table3'
IF @inputParam = 'Table4'
SET @sSQL = @sSQL + 'Table4'


Example 2:

DECLARE @inputParam VARCHAR(100) -- Assume this comes from user input
DECLARE @sSQL varchar(500)

SET @sSQL = 'SELECT * FROM ' + @inputParam



Catch-all queries

The query form that I refer to as ‘Catch-all’ typically results from search screens in the application where the user may enter any one (or more) of a number of optional parameters. One of the more common ways for such a query to be written in SQL is with multiple predicates in the where clause of the form (WHERE SomeColumn = @SomeVariable OR @SomeVariable IN NULL)

Now this does work, the problem is that it works fairly inefficiently and, on large tables, can result in really poor query performance. I’m going to take a look at why that is the case and what alternatives there are.

Erland Sommarskog has written on this as well, and in a lot more detail than I’m going to. His article on dynamic search conditions is well worth reading, as are the rest of his articles.

A typical example of a ‘catch-all’ query would be this one, based off a table in the AdventureWorks database.

(@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
SELECT ProductID, ReferenceOrderID, TransactionType, Quantity,
TransactionDate, ActualCost from Production.TransactionHistory
WHERE (ProductID = @Product Or @Product IS NULL)
AND (ReferenceOrderID = @OrderID OR @OrderID Is NULL)
AND (TransactionType = @TransactionType OR @TransactionType Is NULL)
AND (Quantity = @Qty Or @Qty is null)

Now, let’s say that I run that query and pass values for the ProductID and the Transaction type. Let’s further say that there’s a nonclustered index (called idx_TranHistory_TranTypeProductID) on those two columns.

EXEC SearchHistory @Product = 978, @TransactionType = 'W'

Now this returns 52 rows out of 980000 that are in the table, so we’d expect that SQL would use an index seek operation on that index, followed by a bookmark lookup. (more…)

On the OUTPUT of a data modification

or “Who needs a trigger anyway?

The output clause was, I think, one of those wonderful features of SQL 2005 that very few people used, myself included. Now in 2008, it’s even better, but still doesn’t appear to be widely used.

The output clause can be used to get, as a resultset, data from the inserted and deleted tables that are usually only visible in a trigger. As a very simple example:

Create Table #Testing (
id int identity,
somedate datetime default getdate()

insert into #Testing
output inserted.*
default values 

Neat. We can get back the inserted values as a result set. We can also insert them into a table variable for later processing. Using the same temp table

declare @OutputTable TABLE (id int, somedate datetime)

insert into #Testing
output inserted.* into @OutputTable
default values

select * from @OutputTable 

Very neat. Now how about a practical example? Say we have the following three tables in a database.

Create Table ParentTable (
ID int identity primary key,
ParentDescription varchar(50),
ID Int identity Primary Key,
ParentID int not null constraint fk_parent foreign key references ParentTable(ID),
Somedescription varchar(20),
SomeValue Money

Create Table AuditTable (
AuditID int identity primary key,
ChildID int,
SomeValue Money,

We get a set of data (perhaps in a temp table, perhaps in an xml document) that needs to be inserted into those tables. The source data will have multiple parent rows, each with multiple child rows. Those need to be inserted into the appropriate tables and the foreign keys have to be assigned correctly. In addition, the ID of the child rows, along with the value and the current date must be written into an audit table, along with the login name of the current user.


Common T-SQL mistakes

I have the pleasure at the moment of doing a code review on some vendor code. No names will be mentioned. I’ve seen better. I’ve seen a lot better. I’m seeing very common mistakes in the code, so, in the interests of my sanity, I’m going to go over a couple of common T-SQL mistakes in the hopes that the next batch of code I get to review doesn’t have these mistakes in…

1. Error Handling

Proper error handling is hard. SQL 2005 has made it a lot easier with the TRY…CATCH blocks, but it still means that everything that can throw an error be wrapped inside a TRY block, with an appropriate CATCH block to handle any errors.

It was a lot harder on SQL 2000 when all we had to work with was @@Error. What I think was not well understood was what statements set and reset @@Error, and how long a non-zero value persists, leading to code constructs like this

Insert into SomeTable ...
Update SomeTable SET ...
Delete From SomeOtherTable ...

IF @@Error !=0
Print 'An error occured'


Comparisons with NULL

Or ‘True, False or FileNotFound’ (1)

Null seems to confuse a lot of people. The main issue with null is that it is not a value. It’s the absence of a value, a place holder for unknown data. Hence, when it is used in a comparison, it returns results that look surprising. Take this as a quick example

SET @Var1 = NULL

CASE WHEN @Var1 = NULL THEN ‘Var1 = Null’ ELSE ‘Var1 <> NULL’ END AS EqualityTest,
CASE WHEN @Var1 <> NULL THEN ‘Var1 <> Null’ ELSE ‘Var1 = NULL’ END AS InequalityTest

The results look a little strange. They’re supposed to. (No, I didn’t mean IS NULL/IS NOT NULL)


Temp tables and table variables

I’ve encountered a fair bit of confusion on various forums as to the differences between temporary tables and table variables. As a quick article (I’m knee-deep in some AI stuff at the moment) I thought I’d quickly go over some points on temp tables and table variables.

Temporary Tables

  • Created using the Create table syntax, preceding the table name with a’#’ for a local temp table and ‘##’ for a global temp table
  • Allocated storage space within the TempDB database and entered into the TempDB system tables 1
  • The table’s actual name is the name is was created with, a large number of underscores and a hash value, to prevent object name collisions if two connections create a temp table with the same name
  • Can have a primary key, defaults, constraints and indexes (however the names of these are not hashed, possibly leading to duplicate object errors for constraints and defaults)
  • May not have triggers.
  • Foreign keys are permitted, but are not enforced
  • Have column statistics kept on them. The algorithm for determining when to update is different to permanent tables
  • Exist until they are dropped, or the connection closes.
  • Are visible in any child procedures called from the one where the table was created. Are not visible to parent procedures
  • Are not persisted to disk unless there is memory pressure, or the table is too large to fit in the data cache


DateTime Manipulation

The date time data type and the date time functions within SQL are things that I see coming up time and time again in news groups and forums. Questions on how to get rid of the time, how to get the first day of the week, the last day of the month and so on. With the new Date and Time data types coming in SQL 2008, things will get easier, nut until then we have to do things the hard way.

In systems I’ve worked on I’ve seen several implementations of functions to find the first and last day of a week, a month or a quarter. Some have worked well, some have worked and others, well, haven’t


Data Conversion Woes, pt 2

In Part 1 I discussed why some queries that look like they should work fail with a conversion error. In this part, I’m going to go over some solutions to the problem.

  1. Probably best solution, at least in the long term is, as I mentioned previously, to correct the data type of the column and to remove the data that doesn’t fit. This is probably, however, the hardest of the solutions to implement, due to the potential impact.


Data Conversion Woes

Or “I’m filtering on ISDATE(dt) = 1. Why am I still getting conversion errors?

Well, to put it simply, because SQL’s evaluating the where clause in a different order to what you may have expected.

Let’s try a simple example.

SELECT * FROM tbl_TestingConversion
WHERE ISDate(TheDate)=1
 AND CAST(TheDate AS DATETIME) > '2007/07/01'