SQL Server 2005 : Dynamic T-SQL - Supporting Optional Parameters (part 4) - sp_executesql: A Better EXECUTE

10/14/2012 7:31:40 PM

sp_executesql: A Better EXECUTE

In the previous sections, I identified two major problems with building dynamic SQL statements and executing them using EXECUTE: First of all, there is the issue of extraneous compilation and query plan caching, which makes performance drag and uses up valuable system resources. Second, and perhaps more important, is the threat of SQL injection attacks.

Query parameterization, is the key to fixing both of these problems. Parameterization is a way to build a query such that any parameters are passed as strongly typed variables, rather than formatted as strings and appended to the query. In addition to the performance benefits this can bring by allowing SQL Server to do less work when processing the query, parameterization also has the benefit of virtually eliminating SQL injection attacks.

The first step in parameterizing a query is to replace literals with variable names. For instance, the injection-vulnerable query from the previous section could be rewritten in a parameterized manner as follows (I've removed the stored procedure creation code for simplicity):

DECLARE @String NVARCHAR(60)
SET @String = 'Stone'

DECLARE @sql NVARCHAR(MAX)

SET @sql = '' +
    'SELECT AddressId ' +
    'FROM Person.Address ' +
    'WHERE AddressLine1 LIKE ''%'' + @String + ''%'''

The only thing that has changed about this query compared to the version in the last section is two additional single quotes added such that the literal value of @String is no longer concatenated with the rest of the query. Previously, the literal value of @sql after concatenation would have been as follows:

SELECT AddressId FROM Person.Address WHERE AddressLine1 LIKE '%Stone%'

As a result of this change, the literal value after concatenation is now the following:

SELECT AddressId FROM Person.Address WHERE AddressLine1 LIKE '%' + @Stone + '%'

Due to the fact that no variables are getting formatted into strings for concatenation with the rest of the query, the type of SQL injection described in the previous section is impossible in this scenario. The only thing such an attempt would yield for a hacker is a search in which no results are returned!

Trying to execute this SQL using EXECUTE results in the following exception:

Msg 137, Level 15, State 2, Line 1
Must declare the scalar variable "@String".

The reason for this is that EXECUTE runs the SQL in a different context than that in which it was created. In the other context, the variable @String has not been declared and is therefore unknown.

The solution to this problem is to use the sp_executesql system stored procedure, which allows you to pass parameters to dynamic SQL, much as you can to a stored procedure. The parameters for sp_executesql are a Unicode (NVARCHAR or NCHAR) string containing a dynamic SQL batch, a second Unicode string that defines the data types of the variables referenced in the dynamic SQL, and a list of values or variables from the calling scope that correspond to the variables defined in the data type list. The following T-SQL shows how to execute the Person.Address query using sp_executesql:

DECLARE @String NVARCHAR(60)
SET @String = 'Stone'

DECLARE @sql NVARCHAR(MAX)

SET @sql = '' +
    'SELECT AddressId ' +
    'FROM Person.Address ' +
    'WHERE AddressLine1 LIKE ''%'' + @String + ''%'''

EXEC sp_executesql
    @sql,
    N'@String NVARCHAR(60)',
    @String

Running this batch will produce the same results as calling FindAddressByString and passing the string "Stone". The parameters to sp_executesql serve to map the @String variable from the outer scope, into the new scope spawned when the dynamic SQL is executed—without having to concatenate the literal value of the variable.

For an example that uses multiple parameters, consider again the GetEmployeeData stored procedure, now rewritten to use sp_executesql instead of EXECUTE:

CREATE PROCEDURE GetEmployeeData
    @EmployeeId INT = NULL,
    @NationalIdNumber NVARCHAR(15) = NULL
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @sql NVARCHAR(MAX)

    SET @sql = '' +
        'SELECT ' +
            'ContactId, ' +
            'LoginId, ' +
            'Title ' +
        'FROM HumanResources.Employee ' +
        'WHERE 1=1 ' +

CASE
            WHEN @EmployeeId IS NULL THEN ''
            ELSE 'AND EmployeeId = @EmployeeId '
        END +
        CASE
            WHEN @NationalIdNumber IS NULL THEN ''
            ELSE 'AND NationalIdNumber = @NationalIdNumber '
        END

    EXEC sp_executesql
        @sql,
        N'@EmployeeId INT, @NationalIdNumber NVARCHAR(60)',
        @EmployeeId,
        @NationalIdNumber
END

For multiple parameters, simply comma-delimit their data type definitions in the second parameter, and then pass as many outer parameters as necessary to define every variable listed in the second parameter. Note that you can use a string variable for the second parameter, which might make sense if you are defining a long list—but I usually keep the list in a string literal so that I can easily match the definitions with the variables passed in from the outer scope.

Another important thing to note here is that even though both parameters are optional, they will both get passed to the query every time it is executed. This is perfectly okay! There is very little overhead in passing parameters into sp_executesql, and trying to work around this issue would either bring back the combinatorial explosion problem or require some very creative use of nested dynamic SQL. Neither solution is maintainable or worth the time required, so save your energy for more interesting pursuits.

Performance Comparison

As a first step in evaluating the relative performance of the sp_executesql solution against other solutions mentioned, you can verify that sp_executesql really is reusing query plans as expected. To verify, run the same code that was used to show that the EXECUTE method was not reusing plans:

DBCC FREEPROCCACHE
GO

EXEC GetEmployeeData
    @EmployeeId = 1
GO

EXEC GetEmployeeData
    @EmployeeId = 2
GO

EXEC GetEmployeeData
    @EmployeeId = 3
GO

After running this code, query the sys.dm_exec_cached_plans view as before. The results should be similar to those shown in Figure 9. One plan is cached for the procedure itself, and one for the invocation of the dynamic query with the @EmployeeId parameter. Invoking the query with a different combination of parameters will result in creation of more cached plans, because the resultant query text will be different. However, the maximum number of plans that can be cached for the stored procedure is five: one for the procedure itself and one for each possible combination of parameters.

The sp_executesql solution promotes reuse of query plans.

To further validate performance, open up the SQLQueryStress tool for some simple load testing. But before configuring SQLQueryStress, create a renamed version of the best performing (but worst for maintenance) static SQL version of the stored procedure. Call it GetEmployeeData_Static:

CREATE PROCEDURE GetEmployeeData_Static
    @EmployeeId INT = NULL,
    @NationalIdNumber NVARCHAR(15) = NULL
AS
BEGIN
    SET NOCOUNT ON

    IF (@EmployeeId IS NOT NULL
        AND @NationalIdNumber IS NOT NULL)
    BEGIN
        SELECT
            ContactId,
            LoginId,
            Title
        FROM HumanResources.Employee
        WHERE
            EmployeeId = @EmployeeId
            AND NationalIdNumber = @NationalIdNumber
    END
    ELSE IF (@EmployeeId IS NOT NULL)
    BEGIN
        SELECT
            ContactId,
            LoginId,
            Title
        FROM HumanResources.Employee
        WHERE
            EmployeeId = @EmployeeId
    END

ELSE IF (@NationalIdNumber IS NOT NULL)
    BEGIN
        SELECT
            ContactId,
            LoginId,
            Title
        FROM HumanResources.Employee
        WHERE
            NationalIdNumber = @NationalIdNumber
    END
    ELSE
    BEGIN
        SELECT
            ContactId,
            LoginId,
            Title
        FROM HumanResources.Employee
    END
END

This version produces the best possible query plans, but of course has the issue of being impossible to maintain. It also has no additional overhead associated with context switching, which may make it slightly faster than a dynamic SQL solution if the queries are very simple. For more complex queries that take longer, any context switching overhead will be overshadowed by the actual runtime of the query.

Once the stored procedure is created, enter the following into the SQLQueryStress Query textbox:

EXEC GetEmployeeData_Static
    @EmployeeId,
    @NationalIdNumber

Next, click the Database button and configure your database connection to use AdventureWorks as the default database. Once finished, click the Parameter Substitution button and enter the following T-SQL into the Parameter Query textbox:

SELECT EmployeeId, NationalIdNumber
FROM HumanResources.Employee

UNION ALL

SELECT NULL, NationalIdNumber
FROM HumanResources.Employee

UNION ALL

SELECT NULL, NationalIdNumber
FROM HumanResources.Employee

UNION ALL

SELECT NULL, NULL

This code selects one row for every possible combination of input parameters to the stored procedure, which will allow testing of every type of invocation. Click the Get Columns button, and map the EmployeeId column to @EmployeeId, and the NationalIdNumber column to @NationalIdNumber. Then, click OK to go back to the main SQLQueryStress screen.

To keep the test simple, configure the tool to use five threads and to do 8,710 iterations. Since there are 871 rows returned by the parameter substitution query, this means that each thread will run through every possible combination ten times. These numbers are somewhat arbitrary, but the point is to simply validate the solutions against each other with a consistent test, not to completely stress them until they break. The results of the test as run on my system are shown in Figure 10.

SQLQueryStress output from the test of the static SQL solution

Next, create a renamed version of the EXECUTE solution, called GetEmployeeData_Execute:

CREATE PROCEDURE GetEmployeeData_Execute
    @EmployeeId INT = NULL,
    @NationalIdNumber NVARCHAR(15) = NULL
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @sql NVARCHAR(MAX)

    SET @sql = '' +
        'SELECT ' +

'ContactId, ' +
            'LoginId, ' +
            'Title ' +
        'FROM HumanResources.Employee ' +
        'WHERE 1=1 ' +
        CASE
            WHEN @EmployeeId IS NULL THEN ''
            ELSE
                'AND EmployeeId = ' +
                CONVERT(NVARCHAR, @EmployeeId) + ' '
        END +
        CASE
            WHEN @NationalIdNumber IS NULL THEN ''
            ELSE
                'AND NationalIdNumber = N''' +
                @NationalIdNumber + ''' '
        END

    EXEC(@sql)
END

Testing this stored procedure against the static solution and, later, the sp_executesql solution will create a nice means by which to compare static SQL against both parameterized and nonparameterized dynamic SQL, and will show the effects of parameterization on performance. Back in SQLQueryStress, change the text in the Query textbox to the following:

EXEC GetEmployeeData_Execute
    @EmployeeId,
    @NationalIdNumber

Once finished, click the GO button to begin the test; do not reconfigure the parameter substitution, number of threads, or number of iterations, in order to keep things consistent. The results of the test as run on my system are shown in Figure 11.

SQLQueryStress output from the test of the EXECUTE solution

The final stored procedure to test is, of course, the sp_executesql solution. Once again, create a renamed version of the stored procedure in order to differentiate it. This time, call it GetEmployeeData_sp_executesql:

CREATE PROCEDURE GetEmployeeData_sp_executesql
    @EmployeeId INT = NULL,
    @NationalIdNumber NVARCHAR(15) = NULL
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @sql NVARCHAR(MAX)

    SET @sql = '' +
        'SELECT ' +
            'ContactId, ' +
            'LoginId, ' +
            'Title ' +
        'FROM HumanResources.Employee ' +
        'WHERE 1=1 ' +
        CASE
            WHEN @EmployeeId IS NULL THEN ''
            ELSE 'AND EmployeeId = @EmployeeId '
        END +

CASE
            WHEN @NationalIdNumber IS NULL THEN ''
            ELSE 'AND NationalIdNumber = @NationalIdNumber '
        END

    EXEC sp_executesql
        @sql,
        N'@EmployeeId INT, @NationalIdNumber NVARCHAR(60)',
        @EmployeeId,
        @NationalIdNumber
END

In SQLQueryStress, change the text of the Query textbox to the following:

EXEC GetEmployeeData_sp_executesql
    @EmployeeId,
    @NationalIdNumber

When finished, click GO to begin the test. Figure 12 shows the results from my system.

SQLQueryStress output from the test of the sp_executesql solution

Interestingly, the results between the two dynamic SQL solutions are very close together, with the sp_executesql solution beating the EXECUTE solution by only just over a second, even given the benefits of parameterization for performance. Runs with a lower number of iterations or against stored procedures that are more expensive for SQL Server to compile will highlight the benefits more clearly.

The static SQL version, in this case, clearly wins from a performance point of view (although all three are extremely fast). Again, more complex stored procedures with longer run times will naturally overshadow the difference between the dynamic SQL and static SQL solutions, leaving the dynamic SQL vs. static SQL question purely one of maintenance.

NOTE

When running these tests on my system, I restarted my SQL Server service between each run in order to ensure absolute consistency. Although this may be overkill for this case, you may find it interesting to experiment on your end with how restarting the service affects performance. This kind of test can also be useful for general scalability testing, especially in clustered environments. Restarting the service before testing is a technique that you can use to find out how the application will behave if a failover occurs, without having to have a clustered testing environment.

Output Parameters

Although it is somewhat of an aside to this discussion, I would like to point out one other feature that sp_executesql brings to the table compared with EXECUTE, which is often overlooked by users who are just getting started using it. sp_executesql allows you to pass parameters to dynamic SQL just like to a stored procedure—and this includes output parameters.

Output parameters become quite useful when you need to use the output of a dynamic SQL statement that perhaps only returns a single scalar value. An output parameter is a much cleaner solution than having to insert the value into a table and then read it back into a variable.

To define an output parameter, simply append the OUTPUT keyword in both the parameter definition list and the parameter list itself. The following T-SQL shows how to use an output parameter with sp_executesql:

DECLARE @SomeVariable INT

EXEC sp_executesql
    N'SET @SomeVariable = 123',
    N'@SomeVariable INT OUTPUT',
    @SomeVariable OUTPUT

As a result of this T-SQL, the @SomeVariable variable will have a value of 123.

Since this is an especially contrived example, I will add that in practice I often use output parameters with sp_executesql in search stored procedures with optional parameters. A common user interface requirement is to return the number of total rows found by the selected search criteria, and an output parameter is a quick way to get the data back to the caller.

SQL Server 2005 : Dynamic T-SQL - Supporting Optional Parameters (part 3) - SQL Injection

SQL Server 2005 : Dynamic T-SQL - Supporting Optional Parameters (part 2) - Going Dynamic: Using EXECUTE

SQL Server 2005 : Dynamic T-SQL - Supporting Optional Parameters (part 1) - Optional Parameters via Static T-SQL

Other

SQL Server 2005 : Dynamic T-SQL - Why Go Dynamic?

MySQL for Python : Passing a query to MySQL

MySQL for Python : Forming a query in MySQL

SQL Injection : Platform-Level Defenses - Securing the Database

SQL Injection : Platform-Level Defenses - Using Runtime Protection (part 2) - Intercepting Filters

SQL Injection : Platform-Level Defenses - Using Runtime Protection (part 1) - Web Application Firewalls

SQL Injection : Code-Level Defenses - Designing to Avoid the Dangers of SQL Injection

SQL Injection : Code-Level Defenses - Canonicalization

SQL Injection : Code-Level Defenses - Encoding Output

Upgrading and Converting to Access 2010 : INSTALLING MULTIPLE VERSIONS OF ACCESS ON ONE PC, CHANGING FILE FORMATS