DATABASE

SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Stored Procedure Performance

4/28/2011 4:29:29 PM
Using stored procedures can provide a number of benefits to SQL Server applications. One performance benefit is reduced network traffic because stored procedures minimize the number of round trips between client applications and SQL Server. Stored procedures can consist of many individual SQL statements but can be executed with a single statement. This allows you to reduce the number and size of calls from the client to the server. If you have to take different actions based on data values, you can specify to have these decisions made directly in the procedure, avoiding the need to send data back to the application to determine what to do with the data values.

By default, SQL Server sends a message back to the client application after each statement is completed within the stored procedure to indicate the number of rows affected by the statement. To further reduce the amount of “chatter” between the client and server and to therefore further improve stored procedure performance, you can eliminate the DONE_IN_PROC messages that SQL Server sends to the client API by issuing the set nocount on command at the beginning of the stored procedure. Be aware that if you turn on this option, the number of rows affected by the commands in the procedure is not available to the client application. If you need this information, you can still issue select @@rowcount after a statement executes to determine the number of rows affected.

Another performance benefit of using stored procedures is potentially faster execution due to the caching of stored procedure query plans. Stored procedure query plans are kept in cache memory after the first execution. The code doesn’t have to be reparsed and reoptimized on subsequent executions.

Query Plan Caching

When a batch of SQL statements is submitted to SQL Server, SQL Server performs a number of steps, including the following, before the data can be returned to the client:

1.
Parse the SQL statements and build a query tree (the internal format on which SQL Server operates).

2.
Check for a previous cached plan for the query/procedure. If one does not exist or is no longer valid, optimize the SQL statements and generate an execution plan.

3.
Check for permissions for access to the underlying objects.

4.
Execute the execution plan for the SQL statements.

The first time a stored procedure executes, SQL Server loads the SQL code for the stored procedure from the system catalog into the plan cache and optimizes and compiles an execution plan.

The optimization of SQL statements is based on the parameters passed, the index distribution statistics, the number of rows in each table, and other information available at the time of the first execution. The compiled plan is then saved in cache memory. For subsequent executions, all SQL Server has to do is find the plan in the cache and execute it, essentially skipping most of the work in steps 1 and 2. Parsing and compilation always add some overhead, and depending on the complexity of the stored procedure code, they can sometimes be as expensive as the actual execution. Just by skipping these two steps, you can achieve a performance gain by using stored procedures.

The SQL Server Plan Cache

SQL Server uses the same buffer area for storing data and index pages as it does for storing query execution plans. The portion of the buffer pool used to store execution plans is referred to as the plan cache. The percentage of the memory pool allocated to execution plans fluctuates dynamically, depending on the state of the system. SQL Server also can keep execution plans in cache for ad hoc queries. This means that even dynamic SQL queries might be able to reuse a cached execution plan and skip recompilation. The cache space is dynamically allocated as needed.

With the ability to keep query plans in memory for ad hoc queries, it is not as critical in SQL Server 2008 for applications to use stored procedures to achieve performance benefits of using precompiled plans. However, when and how the plans are stored and reused for ad hoc queries is not nearly as predictable as with stored procedures. The query plans for stored procedures remain in cache memory more persistently. In addition, you have less explicit control over the recompilation of ad hoc queries.

Tip

You can get information about what is currently in the plan cache via the dm_exec_cached_plans, dm_exec_plan_attributes, and dm_exec_sql_text dynamic management views. These views return the current server state information regarding the plan cache.


Shared Query Plans

SQL Server 2008 execution plans consist of two main components: a query plan and an execution context. The query plan is the bulk of the execution plan. Query plans are re-entrant, read-only data structures used by any number of users. There are at most ever only two copies of the query plan in memory: one copy for all serial executions and another for all parallel executions. The parallel copy covers all parallel executions, regardless of their degree of parallelism. When a SQL statement is executed, the Database Engine searches the plan cache to see whether an execution plan for the same SQL statement is already in the plan cache. If a query plan does exist, the Database Engine reuses it, saving the overhead of recompiling the SQL statement. However, if no existing query plan is found, SQL Server 2008 generates a new execution plan for the query and saves it into the plan cache.

For each user currently executing a query, there is a data structure that holds information specific to that user’s execution, such as parameter values. This data structure is referred to as the execution context. Execution context data structures are also reusable if they are not currently in use. When a user executes a query, SQL Server looks for an execution context structure not being used, and it reinitializes the structure with the context for the new user. If no free execution context structures exist, SQL Server creates a new one. Thus, there can potentially be multiple execution context structures in the plan cache for the same query.

Automatic Query Plan Recompilation

SQL Server attempts to reuse existing execution plans for stored procedures, but certain operations cause the execution plans to become inefficient or invalid. In these cases, a new execution plan needs to be recompiled on the next execution of the stored procedure. The following conditions cause a plan to be invalidated:

  • Whenever there is a change to the schema of a referenced table or view

  • When an index for a referenced table is dropped or changed

  • When the statistics used by an execution plan have been updated, either explicitly or automatically

  • When sp_recompile has been run on a table referenced by a stored procedure

  • When a sufficient number of data changes have been made to a table that is referenced by the stored procedure

  • For tables with triggers, when the number of rows in the inserted and deleted tables grows significantly

In addition to these reasons, other events that can cause stored procedures to recompile new query plans include the following:

  • When SQL Server activity is heavy enough to cause execution plans to be flushed from cache memory

  • When the WITH RECOMPILE option has been specified in the CREATE PROCEDURE or EXEC command

  • When shutting down and restarting SQL Server, which flushes all query plans from memory

In SQL Server 2000 and prior versions, whenever an execution plan was invalidated, the entire batch or stored procedure was recompiled. In SQL Server 2005 and later, only the statement, batch, or stored procedure that caused the query plan to be invalidated has to be recompiled. Because often only a small number of statements in batches or stored procedures are the reason a plan becomes invalidated, statement-level recompilation improves performance in terms of CPU time and locks by avoiding the need to have to recompile all the other statements in the batch whose execution plans are still valid.

Monitoring Stored Procedure Recompilation

You can monitor when stored procedures or statements are automatically recompiled by using SQL Server Profiler. The two events you want to monitor are the SP:Recompile and SQL:StmtRecompile trace events (see Figure 1). In SQL Server 2008, the TextData column of these events is filled in with information about the query that caused the recompile, so you do not need to also trace the SP:StmtStarting or SP:StmtCompleted events to be able to capture the query information.

Figure 1. Adding events in SQL Server Profiler to monitor stored procedure recompilation.

If a stored procedure or statement is automatically recompiled during execution, SQL Server Profiler displays the SP:Recompile event and/or the SQL:StmtRecompile event. For example, you can create the following stored procedure to create and populate a temporary table:

create proc recomp_test
as
create table #titles (title_id varchar(6), title varchar(80), pubdate datetime)
insert #titles select title_id, title, pubdate from titles
where pubdate > '10/1/2001'
select * from #titles
go



Say you turn on SQL Server Profiler and then execute the following SQL, which executes the procedure (which in turn results in the initial compilation), and then add an index on pubdate to the titles table:

exec recomp_test
go
create index idx1 on titles (pubdate)
go
exec recomp_test
go
drop index titles.idx1
go

When you do this, you capture events similar to those shown in Figure 2.

Figure 2. Recompile events captured for a stored procedure in SQL Server Profiler.

The key columns to focus on in the Profiler trace are ObjectName, EventSubclass, and TextData. The TextData column shows which statements were recompiled. You can see in Figure 2 that on the subsequent execution, only the statement affected by the new index on the titles table was recompiled. The EventSubclass column provides the reason for the recompile. These reasons are summarized in Table 1.

Table 1. SQL Server Profiler EventSubClass Values for Recompile Events
EventSubClass ValueDescription
1Schema changed.
2Statistics changed.
3Deferred compile.
4SET option changed.
5Temporary table changed.
6Remote rowset changed.
7FOR BROWSE permission changed.
8Query notification environment changed.
9Partitioned view changed.
10Cursor options changed.
11OPTION (RECOMPILE) requested.

Forcing Recompilation of Query Plans

In some situations, a stored procedure might generate different query plans, depending on the parameters passed in. At times, depending on the type of query and parameter values passed in, it can be difficult to predict the best query plan for all executions. Consider the following stored procedure:

IF EXISTS ( SELECT * FROM sys.procedures
WHERE schema_id = schema_id('dbo')
AND name = N'advance_range')
DROP PROCEDURE dbo.advance_range
GO
create proc advance_range
(@low money, @high money)
as
select * from dbo.titles
where advance between @low and @high
return

Assume that a nonclustered index exists on the advance column in the titles table. A search in which advance is between 1,000 and 2,000 might be highly selective, and the index statistics might indicate that fewer than 5% of the rows fall within that range, and thus an index would be the best way to find the rows. If those were the values passed on the first execution, the cached query plan would indicate that the index should be used.

Suppose, however, that if on a subsequent execution, search values of 5,000 and 10,000 were specified. These values match against 90% of the rows in the table, and if optimized normally, SQL Server would likely use a table scan because it would have to visit almost all rows in the table anyway. Without recompiling, however, it would use the index as specified in the cached query plan, which would be a suboptimal query plan because it would likely be accessing more pages using the index than would a table scan.

When a lot of variance exists in the distribution of data values in a table or in the range of values passed as parameters, you might want to force the stored procedure to recompile and build a new execution plan during execution and not use a previously cached plan. Although you incur the overhead of compiling a new query plan for each execution, it is typically much less expensive than executing the wrong query plan.

You can force recompilation of the query plan for a stored procedure by specifying the WITH RECOMPILE option when creating or executing a stored procedure. Including the WITH RECOMPILE option in the create procedure command causes the procedure to generate a new query plan for each execution:

IF EXISTS ( SELECT * FROM sys.procedures
WHERE schema_id = schema_id('dbo')
AND name = N'advance_range')
DROP PROCEDURE dbo.advance_range
GO
create proc advance_range
(@low money, @high money)
WITH RECOMPILE
as
select * from dbo.titles
where advance between @low and @high
return

If the procedure is not created with the WITH RECOMPILE option, you can generate a new query plan for a specific execution by including the WITH RECOMPILE option in the EXEC statement:

exec advance_range 5000, 10000 WITH RECOMPILE

Because of the performance overhead of recompiling query plans, you should try to avoid using WITH RECOMPILE whenever possible. One approach is to create different subprocedures and execute the appropriate one based on the passed-in parameters. For example, you could have a subprocedure to handle small-range retrievals that would benefit from an index and a different subprocedure to handle large-range retrievals. The queries in each procedure would be identical; the only difference would be in the parameter values passed to them. This is controlled in the top-level procedure. An example of this approach is demonstrated in Listing 1.

Listing 1. Using Multiple Stored Procedures as an Alternative to Using WITH RECOMPILE
IF EXISTS ( SELECT * FROM sys.procedures
WHERE schema_id = schema_id('dbo')
AND name = N'advance_range')
DROP PROCEDURE dbo.advance_range
GO
go

create proc get_titles_smallrange
@low money, @high money
as
select * from titles
where advance between @low and @high
return
go
create proc get_titles_bigrange
@low money, @high money
as
select * from titles
where advance between @low and @high
return
go
create proc advance_Range
@low money, @high money
as
if @high - @low >= 1000
-- if the difference is over 5000
exec get_titles_bigrange @low, @high
else
-- execute the small range procedure
exec get_titles_smallrange @low, @high



Obviously, this solution would require substantial knowledge of the distribution of data in the table and where the threshold is on the range of search values that results in different query plans.

Another type of stored procedure that can sometimes generate different query plans based on initial parameters is a multipurpose procedure, which usually performs different actions based on conditional branching, as in the following example:

IF EXISTS ( SELECT * FROM sys.procedures
WHERE schema_id = schema_id('dbo')
AND name = N'get_titles_data')
DROP PROCEDURE dbo.get_titles_data
GO
create proc get_titles_data (@flag tinyint, @value money)
as
if @flag = 1
select * from titles where price = @value
else
select * from titles where advance = @value

At query compile time, the Query Optimizer doesn’t know which branch will be followed because the if...else construct isn’t evaluated until runtime. On the first execution of the procedure, the Query Optimizer generates a query plan for all SELECT statements in the stored procedure, regardless of the conditional branching, based on the parameters passed in on the first execution. A value passed in to the parameter intended to be used for searches against a specific table or column (in this example, price versus qty) might not be representative of normal values to search against another table or column.

Again, a better approach would be to break the different SELECT statements into separate subprocedures and execute the appropriate stored procedure for the type of query to be executed, as in the following example:

IF EXISTS ( SELECT * FROM sys.procedures
WHERE schema_id = schema_id('dbo')
AND name = N'get_titles_data')
DROP PROCEDURE dbo.get_titles_data
GO
drop proc get_titles_data
go
create proc get_titles_data_by_price (@value money)
as
select * from titles where price = @value
go
create proc get_titles_data_by_advance (@value money)
as
select * from titles where advance = @value
go
create proc get_titles_data (@flag tinyint, @value money)
as
if @flag = 1
exec get_titles_data_by_price @value
else
exec get_titles_data_by_advance @value

Using sp_recompile

In versions of SQL Server prior to 7.0, it was necessary to use the sp_recompile system stored procedure when you wanted to force all stored procedures that referenced a specific table to generate a new query plan upon the next execution. This was necessary if you had added new indexes to a table or had run UPDATE STATISTICS on the table. However, the usefulness of this command in SQL Server 2008 is questionable because new query plans are generated automatically whenever new indexes are created or statistics are updated on a referenced table. It appears that sp_recompile is available primarily for backward compatibility or for times when you want the recompilations to occur explicitly for all procedures referencing a specific table.

 
Other  
  •  SQL Server 2008 : Using Remote Stored Procedures
  •  SQL Server 2008 : Using Temporary Tables in Stored Procedures
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Nested Stored Procedures
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Using Cursors in Stored Procedures
  •  SQL Server 2008 : T-SQL Stored Procedure Coding Guidelines
  •  SQL Server 2008 : Leveraging the Microsoft Sync Framework
  •  SQL Server 2008 : Using ADO.NET Data Services
  •  SQL Server 2008 : Developing with LINQ to SQL (part 2) - Uncovering LINQ to SQL with Linqpad
  •  SQL Server 2008 : Developing with LINQ to SQL (part 1)
  •  Getting Comfortable with ADO.NET 3.5 and SQL Server 2008
  •  SQL Server System and Database Administration : System Views
  •  SQL Server System and Database Administration : System Tables & System Stored Procedures
  •  SQL Server System and Database Administration : System Databases
  •  SQL Server 2008 : Monitoring Your Server - Monitoring Your CPU
  •  Programming Microsoft SQL Server 2005 : Querying XML Data Using XQuery (part 3) - XML DML & Converting a Column to XML
  •  Programming Microsoft SQL Server 2005 : Querying XML Data Using XQuery (part 2) - SQL Server 2005 XQuery in Action
  •  Programming Microsoft SQL Server 2005 : Querying XML Data Using XQuery (part 1) - XQuery Defined
  •  SQL Server 2008 : Monitoring Your Server - Familiarizing Yourself with the Performance Monitor
  •  Programming Microsoft SQL Server 2005 : FOR XML Commands (part 3) - OPENXML Enhancements in SQL Server 2005
  •  Programming Microsoft SQL Server 2005 : FOR XML Commands (part 2) - FOR XML EXPLICIT
  •  
    Video
    Top 10
    Beginning Android 3 : The Input Method Framework - Fitting In
    Windows 7 : Creating and Managing Groups
    Windows 7 : Managing User Properties
    Exchange Server 2010 : Unique database names
    Exchange Server 2010 : Transaction log replay: The foundation for DAG replication
    Exchange Server 2010 : Active Manager - Automatic database transitions & Best copy selection
    Exchange Server 2010 : Breaking the link between database and server
    iPhone 3D Programming : Drawing an FPS Counter (part 2) - Rendering the FPS Text
    iPhone 3D Programming : Drawing an FPS Counter (part 1) - Generating a Glyphs Texture with Python
    Mobile Application Security : Mobile Geolocation - Geolocation Methods & Geolocation Implementation
    Most View
    Windows Server 2008 : The Migration Planning Phase - Documenting the Process for Migration
    Programming with DirectX : Additional Texture Mapping - Sprites
    Creating Link-Worthy Content and Link Marketing : Choosing the Right Link-Building Strategy
    Windows Server 2008: Domain Name System and IPv6 - Resource Records
    Designing and Configuring Unified Messaging in Exchange Server 2010 : Unified Messaging Installation (part 1)
    Backing Up the Exchange Server 2010 Environment : Roles and Responsibilities & Developing a Backup Strategy
    iPhone Application Development : Using Advanced Interface Objects and Views - User Input and Output
    Exchange Server 2010 : Operating Without Traditional Point-in-Time Backups
    Programming Microsoft SQL Server 2005 : CLR Stored Procedures and Server-Side Data Access
    Microsoft XNA Game Studio 3.0 : Text and Computers
    Maintaining Windows 7 with Backup and Restore (part 1) - Creating a Backup & Restoring Files from a Backup
    SharePoint 2010 : Operations Management with the SharePoint Central Administration Tool (part 6)
    Advanced ASP.NET : Data-Access Components (part 3) - Enhancing the Component with Error Handling & Enhancing the Component with Aggregate Information
    New IE 7.0 Security Features
    Partitioning Disks and Preparing Them for Use in Vista
    Windows Phone 7 Development : Using Cloud Services As Data Stores - Creating a Cloud Database
    Windows Mobile Security - Networking
    Queries in SQL
    Backing Up the Exchange Server 2010 Environment : Understanding the Importance of Backups & Establishing Service Level Agreements
    Windows Phone 7 Development : Handling Errors in Rx.NET