DATABASE

SQL Server 2008 : Managing Query Performance - Finding Similar Queries

 
2/8/2011 4:04:05 PM
Prior to SQL Server 2008, finding and tuning similar queries was a daunting task. Prior to SQL Server 2008, you would have to capture a set of queries using a trace or Dynamic Management View (DMV), perform some logic to replace literal values within those queries, and then group by some or all of the query text. The concept is to try to replace literal values with something, such as a pound sign (for example, SELECT col1 FROM Table1 WHERE col1 = # AND col2 = #), so you can group the queries such that the only difference is the parameter values. This approach is very time consuming and in many cases less than accurate. Microsoft has added the query_hash and query_plan_hash columns to the sys.dm_exec_query_stats and the sys.dm_exec_requests DMVs to provide a far more accurate way of finding similar queries.

The sys.dm_exec_query_stats DMV returns aggregate performance statistics about query plans that are currently in cache, and the sys.dm_exec_requests DMV returns information about queries that are currently executing. A query hash is a binary hash value that is calculated on a query based on the query text prior to being sent to the optimizer. A query plan hash is a binary hash value that is calculated based on a compiled query plan. Two queries may have the same query hash values and different plan hash values. For example, if you execute two queries that are identical except for the literal values, the query hash value will be the same. However, one query may produce an index scan, while the other may produce an index seek in the query plan (depending on the range of values being requested), resulting in different query plan hash values.

Now that you understand the mechanics of using hash values, let's take a look at a few examples. Run the script in Listing 1 to clear the plan cache and execute some similar queries.

Example 1. Sample Queries to Create Similar Hash Values
DBCC FREEPROCCACHE
GO
SELECT * FROM sys.messages WHERE severity = 23
GO
SELECT * FROM sys.messages WHERE severity = 23
GO
SELECT * FROM sys.messages WHERE severity = 23;
GO
SELECT * FROM sys.messages WHERE severity=23
GO
SELECT * FROM sys.messages WHERE severity = 24
GO
SELECT * FROM sys.messages WHERE severity IN (24)
GO
SELECT * FROM sys.messages WHERE severity in (23,24)
GO

Now that you have executed a few similar queries, you can review the output from the sys.dm_exec_query_stats DMV. You will also use the sys.dm_exec_sql_text function to return the query text by passing in the plan handle used for the query. Run the query in Listing 2 to review the hash values generated by the queries in Listing 1.

Example 2. Query to Analyze the Hash Values Generated by Listing 1
SELECT qs.query_hash,
qs.query_plan_hash,
st.text ,
qs.execution_count,
qs.total_worker_time
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text (plan_handle) st
WHERE st.text like 'Select * From sys.messages%'
ORDER BY qs.total_worker_time desc

The results from Listing 2 are displayed in Figure 1. As you can see by the execution count, out of the seven queries we ran, only two of them used the same cached execution plan. The query text must be exactly the same in order for SQL Server to reuse the execution plan. There are only minor differences in the queries we are limiting on severity 23, but even an extra space or a semicolon causes SQL Server to compile a separate execution plan. It would be much more efficient to parameterize these queries so that SQL Server could make use of the same cached execution plan for each one. If you look at the hash values in the first two columns, you will see they all match except for the query in the second row. The query text and query plan were both slightly different enough to produce different hash values.

Figure 1. Results returned from running the query in Listing 2

Now let's run the query in Listing 3 to aggregate the data based on the hash values to get a more accurate view of the execution count and total worker time for the queries run in Listing 1.

Example 3. Query to Aggregate Execution Count and Total Worker Time Using Hash Values
SELECT qs.query_hash,
qs.query_plan_hash,
MIN(st.text) QueryTextLike,
SUM(qs.execution_count) ExecutionCount,
SUM(qs.total_worker_time)TotalWorkerTime
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text (plan_handle) st
WHERE st.text LIKE 'SELECT * FROM sys.messages%'
GROUP BY qs.query_hash, qs.query_plan_hash
ORDER BY TotalWorkerTime DESC

Now look at the aggregated results in Figure 2. This shows an execution count of six for the queries that share the same hash values. This level of aggregation could raise queries to the top of your result set that may otherwise go unnoticed. For example, if you are looking for the top ten queries with the highest total worker time, and if several like queries were run hundreds of times generating different execution plans, they may not even make the list. Using the new plan hash feature will allow you to focus on tuning the queries actually using the most resources on your system instead of just those heavy hitters that are easy to find.

Figure 2. Results returned from running the query in Listing 3
Other  
  •  SQL Server 2008 : Correlating Profiler and the Performance Monitor
  •  SQL Server 2008 : Explaining XML - XML Indexes
  •  SQL Server 2008 : Explaining XML - XQuery and XPath
  •  SQL Azure : Managing a Shard (part 2) - Working with Partial Shards
  •  SQL Azure : Managing a Shard (part 1) - Managing Exceptions & Managing Performance
  •  SQL Server 2008 : OPENXML, sp_xml_preparedocument, and sp_xml_removedocument
  •  SQL Server 2008 : Retrieving and Transforming XML Data
  •  SQL Azure: Building a Shard (part 4) - Updating and Deleting Records in the Shard & Adding Records to the Shard
  •  SQL Azure: Building a Shard (part 3) - Reading Using the Shard & Caching
  •  SQL Azure: Building a Shard (part 2) - Managing Database Connections
  •  SQL Azure: Building a Shard (part 1) - Designing the Shard Library Object
  •  SQL Azure: Designing for High Performance - General Performance Concepts
  •  SQL Server 2008 : Explaining Advanced Query Techniques - Creating and Altering Tables
  •  SQL Server 2008 : Explaining Advanced Query Techniques - Managing Internationalization Considerations
  •  Programming Microsoft SQL Server 2005 : Deployment (part 2) - Testing Your Stored Procedures
  •  Programming Microsoft SQL Server 2005 : Deployment (part 1) - Deploying Your Assembly
  •  Programming Microsoft SQL Server 2005 : CLR Stored Procedures and Server-Side Data Access
  •  SQL Server 2008 : Explaining Advanced Query Techniques - Controlling Execution Plans (part 3) - Using the Resource Governor
  •  SQL Server 2008 : Explaining Advanced Query Techniques - Controlling Execution Plans (part 2)
  •  SQL Server 2008 : Explaining Advanced Query Techniques - Controlling Execution Plans (part 1)
  •  
    Top 20
    iPhone 3D Programming : Adding Textures to ModelViewer (part 3) - Enabling Textures with ES1::RenderingEngine
    Surviving Changes to the Signature of a Stored Procedure
    Windows Phone 7 Development : Debugging Application Exceptions (part 1) - Debugging Page Load Exceptions
    C# 4.0 : Add a Static Constructor and Initialization
    SQL Server 2005 Encryption Key Hierarchy
    Windows 7: Getting into Your Multimedia (part 2) - Navigating Windows Media Player Menus and Toolbars
    Mobile Application Security : BlackBerry Security - Local Data Storage
    Windows 7 :Navigating Your Computer with the Address Bar (part 1) - Accessing Locations on Your Computer
    iPhone 3D Programming : Adding Depth and Realism - Surface Normals (part 2)
    Network Programming with Windows Sockets : A Thread-Safe DLL for Socket Messages
    Windows Server AppFabric
    Algorithms for Compiler Design: ACTIVATION OF THE PROCEDURE AND THE ACTIVATION RECORD
    Algorithms for Compiler Design: IMPLEMENTATION in Bottom-up Parsing
    .NET Enterprise Services Technologies : BizTalk Server
    Customizing Hardware Device Settings
    Programming Microsoft SQL Server 2005: Using the Data Mining Wizard and Data Mining Designer (part 4) - Deploying and Processing Data Mining Objects
    Examining Real-World SharePoint 2010 Deployments
    Positioning Elements in XAML
    Detecting and Resolving Windows Vista Errors
    Install Windows Server 2008