Using Join Hints
The
last kind of hint we will discuss is join hints. The optimizer has a
number of different ways that it can perform join operations. This is
separate from the join types of INNER, OUTER, FULL,
and so on. Join hints instead allow us to specify what mechanism SQL
Server uses internally as it performs the requested join type. The join
mechanisms can be LOOP, MERGE, HASH, or REMOTE.
It is beyond the scope of this book to explain why you might want to use one mechanism rather than another; our purpose rather is to show how you can specify a join hint. The screenshot shown in Figure 2 shows two queries and their execution plans. The first query contains no join hints, whereas the second query specifies that a MERGE
join operation should be performed. The execution plans have been
cropped to allow you to see the important parts. You should see that the
execution plan for the first query (where the optimizer chose the join
mechanism) uses a “Hash Match” join, whereas the second execution plan
shows a “Merge Join” operation. The most important part is the query
cost of each plan; the first query was 45 percent of the batch cost,
whereas the one where we forced the merge join was 55 percent of the
batch cost. Again, the optimizer won.
We see from these examples that while we can
override the optimizer using hints, the optimizer more often than not
will come up with a better plan. Hints are a great tool to use during
development and testing, and as a learning tool to help better
understand the optimizer. Before we implement hints in our production
code, however, we should thoroughly test them, document them, and
periodically review them to ensure they are still the best choice.
This section on hints
showed us how we can change the optimizer’s behavior in hopes of
generating execution plans with lower total costs. In the next section
we will look at the Query Governor. The Query Governor allows us to make
sure that “expensive” queries do not
run. This can save you from the accidental runaway query that gets
executed in the middle of the day, and can keep it from killing your
system’s performance.
Using the Query Governor
The Query Governor
allows administrators to specify the maximum allowable cost of the
statements that can be executed on an instance of SQL Server. This
server level setting helps ensure that statements with unacceptable
costs do not run unless we specifically allow it.
The cost limit for the Query Governor is set at the instance level using either the Connections page of the Server Properties window in SQL Server Management Studio, or the sp_configure system stored procedure. It can be overridden on a session-by-session basis using the SET QUERY_GOVERNOR_COST_LIMIT session-level setting.
The SQL Server
documentation is a little confusing as to what the value for the cost
limit should be. It states in some places that the value we enter for
the Query Governor cost limit is the maximum number of seconds that a
query should be allowed to execute, and then in other places it states
that the value is the maximum allowable query cost. Testing shows that
it is in fact the maximum query cost, not a time estimate. This is a
good thing because the query costs are easy to retrieve by looking at
execution plans. Guessing how long a query will take is much less
specific.
First, let’s see how we can set the Query Governor cost limit at the instance level using sp_configure.
The following example sets the Query Governor cost limit to 20, thus
preventing any queries that have an estimated cost of 20 or higher from
running:
EXEC sp_configure 'Show Advanced Options', 1;
RECONFIGURE;
EXEC sp_configure 'query governor cost limit', 20;
RECONFIGURE;
This script must first turn on the Show Advanced Options setting. This is because the Query Governor Cost Limit setting is an advanced option and can only be set when the Show Advanced Options setting is turned on. The RECONFIGURE statement after each call to sp_configure
causes SQL Server to reload its configuration based on the new
settings. This makes the new values available immediately and means we don’t have to shut down our SQL Server instance and restart it to get the new settings.
In the preceding example,
we set the cost limit to 20. Where did that number come from? What
should our cost limit be? To determine the cost limit, we first need to
know the “cost” of our queries. SQL Server provides a number of ways for
us to view the execution plans it uses to run our statements. It is in
those execution plans that we can see the estimated costs of the
statements. The following script returns the execution plan for the
query below it as a set of rows:
USE AdventureWorks2008;
GO
SET SHOWPLAN_ALL ON;
GO
SELECT TOP 1000000 *
FROM Sales.SalesOrderDetail AS OD1
CROSS JOIN Sales.SalesOrderDetail AS OD2;
If we run this query and look at the TotalSubtreeCost column value for the first row that is returned, we will see a cost of about 4.47. That means that the SELECTSELECT statement, thus preventing it from running. We see the error message returned below the statement:
statement has a total estimated cost of approximately 4.47 cost
“units.” The cost is made up of the IO and CPU costs of all steps
required by the optimizer to complete the statement. We can now use that
number to set the Query Governor cost limit. The following example sets
the Query Governor cost limit below the estimated cost of the
USE AdventureWorks2008;
GO
SET QUERY_GOVERNOR_COST_LIMIT 4
GO
SELECT TOP 1000000 *
FROM Sales.SalesOrderDetail AS OD1
CROSS JOIN Sales.SalesOrderDetail AS OD2;
Msg 8649, Level 17, State 1, Line 1
The query has been canceled because the estimated cost of this query
(5) exceeds the configured threshold of 4. Contact the system
administrator.
The
error message shows that the estimated cost of 4.47 we saw before was
rounded up to five. The estimated statement cost of five is greater than
the specified cost limit of four, and the statement is not allowed to
execute. If we were to rerun the above script with the Query Governor
cost limit value set to five, the query would then be allowed to
execute.
The Query Governor gives us a
way to prevent “expensive” queries from executing. The Resource
Governor, a new feature introduced with SQL Server 2008, gives us a way
to limit the resources available to clients based on certain criteria
such as who they are, or what program they are using, rather than just
blindly denying the execution as with the Query Governor.