DATABASE

SQL Server 2008 : Explaining Advanced Query Techniques - Applying Ranking Functions (part 1) - Using ROW_NUMBER

1/24/2011 4:46:17 PM
Ranking functions are built-in system functions that allow us to include numeric columns in our result sets where the value in the column reflects the “rank” of the row. We determine how to rank the rows by choosing first which of the four possible ranking functions (ROW_NUMBER, RANK, DENSE_RANK, or NTILE E) we wish to use, what value the rank will be based on, and optionally over what ranges or “partitions” we want the ranking to be performed. All of the functions follow the same basic format:
<ranking_function > OVER ( [ <partition_by_clause> ] <order_by_clause> )

We’ll look at the general use of the functions by starting with the first one, ROW_NUMBER.

Using ROW_NUMBER

We will start with the ROW_NUMBER function because it is the most basic of the four ranking functions. It simply generates a numeric value based on the number of rows in the result.

The following examples of the ranking functions will build on a query that returns products from three specific subcategories using the Product and ProductSubcategory tables in the AdventureWorks.Production schema. The subcategories we see in the query are Bib-shorts, Bottom Brackets, and Gloves. These subcategories were chosen to help keep the result set small for demo purposes but still provide a set of data that can be used to show all the ranking functions. Here is the initial query we will use without any ranking function yet used:

USE AdventureWorks2008;
GO
SELECT
S.Name AS SubName, P.Name ProdName, ListPrice AS Price
FROM Production.Product AS P
JOIN Production.ProductSubcategory AS S
ON P.ProductSubcategoryID = S.ProductSubcategoryID
WHERE S.ProductSubCategoryID IN (5,18,20)
ORDER BY S.Name, ListPrice;

The ROW_NUMBER function allows us to return a sequential row number value in our result sets. We have to indicate to the ROW_NUMBER function the order in which we want the rows to be numbered. For example, should they be numbered in order of list price, or should they be listed by some other sort order? We state our sorting desires using required <order_by_clause>. Because the ranking function gets its own <order_by_clause>, the ordering of the row number sequence can be different than the ordering of result set. In all of the examples, we will order the ranking functions by the ListPrice of the products where lower list prices have a lower rank value and higher list prices will have a higher rank value. Here is the same query with the ROW_NUMBER function added and its results:

USE AdventureWorks2008
GO
SELECT
S.Name AS SubName, P.Name ProdName, ListPrice AS Price,
ROW_NUMBER() OVER (ORDER BY ListPrice) AS ROW
FROM Production.Product AS P
JOIN Production.ProductSubcategory AS S

ON P.ProductSubcategoryID = S.ProductSubcategoryID
WHERE S.ProductSubCategoryID IN (5,18,20)
ORDER BY S.Name, ListPrice;
SubName ProdName Price ROW
---------------- ---------------------- ------- ---
Bib-Shorts Men's Bib-Shorts, S 89.99 8
Bib-Shorts Men's Bib-Shorts, M 89.99 9
Bib-Shorts Men's Bib-Shorts, L 89.99 10
Bottom Brackets LL Bottom Bracket 53.99 7
Bottom Brackets ML Bottom Bracket 101.24 11
Bottom Brackets HL Bottom Bracket 121.49 12
Gloves Half-Finger Gloves, S 24.49 1
Gloves Half-Finger Gloves, M 24.49 2
Gloves Half-Finger Gloves, L 24.49 3
Gloves Full-Finger Gloves, S 37.99 4
Gloves Full-Finger Gloves, M 37.99 5
Gloves Full-Finger Gloves, L 37.99 6


Notice that the values in the ROW column are not ordered by the position of the rows, but rather by the magnitude of the ListPrice value. This is because the call to the ROW_NUMBER function indicated that the ranking should be ordered by the ListPrice, whereas the rows themselves are being sorted first by the subcategory name, then by list price. The ROW_NUMBER function call has been italicized in the preceding query to help you find it easily.

If we want the row number to number the rows in the same order as they are returned, we need to ask the ROW_NUMBER function to order its values using the same ordering as the ORDER BY clause on the query itself (S.Name, P.ListPrice), as follows:

USE AdventureWorks2008
GO
SELECT
S.Name AS SubName, P.Name ProdName, ListPrice AS Price,
ROW_NUMBER() OVER
(ORDER BY S.Name, ListPrice) AS ROW
FROM Production.Product AS P
JOIN Production.ProductSubcategory AS S
ON P.ProductSubcategoryID = S.ProductSubcategoryID
WHERE S.ProductSubCategoryID IN (5,18,20)

ORDER BY S.Name, ListPrice;
SubName ProdName Price ROW
---------------- --------------------- ------ ----
Bib-Shorts Men's Bib-Shorts, S 89.99 1
Bib-Shorts Men's Bib-Shorts, M 89.99 2
Bib-Shorts Men's Bib-Shorts, L 89.99 3
Bottom Brackets LL Bottom Bracket 53.99 4
Bottom Brackets ML Bottom Bracket 101.24 5
Bottom Brackets HL Bottom Bracket 121.49 6
... ... ... ...

So far, the row numbers generated are based on the entire result set (all 12 rows). We can break the ranking functions down into different subsets or “partitions” of the total number of rows much in the same way that we can group rows with the group by clause when using aggregate functions. We indicate what rows are in the same partition by stating the columns whose values the partitions will be based on in the <partition_by_clause>. For example, if we wanted to have the row number start over with each subcategory, we would use the following query:

USE AdventureWorks2008
GO
SELECT
S.Name AS SubName, P.Name ProdName, ListPrice AS Price,
ROW_NUMBER() OVER
(PARTITION BY S.ProductSubcategoryID ORDER BY ListPrice) AS ROW
FROM Production.Product AS P
JOIN Production.ProductSubcategory AS S
ON P.ProductSubcategoryID = S.ProductSubcategoryID
WHERE S.ProductSubCategoryID IN (5,18,20)
ORDER BY S.Name, ListPrice;
SubName ProdName Price ROW
--------------- --------------------- ------ ---
Bib-Shorts Men's Bib-Shorts, S 89.99 1
Bib-Shorts Men's Bib-Shorts, M 89.99 2
Bib-Shorts Men's Bib-Shorts, L 89.99 3
Bottom Brackets LL Bottom Bracket 53.99 1
Bottom Brackets ML Bottom Bracket 101.24 2
Bottom Brackets HL Bottom Bracket 121.49 3
Gloves Half-Finger Gloves, S 24.49 1

Gloves Half-Finger Gloves, M 24.49 2
Gloves Half-Finger Gloves, L 24.49 3
Gloves Full-Finger Gloves, S 37.99 4
Gloves Full-Finger Gloves, M 37.99 5
Gloves Full-Finger Gloves, L 37.99 6


Notice that the values in the ROW column now start back at one with each new subcategory. This is because of the <partition_by_clause> of PARTITION BY S.ProductSubcategoryID .

New & Noteworthy...: ROW_NUMBER() vs. IDENTITY()

The new ROW_NUMBER() function provides a much more flexible way to number rows in our result sets that does the IDENTITY() function.

The IDENTITY() function (similar to the IDENTITY property in a table column) provides a way to automatically number rows in a result set. However, the IDENTITY() function can only be used in SELECT ... INTO queries. The ROW_NUMBER() function, on the other hand, can be used in any SELECT statement and is quite configurable via the partition and order by clauses.


That is the basics of using ROW_NUMBER and how to use the <order_by_clause> and <partition_by_clause> for all ranking functions. Let’s now take a look at those other functions.

					  
Other  
  •  SQL Server 2008 : Demystifying Data Types - Computed Columns
  •  Programming Microsoft SQL Server 2005: Overview of SQL CLR - Visual Studio/SQL Server Integration
  •  Programming Microsoft SQL Server 2005: DDL Triggers and Notifications
  •  Programming Microsoft SQL Server 2005: Enabling CLR Integration
  •  Reporting Services with SQL Azure : Creating the SQL Azure Data Source
  •  Reporting Services with SQL Azure : Starting a SQL Azure–Based Report
  •  SQL Server 2008 : Service Broker - Message Types
  •  SQL Server 2008 : Service Broker Defined
  •  SQL Azure : Building Two OData Consumer Applications (part 2) - Windows Mobile 7 Application
  •  SQL Azure : Building Two OData Consumer Applications (part 1) - Simple Demo App
  •  SQL Server 2008 : Explaining Advanced Query Techniques - Creating CTEs
  •  SQL Server 2008 : Multiple Sources with SQL Server Integration Services
  •  SQL Server 2008 : Working with Multiple-Source Queries - OpenQuery, OpenRowSet, and OpenDataSource Explained
  •  SQL Server 2008 : Working with Multiple-Source Queries - Using Four-Part Database Names & The DTC Explained
  •  SQL Server 2008 : Working with Multiple-Source Queries - Using Linked Servers
  •  OData with SQL Azure - Enabling OData on an Azure Database
  •  OData with SQL Azure - OData Overview
  •  SQL Server 2008 : Performing Nonlogged DML Operations
  •  SQL Server 2008 : Using the OUTPUT Clause with the MERGE Statement
  •  SQL Server 2008 : Returning Data from DML Operations Using the OUTPUT Clause
  •  
    Top 10
    A Look At Truecrypt The Open Source Security Tool
    Price Of Piracy
    Acer Aspire 5600U 23" Touchscreen All-in-One PC
    Zalman FX100-Cube Fanless Cooler
    Devolo dLAN LiveCam Starter Kit
    Has Apple Lost It? (Part 2)
    Has Apple Lost It? (Part 1)
    Sony Computer Entertainment (Part 3)
    Sony Computer Entertainment (Part 2)
    Sony Computer Entertainment (Part 1)
    Most View
    CyanogenMod overclocking : Built-in controls, Alright Governor & Custom tools
    NVidia GeForce GTX 660 Ti - Does Kepler Fair Against Its Established Rivals?
    Storage, Screens And Sounds (Part 3)
    Windows Server 2008: Improvements for Thin Client Remote Desktop Services
    Algorithms for Compiler Design: INTERMEDIATE CODE GENERATION
    Samsung Series 9 Notebook – One Step Over And Beyond
    Windows Vista : Programming the WshNetwork Object
    Promote Your Photography (Part 1)
    Programming the Mobile Web : HTML 5 (part 4) - Client Storage
    SharePoint 2010 : Business Intelligence - Excel Services (part 2) - Accessing Excel Services Over SOAP
    ASP.NET 4 in VB 2010 : ADO.NET Fundamentals (part 4) - Direct Data Access - The DataReader
    iPhone Application Development : Building a Multi-View Tab Bar Application (part 4) - Implementing the Summary View
    How to Uninstall Windows Drivers : Roll back to previous driver & Uninstall a driver
    WebOS open source challenges Android
    Integrating Applications with the Windows Phone OS : Working with Launchers and Choosers
    Sharepoint 2010 : Maintaining SQL Server in a SharePoint Environment
    Choosing A... NAS Device (Part 1)
    Building and Deploying Applications for Windows Azure : Activating the Storage Account Account
    Beginning Android 3 : Working with Containers - Tabula Rasa
    SharePoint 2010 : Operations Management with the SharePoint Central Administration Tool (part 6)