DATABASE

SQL Server 2008 : T-SQL Tips and Tricks (part 3) - Generating T-SQL Statements with T-SQL & De-Duping Data with Ranking Functions

5/17/2011 4:08:23 PM

Generating T-SQL Statements with T-SQL

The system catalogs in SQL Server 2008 contain a wealth of information you can use to save a lot of time and effort when generating SQL statements that need to be run repeatedly on a large number of database objects or when trying to build a column list for a query. You can use T-SQL code to select information from the system catalogs, system tables, and system views to generate SELECT statements and the like. For example, say you want to grant EXECUTE permission to the user fred on each stored procedure in the bigpubs2008 database. This can be a tedious task to perform using SSMS because in the Securables dialog for the user fred, you have to select each procedure, one at a time, and click the Grant Execute check box. If there were a large number of procedures, this could be pretty time-consuming, and your mouse-clicking finger would probably get pretty tired.

The quicker and easier way would be to build the SQL statements necessary to grant EXECUTE permission to fred on all the stored procedures. The following SELECT statement can be used to generate a SQL script with those commands:

select 'grant execute on ' +  name + ' to fred'
from sys.procedures
order by name
go

--------------------------------------------------------
grant execute on byroyalty to fred
grant execute on cursor_proc to fred
grant execute on error_handler to fred
grant execute on find_books_by_type2 to fred
grant execute on gen_sequence to fred
grant execute on get_authors to fred
grant execute on get_next_item_from_queue to fred
grant execute on get_titles to fred
grant execute on p_explicit_cols to fred
grant execute on p_fetch_explicit_cols to fred
grant execute on p_insert_explicit_cols to fred
grant execute on reptq1 to fred
grant execute on reptq2 to fred
grant execute on reptq3 to fred
grant execute on SHOW_PARTS_LIST to fred
grant execute on title_authors to fred
grant execute on trantest to fred
grant execute on ytd_sales to fred
grant execute on ytd_sales2 to fred



You can copy and paste the output from this statement into a query window in SSMS and execute it to grant the desired permissions. When you get to know your system catalog views, you can begin to automate the generation of a number of SQL operations in this manner, freeing up your time to spend on more interesting projects.

Working with @@ERROR and @@ROWCOUNT

When you are writing T-SQL code that needs to check for both errors and the number of rows affected after your SQL statements, one of the common pitfalls is trying to get both the error status and the number of rows after a SQL statement runs. You have to remember that all SQL statements except the DECLARE statement reset the value of @@ROWCOUNT and @@ERROR to the status of the last command executed.

If after a SQL statement you check the value of @@ERROR, the statement used to check @@ERROR resets @@ROWCOUNT. If you check @@ROWCOUNT first, it resets the value of @@ERROR. To check both values, you need to use an assignment SELECT immediately after the SQL statement you are checking and capture both values into local variables. Note that you cannot accomplish this with the SET statement because the SET statement allows setting a value to only a single variable at a time.

The example in Listing 14 provides a way to capture and check both @@ROWCOUNT and @@ERROR after an UPDATE statement in a T-SQL batch.

Listing 14. Capturing Both @@ROWCOUNT and @@ERROR After an UPDATE Statement
declare @rowcnt int,
@error int
UPDATE dbo.titles set price = price * 1.10
where type = 'fiction'
select @rowcnt = @@ROWCOUNT, @error = @@ERROR
if @rowcnt = 0
print 'no rows updated'
if @error <> 0
raiserror ('Update of titles failed', 16, 1)
return



De-Duping Data with Ranking Functions

One common problem encountered with imported data is unexpected duplicate data rows, especially if the data is being consolidated from multiple sources. In previous versions of SQL Server, de-duping the data often involved the use of cursors and temp tables. Since the introduction of the ROW_NUMBER ranking function and common table expressions in SQL Server 2005, you are able to de-dupe data with a single statement.

To demonstrate this approach, Listing 14 shows how to create an authors_import table and populate it with some duplicate rows.

Listing 15. Script to Create and Populate the authors_import Table
USE bigpubs2008
GO
CREATE TABLE dbo.authors_import(
au_id dbo.id NOT NULL,
au_lname varchar(30) NOT NULL,
au_fname varchar(20) NOT NULL)
go

INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('681-61-9588', 'Ahlberg', 'Allan')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('739-35-5165', 'Ahlberg', 'Janet')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('499-84-5672', 'Alexander', 'Lloyd')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('499-84-5672', 'Alexander', 'Lloyd')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('432-31-3829', 'Bate', 'W. Jackson')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('432-31-3829', 'Bate', 'W. Jackson')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('432-31-3829', 'Bate', 'W. Jackson')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('437-99-3329', 'Bauer', 'Caroline Feller')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('378-33-9373', 'Benchley', 'Nathaniel')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('378-33-9373', 'Benchley', 'Nate')
INSERT INTO dbo.authors_import(au_id, au_lname, au_fname)
VALUES('409-56-7008', 'Bennet', 'Abraham')
GO



You can see in the data for Listing 15 that there are two duplicates for au_id 499-84-5672 and three for au_id 432-31-3829. To start identifying the duplicates, you can write a query using the ROW_NUMBER() function to generate a unique row ID for each data row, as shown in Listing 16.

Listing 16. Using the ROW_NUMBER() Function to Generate Unique Row IDs
SELECT ROW_NUMBER() OVER (ORDER BY au_id, au_lname, au_fname) AS ROWID, *
FROM dbo.authors_import
go

ROWID au_id au_lname au_fname
-------------------- ----------- ------------------------------ ----------------
1 378-33-9373 Benchley Nate
2 378-33-9373 Benchley Nathaniel
3 409-56-7008 Bennet Abraham
4 432-31-3829 Bate W. Jackson
5 432-31-3829 Bate W. Jackson
6 432-31-3829 Bate W. Jackson
7 437-99-3329 Bauer Caroline Feller
8 499-84-5672 Alexander Lloyd
9 499-84-5672 Alexander Lloyd
10 681-61-9588 Ahlberg Allan
11 739-35-5165 Ahlberg Janet



Now you can use the query shown in Listing 16 to build a common table expression to find the duplicate rows. In this case, we keep the first row found. To make sure it works correctly, write the query first as a SELECT statement to verify that it is identifying the correct rows, as shown in Listing 17.

Listing 17. Using a Common Table Expression to Identify Duplicate Rows
(SELECT ROW_NUMBER() OVER (ORDER BY au_id, au_lname, au_fname) AS ROWID, *
FROM dbo.authors_import)
select * FROM authors_import WHERE ROWID NOT IN
(SELECT MIN(ROWID) FROM authors_import
GROUP BY au_id,au_fname, au_lname);
GO

ROWID au_id au_lname au_fname
-------------------- ----------- ------------------------------ -----------------
5 432-31-3829 Bate W. Jackson
6 432-31-3829 Bate W. Jackson
9 499-84-5672 Alexander Lloyd



Now you simply change the final SELECT statement in Listing 17 into a DELETE statement, and it removes the duplicate rows from authors_import:

WITH authors_import AS
(SELECT ROW_NUMBER() OVER (ORDER BY au_id, au_lname, au_fname) AS ROWID, *
FROM dbo.authors_import)
delete FROM authors_import WHERE ROWID NOT IN
(SELECT MIN(ROWID) FROM authors_import
GROUP BY au_id,au_fname, au_lname);
GO
select * from authors_import
go

au_id au_lname au_fname
----------- ------------------------------ --------------------
681-61-9588 Ahlberg Allan
739-35-5165 Ahlberg Janet
499-84-5672 Alexander Lloyd
432-31-3829 Bate W. Jackson
437-99-3329 Bauer Caroline Feller
378-33-9373 Benchley Nathaniel
378-33-9373 Benchley Nate
409-56-7008 Bennet Abraham

If you want to retain the last duplicate record and delete the previous ones, you can replace the MIN function with the MAX function in the DELETE statement.

Notice that the uniqueness of the duplication is determined by the columns specified in the GROUP BY clause of the subquery. Notice that there are still two records for au_id 378-33-9373 remaining in the final record set. The duplicates removed were based on au_id, au_lname, and au_fname. Because the first name is different for each of the two instances of au_id 378-33-9373, both Nathaniel Benchley and Nate Benchley remain in the authors_import table. If you remove au_fname from the GROUP BY clause, the earlier record for Nathaniel Benchley would remain, and Nate Benchley would be removed. However, this result may or may not be desirable. You would probably want to resolve the disparity between Nathaniel and Nate and confirm manually that they are duplicate rows before deleting them. Running the query in Listing 15 with au_fname removed from the GROUP BY clause helps you better determine what your final record set would look like.

Other  
  •  SQL Server 2008 : General T-SQL Performance Recommendations
  •  SQL Server 2008 : General T-SQL Coding Recommendations (part 2) - Avoid SQL Injection Attacks When Using Dynamic SQL & Comment Your T-SQL Code
  •  SQL Server 2008 : General T-SQL Coding Recommendations (part 1) - Provide Explicit Column Lists & Qualify Object Names with a Schema Name
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Using Extended Stored Procedures
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Installing and Using .NET CLR Stored Procedures
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Using Dynamic SQL in Stored Procedures
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Stored Procedure Performance
  •  SQL Server 2008 : Using Remote Stored Procedures
  •  SQL Server 2008 : Using Temporary Tables in Stored Procedures
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Nested Stored Procedures
  •  
    Video
    Top 10
    SG50 Ferrari F12berlinetta : Prancing Horse for Lion City's 50th
    The latest Audi TT : New angles for TT
    Era of million-dollar luxury cars
    Game Review : Hearthstone - Blackrock Mountain
    Game Review : Battlefield Hardline
    Google Chromecast
    Keyboards for Apple iPad Air 2 (part 3) - Logitech Ultrathin Keyboard Cover for iPad Air 2
    Keyboards for Apple iPad Air 2 (part 2) - Zagg Slim Book for iPad Air 2
    Keyboards for Apple iPad Air 2 (part 1) - Belkin Qode Ultimate Pro Keyboard Case for iPad Air 2
    Michael Kors Designs Stylish Tech Products for Women
    REVIEW
    - First look: Apple Watch

    - 3 Tips for Maintaining Your Cell Phone Battery (part 1)

    - 3 Tips for Maintaining Your Cell Phone Battery (part 2)
    Popular Tags
    Video Tutorail Microsoft Access Microsoft Excel Microsoft OneNote Microsoft PowerPoint Microsoft Project Microsoft Visio Microsoft Word Active Directory Exchange Server Sharepoint Sql Server Windows Server 2008 Windows Server 2012 Windows 7 Windows 8 Adobe Flash Professional Dreamweaver Adobe Illustrator Adobe Photoshop CorelDRAW X5 CorelDraw 10 windows Phone 7 windows Phone 8 Iphone