SQL Server : Reusing T-SQL Code - The Dangers of Copy-and-Paste

3/23/2013 9:36:34 PM

The biggest problem with copy-and-paste as a means of solving a set of similar problems is that, of course, it leads to code duplication. In turn, this means that we need to maintain multiples copies of essentially the same code, but with each copy subtly modified to suit a particular need. The real danger arises when requirements change, and we need to make sure that this is reflected, not just in the original code, but in all the subsequent copies. We can easily demonstrate this risk with an example. Listing 1 creates the Sales table and loads it with some test data.

Listing 1. Creating the Sales table and populating it with test data.

Listing 2 shows the stored procedure, SelectTotalSalesPerStateForMonth, which returns the total sales per state for a given month.

Listing 2. The SelectTotalSalesPerStateForMonth stored procedure.

At the time we developed this code, our understanding of a report "for a given month" is one that covers the period of time from the first calendar day of the month until the day we run the report. For this purpose, our stored procedure serves the customers' needs well, and we soon receive a request for a similar report, returning the average sales per state, for a given month. Note that our new report is required to use the same definition of "for a given month."

It is very tempting to just copy the existing SelectTotalSalesPerStateForMonth procedure, and replace sum with avg to meet the new requirements, as shown in Listing 3.

Listing 3. A simple adaptation of our "total sales" stored procedure allows us to produce an "average sales" equivalent.

In this way, we have completed the task in just a few seconds and, in the short term at least, it will do the job.

Suppose, however, that at some later time the users request to change the definition of "for a given month" to "thirty consecutive calendar days, ending on the day we run the report." Unfortunately, the definition of "for a given month" is implemented twice, both in SelectTotalSalesPerStateForMonth and in SelectAverageSalesPerStateForMonth. Even if one and the same person developed them both, it is possible to forget it by the time we need to implement the change. Even if it is clearly documented that both procedures should use one and the same definition, it is still possible that the developer implementing the change has failed to modify both stored procedures in a consistent way.

Suppose, for example, that only the SelectAverageSalesPerStateForMonth stored procedure was modified to meet this new requirement. Listing 4 shows how it was changed.

Listing 4. The modified SelectAverageSalesPerStateForMonth stored procedure, accomodating the new definition of "for a given month."

When we make such changes, it is very easy to forget that we have implemented the definition of "for a given month" in two places. If we update the definition in one place and not the other, we will get inconsistent results, as demonstrated by Listing 5.

Listing 5. The stored procedures produce different results.

Clearly the average sales size for the state of California (4,630,000) is many times greater than the total sales for the same state (10,000), which makes no sense at all. In this example, I have deliberately used test data that makes the discrepancy obvious. In general, however, such discrepancies may be more subtle and difficult to spot, so they can lurk around for a long time.

As this example clearly demonstrates, when we cut and paste code, we expose our code to the possibility of bugs if our requirements change, and we fail to change each of the multiple implementations of one and the same logic in exactly the same way. In short, copy-and-paste coding is a direct violation of the DRY (Don't Repeat Yourself) principle, which is so fundamental in software engineering.

The code to implement a given logic should be implemented once, and once only, and reused by all applications that need it. However, of course, due care must be taken when reusing SQL code. Careless reuse of code can lead to maintenance and performance issues, especially when this reuse takes the form of scalar UDFs. We cannot reuse code without first verifying that it runs fast enough.

  •  SQL Server 2005 Native XML Web Services : Example Native XML Web Services Project (part 3) - Creating the Client Application
  •  SQL Server 2005 Native XML Web Services : Example Native XML Web Services Project (part 2) - Exposing the Endpoints
  •  SQL Server 2005 Native XML Web Services : Example Native XML Web Services Project (part 1) - Creating the SQL Server Functionality
  •  SQL Server 2005 Native XML Web Services : Exposing SQL Programmability as Web Services (part 2) - Calling Native XML Web Service Endpoints from Client Applications
  •  SQL Server 2005 Native XML Web Services : Exposing SQL Programmability as Web Services (part 1)
  •  Western Digital Black 4TB Hard Drive - 4TB Storage Goes Mainstream
  •  Intel Solid State Drive 335 Series - Better Under The Hood
  •  HDD, SSD and Hybrid Hard Drive Competition
  •  SQL Server 2008 : Index analysis (part 3) - Identifying index fragmentation
  •  SQL Server 2008 : Index analysis (part 2) - Identifying indexes to add
    Most View
    Microsoft SharePoint 2010 Web Applications : Presentation Layer Overview - Ribbon (part 1)
    The Cyber-athletic Revolution – E-sports’ Era (Part 1)
    Windows Server 2003 : Implementing Software Restriction Policies (part 4) - Implementing Software Restriction Policies - Creating a Path Rule, Designating File Types
    Sql Server 2012 : Hierarchical Data and the Relational Database - Populating the Hierarchy (part 1)
    Two Is Better Than One - WD My Cloud Mirror
    Programming ASP.NET 3.5 : Data Source-Based Data Binding (part 3) - List Controls
    Windows 8 : Configuring networking (part 5) - Managing network settings - Understanding the dual TCP/IP stack in Windows 8, Configuring name resolution
    Nikon Coolpix A – An Appealing Camera For Sharp Images (Part 2)
    Canon PowerShot SX240 HS - A Powerful Perfection
    LG Intuition Review - Skirts The Line Between Smartphone And Tablet (Part 2)
    Popular Tags
    Microsoft Access Microsoft Excel Microsoft OneNote Microsoft PowerPoint Microsoft Project Microsoft Visio Microsoft Word Active Directory Biztalk Exchange Server Microsoft LynC Server Microsoft Dynamic Sharepoint Sql Server Windows Server 2008 Windows Server 2012 Windows 7 Windows 8 Adobe Indesign Adobe Flash Professional Dreamweaver Adobe Illustrator Adobe After Effects Adobe Photoshop Adobe Fireworks Adobe Flash Catalyst Corel Painter X CorelDRAW X5 CorelDraw 10 QuarkXPress 8 windows Phone 7 windows Phone 8 BlackBerry Android Ipad Iphone iOS
    Top 10
    Review : Acer Aspire R13
    Review : Microsoft Lumia 535
    Review : Olympus OM-D E-M5 Mark II
    TomTom Runner + MultiSport Cardio
    Timex Ironman Run Trainer 2.0
    Suunto Ambit3 Peak Sapphire HR
    Polar M400
    Garmin Forerunner 920XT
    Sharepoint 2013 : Content Model and Managed Metadata - Publishing, Un-publishing, and Republishing
    Sharepoint 2013 : Content Model and Managed Metadata - Content Type Hubs