DATABASE

Surviving Changes to the Definition of a Primary or Unique Key

9/27/2010 5:50:35 PM
Changes to the keys in your tables should, hopefully, be rare, but they can cause trouble to the unwary when they happen. In a broader context, the defensive programmer should always fully document and test any assumptions about the underlying uniqueness of the column data. The following examples demonstrate what can happen to perfectly correct code when changes are made to the underlying unique or primary keys, thus invalidating assumptions in the code regarding the uniqueness of the column data. We'll then discuss how a query against the system views, or use of @@ROWCOUNT, can detect if such assumptions are still true.

In Listing 1, we create a table, Customers, using a UNIQUE constraint to guarantee the uniqueness of phone numbers, and then populate it with some test data.

Listing 1. Creating the Customers table, with a UNIQUE constraint on the PhoneNumber column.

We need to implement a simple stored procedure, shown in Listing 2, which will allow users to find a customer based on their phone number, and set their customer status (regular, preferred, or VIP). If no customer exists for a given phone number, we don't need to raise an exception; we simply do nothing.

Listing 2. The SetCustomerStatus stored procedure, which finds a customer by phone number and sets their status.

This implementation assumes that at most one customer has any given phone number. Clearly, right now, this assumption is true as it is guaranteed by the UNQ_Customers constraint.

Suppose, however, that at some later time we need to store data about customers from different countries. At this point, the phone number alone no longer uniquely indentifies a customer, but the combination of country code and phone number does. In order to accommodate this requirement, our Customers table is altered to add the new column, CountryCode, and our UNQ_Customers constraint is modified so that it enforces uniqueness based on a combination of the CountryCode and PhoneNumber columns. These alterations are shown in Listing 3.

Listing 3. Adding a CountryCode column to the table and to the unique constraint.

Note that, in reality, we should have added a lookup table, dbo.CountryCodes, referred to by a FOREIGN KEY constraint. However, I've avoided a lookup table in this case, in favor of keeping the example simple.

At this point, our constraint is no longer enforcing the uniqueness of values in the PhoneNumber column, so we can insert a customer with an identical phone number to an existing customer, but with a different country code, as shown in Listing 4.

Listing 4. Wayne Miller has the same phone number as Darrell Ling, but with a different country code.

Our Stored procedure, however, is still working on the assumption that a customer can be uniquely identified by their phone number alone. Since this assumption is no longer valid, the stored procedure, in its current form, could erroneously update more than one row of data, as demonstrated in Listing 5.

Listing 5. The unchanged stored procedure modifies two rows instead of one.

Perhaps the most prevalent and damaging mistake made during the development of SQL code is a failure to define or recognize the assumptions on which the implementation relies. The result, as demonstrated here, is code that is brittle, and liable to behave unpredictably when these assumptions are invalidated by changes to the underlying database objects.

Of course, the most obvious lesson to be learned here is that whenever we change our unique and/or primary keys, we need to review all the procedures that depend on the modified tables. However, the manual process of reviewing the potentially affected code is, like all manual processes, slow and prone to error. It may be more efficient to automate the process of identifying the modules that rely on particular assumptions about the underlying schema. Unit tests allow us to accomplish exactly that; we can easily, for example, write a unit test that succeeds if there is a UNIQUE constraint on the PhoneNumber column alone, and fails when this is no longer the case.

Using unit tests to document and test assumptions

Let's translate the assumption that the PhoneNumber column uniquely identifies a customer into a query against the system views. The query is rather complex, so we'll develop it in several steps. First of all, we need to know if there are any constraints on the PhoneNumber column, as shown in Listing 6.

Listing 6. Step 1, a query to check for constraints on PhoneNumber.

This query returns 1, confirming that there is a constraint on that column. Next, we need to verify that the constraint is either a primary key or a unique constraint:

Listing 7. Step 2 determines if the constraint on column PhoneNumber is a primary key or unique.

Finally, we need to make sure that no other columns are included in that UNIQUE or PRIMARY KEY constraint, as follows:

Listing 8. Step 3, the final query determines whether there is a unique or primary key constraint that is built on only the PhoneNumber column.

When we run this query against the original database schema, with a UNIQUE constraint on the PhoneNumber column, it returns a value of 1 indicating that there is indeed a constraint built only on the PhoneNumber column. However, when we run it after the column CountryCode has been added to the definition of the unique constraint, the second subquery returns the value 2, which means that the UNIQUE constraint UNQ_Customers is built on two columns, and so the outer query returns a value of 0.

In short, this query provides us with a means to verify the validity of the assumption that the PhoneNumber column uniquely identifies a customer. By incorporating this query into our unit test harness, we can accomplish two goals:

  • our assumption is documented – the code in Listing 8 clearly documents the fact that the dbo.SetCustomerStatus stored procedure needs a unique or primary constraint on a single column, PhoneNumber

  • our assumption is tested – if the required constraint is dropped, or includes more than one column, we shall get a clear warning, because the unit test will fail.

Of course, we should wrap this query in a stored procedure and reuse it, because there will surely be other cases when we rely on the uniqueness of a column used in our search condition.

We can use a similar technique to verify whether or not a combination of columns, considered together, are guaranteed to be unique. Implementing this query is left as an exercise to the reader.

Using @@ROWCOUNT to verify assumptions

Alternatively, instead of documenting our assumption as a unit test, we can have our stored procedure detect how many rows it modified, and roll back if it updated more than one row, as shown in Listing 9.

Listing 9. A stored procedure that will not modify more than one row.

To see it in action, run Listing 10; the stored procedure raises an error and does not modify the data.

Listing 10. Testing the altered stored procedure.

In general, this approach could be useful although, in this particular case, it is less preferable than a unit test. The reason is very simple: a unit test will alert us about a problem before deployment, allowing us fix the problem early, and to deploy without this particular bug. The altered stored procedure might not indicate a problem until the code has been deployed to production, which means troubleshooting a production system and redeploying a fix; a situation we usually want to avoid.

Using SET instead of SELECT when assigning variables

In the previous chapter, we discussed how important it is to understand the different behavior of SET and SELECT when assigning values to variables. That same knowledge will help you write application code that is resistant to changes to the underlying schema objects.

Let's consider a second example whereby a search condition contains an implied assumption regarding the uniqueness of the underlying data column. The search condition, and subsequent variable assignment, shown in Listing 11 assumes, again, that the PhoneNumber column can uniquely identify a customer.

Listing 11. Unpredictable variable assignment, using SELECT.

In our original database schema, before we added CountryCode column to the Customers table, the result of this assignment was predictable. However, in our new schema, the UNQ_Customers constraint only guarantees the uniqueness of the values in the PhoneNumber and CountryCode columns, considered together. As a result, we have two customers with this phone number and so the variable assignment is unpredictable; we do not, and cannot, know which of the two CustomerId values, 1 or 3, will populate the variable.

In most cases, such ambiguity is not acceptable. The simplest fix is to use SET instead of SELECT to populate the variable, as shown in Listing 12.

Listing 12. Whereas SELECT ignores the ambiguity, SET detects it and raises an error.


Other  
 
Video
Top 10
SG50 Ferrari F12berlinetta : Prancing Horse for Lion City's 50th
The latest Audi TT : New angles for TT
Era of million-dollar luxury cars
Game Review : Hearthstone - Blackrock Mountain
Game Review : Battlefield Hardline
Google Chromecast
Keyboards for Apple iPad Air 2 (part 3) - Logitech Ultrathin Keyboard Cover for iPad Air 2
Keyboards for Apple iPad Air 2 (part 2) - Zagg Slim Book for iPad Air 2
Keyboards for Apple iPad Air 2 (part 1) - Belkin Qode Ultimate Pro Keyboard Case for iPad Air 2
Michael Kors Designs Stylish Tech Products for Women
REVIEW
- First look: Apple Watch

- 3 Tips for Maintaining Your Cell Phone Battery (part 1)

- 3 Tips for Maintaining Your Cell Phone Battery (part 2)
Popular Tags
Video Tutorail Microsoft Access Microsoft Excel Microsoft OneNote Microsoft PowerPoint Microsoft Project Microsoft Visio Microsoft Word Active Directory Exchange Server Sharepoint Sql Server Windows Server 2008 Windows Server 2012 Windows 7 Windows 8 Adobe Flash Professional Dreamweaver Adobe Illustrator Adobe Photoshop CorelDRAW X5 CorelDraw 10 windows Phone 7 windows Phone 8 Iphone