SQL Server 2005 introduces a new data type for working with XML data: the XML
data type. Using this new data type, you can store XML in its native
format, query the data within the XML, efficiently and easily modify
data within the XML without having to replace the entire contents, and
index the data in the XML. You can use it as any of the following:
There are some limitations of the XML
data type that you should be aware of. Although this data type can
contain and be checked for null values, unlike other native types, you
cannot directly compare an instance of an XML data type to another instance of an XML data type. Any such equality comparisons require first casting the XML type to a character type. This limitation also means that you cannot use ORDER BY or GROUP BY with an XML data type. There are several other restrictions, which we will discuss in more detail later.
These might seem like pretty severe restrictions, but they don’t really affect the XML data type when it is used appropriately. The XML data type also has a rich feature set that more than compensates for these limitations.
Working with the XML Data Type as a Variable
Let’s start by writing some code that uses the new XML
data type as a variable. As with any other T-SQL variable, you simply
declare it and assign data to it, as in the following example, which
uses a generic piece of XML to represent a sales representative’s data
as XML:
DECLARE @xmlData AS XML
Set @xmlData='
<Customers>
<CustomerID>CORZN</CustomerID>
<CompanyName>Corzen, Inc</CompanyName>
<ContactName>Stephen Forte</ContactName>
<ContactTitle>Sales Representative</ContactTitle>
<Address>5-9 Union Square West</Address>
<City>New York</City>
<PostalCode>10028</PostalCode>
<Country>USA</Country>
<Phone>030-0074321</Phone>
<Fax>030-0076545</Fax>
</Customers>
'
Select @xmlData
This basic example shows an XML variable being declared like any other native SQL Server data type by using the DECLARE statement. Then the variable is assigned a value. Oddly enough, a string of XML data is assigned to the XML
data type and the type parses it into XML. (Coincidentally, CLR-based
user-defined types also support and require this same parsing
functionality.) The example also checks that the XML is well formed,
such as by validating that an element’s start and end tags match not
only in name but also in case.
More Info
For more details on how to create well-formed XML, see Michael J. Young’s book XML Step by Step, Second Edition (Microsoft Press, 2002). |
The last statement returns the XML to the caller via a SELECT
statement, and the results appear with the XML in a single column in a
single row of data. Another benefit of having the database recognize
that you are working with XML and not raw text that happens to be XML is
that XML results in SQL Management Studio are displayed as a hyperlink,
which when clicked opens a new window displaying nicely formatted XML
(using the Microsoft Internet Explorer XML Template), as shown in Figure 1. But that’s not all: If you right-click on the XML results window’s title bar, you get a shortcut menu (Figure 2) with some useful options, such as the ability to save the XML to a file or to copy the full XML path.
Working with XML in Tables
As we stated earlier, you can also define a column in a table as XML, as shown in the following example:
USE AdventureWorks
GO
--create the table with the XML Datatype
CREATE TABLE OrdersXML
(OrderDocID INT PRIMARY KEY,
xOrders XML NOT NULL)
As we also stated earlier, the XML data type has a few other restrictions, in this case when used as a column in a table:
It cannot be used as a primary key.
It cannot be used as a foreign key.
It cannot be declared with a UNIQUE constraint.
It cannot be declared with the COLLATE keyword.
We stated earlier that you can’t compare two instances of the XML
data type. Primary keys, foreign keys, and unique constraints all
require that you must be able to compare any included data types;
therefore, XML cannot be used in any of those situations. The SQL Server
COLLATE statement is meaningless with the XML data type because SQL Server does not store the XML as text; rather, it uses a distinct type of encoding particular to XML.
Now let’s get some data
into the column. This example takes some simple static XML and inserts
it into the OrdersXML table we just created, using the XML data type as a variable.
USE AdventureWorks
GO
--Insert Static XML via a variable
DECLARE @xmlData AS XML
SET @xmlData = '
<Orders>
<Order>
<OrderID>5</OrderID>
<CustomerID>65</CustomerID>
<OrderAmount>25</OrderAmount>
</Order>
</Orders>'
--insert into the table
INSERT INTO OrdersXML (OrderDocID, xOrders) Values (1, @xmlData)
You can insert XML into
these columns in a variety of other ways: XML bulk load , loading from an XML variable (as shown
here), or loading from a SELECT statement using the FOR XML TYPE feature, which we will discuss shortly. Only
well-formed XML (including fragments) can be inserted; any attempt to
insert malformed XML will result in an exception, as shown in this
fragment:
--Fails because of the malformed XML
INSERT INTO OrdersXML (OrderDocID, xOrders) VALUES (3, '<nm>steve</NM>')
The results produce the following error from SQL Server:
Msg 9436, Level 16, State 1, Line 1
XML parsing: line 1, character 14, end tag does not match start tag
Defaults and Constraints
The XML data type can, like other data types, conform to nullability, defaults, and constraints. If you want to make a field required (NOT NULL) and provide a default value on your XML column, just specify this as you would for any other column.
USE AdventureWorks
GO
CREATE TABLE OrdersXML
(OrderDocID INT PRIMARY KEY,
xOrders XML NOT NULL DEFAULT '<Orders/>'
The following insert works because it relies on the default:
INSERT INTO OrdersXML (OrderDocID, xOrders) VALUES (2, DEFAULT)
Adding a default does not enforce just <Orders>
from being added. For example, we have specified an Orders node as the
default but have not yet indicated any way to enforce that. The
following insert works even if we only want <Orders> in our table because we have not declared a constraint on the column.
Insert Into OrdersXML (OrderDocID, xOrders) Values (3, '<blah>steve</blah>')