SQL Server 2005 XQuery in Action
SQL Server 2005 has a standards-based implementation of XQuery that directly supports XQuery functions on the XML data type. It supports XQuery by using five methods of the XML data type:
xml.exist()
Uses XQuery input to return 0, 1, or NULL, depending on the result of
the query. It returns 0 if no elements match, 1 if there is a match, and
NULL if there is no XML data on which to query. This method is often
used for query predicates.
xml.value() Accepts an XQuery as input and returns an SQL Server scalar type.
xml.query() Accepts an XQuery as input and returns an XML data type stream as output.
xml.nodes()
Accepts an XQuery as input and returns a single-column rowset from the
XML document. In essence, it shreds XML into multiple smaller XML
results.
xml.modify() Allows you to insert, delete, or modify nodes or sequences of nodes in an XML data type instance using XQuery Data Manipulation Language.
We will discuss the methods of the XML
data type shortly. But first we must create some sample data. We will
create a simple table that contains speakers at a software developers
conference and the corresponding classes they will teach. Usually you
normalize the data and have a one-to-many relationship between a
speakers table and the classes table. Instead of using an additional
normalized table, we will model this as one table with the speaker’s
information and one XML column with the speaker’s classes. In the real
world, you might encounter this scenario when in a back-office database
you have the speaker and his classes represented in a series of
one-to-many tables. Then for the Web database, you might “publish” a
database on a frequent time interval (like a reporting database) or
transform normalized data and use the XML column for easy HTML display
(or XSLT transformations).
We first create an
XSD schema (for reasons that will soon become clear) for our XML column.
This schema will define the type of XML allowed in the column,
including the XML data types and required properties for particular XML elements:
use AdventureWorks
go
CREATE xml schema collection dbo.classes_xsd
As
'<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="class">
<xs:complexType>
<xs:attribute name="name" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
<xs:element name="classes">
<xs:complexType>
<xs:sequence>
<xs:element ref="class" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>'
Next we create our table, tblSpeakers. Notice that the XML column, Speaker_XML, uses the classes_xsd XSD schema described earlier.
Create Table tblSpeakers
(
Speaker_ID Integer Primary Key Identity,
Speaker_NM nVarChar(50),
Speaker_Country nVarChar(25),
Speaker_XML XML (classes_xsd) Not Null)
XQuery runs more
efficiently when there is an XML index on the XML column. As you learned
earlier, an XML index works only if there is a primary key constraint
on the table (which we have). The code here creates a primary and then a
structural (PATH) index because our examples do a lot of where restrictions on the values of particular elements.
--XML Index: Primary
CREATE Primary XML INDEX idx_1
ON tblSpeakers (Speaker_XML)
--PATH
CREATE XML INDEX idx_a
ON tblSpeakers (Speaker_XML)
USING XML INDEX idx_1 FOR PATH
Now that we have our
index, remember that XQuery works more efficiently if it is strongly
typed, so you should always use an XSD schema on your XML column for the
best performance. Without an XSD schema, the SQL Server 2005 XQuery
engine assumes that everything is untyped and treats it as string data.
Lastly, we need to get some data into the table by using some T-SQL INSERT statements. The last INSERT, ‘Bad Speaker’, will fail because it violates the classes_xsd schema and does not contain a <classes> element:
Insert into tblSpeakers Values('Stephen Forte', 'USA',
'
<classes>
<class name="Writing Secure Code for ASP .NET "/>
<class name="Using XQuery to Query and Manipulate XML Data in SQL Server 2005"/>
<class name="SQL Server and Oracle Working Together"/>
<class name="Protecting against SQL Injection Attacks "/>
</classes>
'
)
Insert into tblSpeakers Values('Richard Campbell', 'Canada',
'
<classes>
<class name="SQL Server Profiler"/>
<class name="Advanced SQL Querying Techniques"/>
<class name="SQL Server and Oracle Working Together"/>
<class name="T-SQL Error Handling in Yukon"/>
</classes>
'
)
Insert into tblSpeakers Values('Tim Huckaby', 'USA',
'
<classes>
<class name="Smart Client Stuff"/>
<class name="More Smart Client Stuff"/>
</classes>
'
)
Insert into tblSpeakers Values('Malek Kemmou', 'Morocco',
'
<classes>
<class name="SmartPhone 2005"/>
<class name="Office System 2003"/>
</classes>
'
)
Insert into tblSpeakers Values('Goksin Bakir', 'Turkey',
'
<classes>
<class name="SmartPhone 2005"/>
<class name="Office System 2003"/>
</classes>
'
)
Insert into tblSpeakers Values('Clemens F. Vasters', 'Germany',
'
<classes>
<class name="SOA"/>
<class name="FABRIQ"/>
</classes>
'
)
Insert into tblSpeakers Values('Kimberly L. Tripp', 'USA',
'
<classes>
<class name="SQL Server Index"/>
<class name="SQL Precon"/>
</classes>
'
)
Insert into tblSpeakers Values('Bad Speaker', 'France',
'
<CLASSES>
<class name="SQL Server Index"/>
<class name="SQL Precon"/>
</CLASSES>
'
)
Now
that we have our data, it is time to start writing some XQuery
expressions in SQL Server 2005. To do this, we will use the
aforementioned methods of the XML data type inside a regular T-SQL query.
xml.exist()
Having XML in the
database is almost useless unless you can query the elements and
attributes of the XML data natively. XQuery becomes very useful when you
can use it to search based on the values of a particular element or
attribute. The xml.exist()
function accepts an XQuery as input and returns 0, 1, or NULL,
depending on the result of the query; 0 is returned if no elements
match, 1 is returned if there is a match, and NULL is returned if there
is no data to query on. For example, we will see if a node exists in
this particular XML string of classes.
DECLARE @XML xml
Set @XML='
<classes>
<class name="SQL Server Index"/>
<class name="SQL Precon"/>
</classes>
'
Select @XML.exist('/classes')
The code returns 1 because
the “classes” element exists in the XML variable. If you change the
XQuery expression to search for an XML node that does not exist (Select @XML.exist (’/dogs), for example), it will return 0. You can see this in action as part of a CHECK CONSTRAINT. SQL Server does not allow you to use an xml.exist as part of a CHECK CONSTRAINT. You have to first create a user-defined function (UDF) to perform the action. This UDF accepts an XML field and returns the value of an xml.exist() method looking for an instance of <Orders>:
USE AdventureWorks
GO
CREATE FUNCTION dbo.DoesOrderXMLDataExist
(@XML XML)
RETURNS bit
AS
BEGIN
RETURN @XML.exist('/Orders')
END;
GO
To use this UDF as a CHECK CONSTRAINT, just create a table and pass to the UDF you just created the column you want to apply the constraint to.
--create the table using the function
CREATE TABLE OrdersXMLCheck
(OrderDocID INT PRIMARY KEY,
xOrders XML NOT NULL Default '<Orders/>'
CONSTRAINT xml_orderconstraint
CHECK(dbo.DoesOrderXMLDataExist(xOrders)=1))
You will most likely use the return value of xml.exist() (0, 1, or NULL) as part of a T-SQL WHERE
clause. Think about it: You can run a T-SQL query and restrict the
query on a value of a particular XML element! Going back to our main
example, let’s look for the value of ‘SQL Server and Oracle Working Together’ in the <class> element. Here is the XQuery expression to do this:
/classes/class[@name="SQL Server and Oracle Working Together"]
This is how you put it to work:
Select * From tblSpeakers
Where Speaker_XML.exist('/classes/
class[@name="SQL Server and Oracle Working Together"]')=1
The results look like this:
1 Stephen Forte USA < classes>data</classes>
2 Richard Campbell Canada <classes/>data</classes>
The XML returned in these results look like this for Stephen:
<classes>
<class name="Writing Secure Code for ASP .NET " />
<class name="Using XQuery to Query and Manipulate XML Data in SQL Server 2005" />
<class name="SQL Server and Oracle Working Together" />
<class name="Protecting against SQL Injection Attacks " />
</classes>
xml.value()
The xml.value()
function takes a valid XQuery expression and returns an SQL Server
scalar value that you specify. For example, let’s say you have an XML
column and inside of a T-SQL query you want to return some data from the
XML as an intrinsic SQL data type. You call the xml.value()
function on that column by passing in the XQuery expression and the
data type you want to convert the output to. This requires you to know
and understand the data in your XML column. Here’s an example of the
syntax against our current XML document in the database:
xml.value('/classes[1]/class[1]/@name', 'varchar(40)')
This XQuery contains an XPath expression that navigates the first class’s name attribute and a cast to varchar(40).
You must
perform an XQuery expression on an XML column as part of a regular T-SQL
query, as shown here. What is cool is that SQL Server combines both
relational queries and XQuery in one query because in this example we
use a traditional T-SQL WHERE clause to show only speakers from the USA.
USE AdventureWorks
select Speaker_ID, Speaker_NM,
Speaker_Country,
Speaker_XML.value('/classes[1]/class[1]/@name', 'varchar(40)') as Sessions
From tblSpeakers
where speaker_country ='USA'
The results are shown here:
Speaker_ID Speaker_NM Speaker_Country Session
------------------------------------------------------------------------------------
1 Stephen Forte USA Writing Secure Code for ASP .NET
3 Tim Davis USA Smart Client Stuff
7 Kimberly Smith USA SQL Server Index
(3 row(s) affected)
Let’s dissect the XQuery expression. As you’ll recall from our earlier listing, the XML for Stephen looks like this:
<classes>
<class name="Writing Secure Code for ASP .NET "/>
<class name="Using XQuery to Query and Manipulate XML Data in SQL Server 2005"/>
<class name="SQL Server and Oracle Working Together"/>
<class name="Protecting against SQL Injection Attacks "/>
</classes>
The following XQuery path expression returns the value of the first class’s name attribute (Writing Secure Code for ASP.NET) as a varchar(40). So in the preceding query, the XQuery expression is placed in the value()
method of the XML column (Speaker_XML in our T-SQL query). The
following XQuery expression does all the work of getting the first
class’s name:
/classes[1]/class[1]/@name
This approach is
useful when you want to pull standard data out of the XML column and
display it as regular scalar SQL Server data.
xml.query()
The xml.query() function works much like the xml.value() function, except it returns an XML
data type value, so you have a lot more flexibility. It is useful only
if you want the end result of the column in the query to be XML; if you
want scalar data, use xml.value().
If you want to return
the same data as in the previous example, but you want to present the
summary column in XML format, run the exact same query except with the xml.query():
--xml.query
--returns XML data type
--same as previous example but returns XML
select Speaker_ID, Speaker_NM,
Speaker_Country,
Speaker_XML.query('/classes[1]/class[1]') as Sessions
From tblSpeakers
where speaker_country ='USA'
XML.query()
works by passing in an XQuery expression that will result in XML
output. The XQuery expression can return a single element, all the
elements, or use a RETURN (the R in FLOWR)
expression to completely transform the results. In this example, the
first instance of the class element is returned. The results of the
“sessions” column for Stephen’s records are the same as in the previous
example except that the Sessions column is now formatted in XML:
<class name="Writing Secure Code for ASP .NET " />
Instead of using xml.value(), you can return a larger XML result of many nodes by leaving out the [1] ordinal position indicators in your path expression.
--same as previous but returns all
select Speaker_ID, Speaker_NM,
Speaker_Country,
Speaker_XML.query('/classes/class') as Sessions
From tblSpeakers
where speaker_country ='USA'
The results are the same except for Stephen’s classes; we get all of the XML results as an XML column:
<class name="Writing Secure Code for ASP .NET " />
<class name="Using XQuery to Query and Manipulate XML Data in SQL Server 2005" />
<class name="SQL Server and Oracle Working Together" />
<class name="Protecting Against SQL Injection Attacks" />
You can gain further control over your XQuery path expression by using the FLOWR expressions. For example, you can write an expression like this:
for $b in /classes/class
return ($b)
The expression uses the for and return
keywords to loop through all of the class elements and return the
values, and it yields the same results as the preceding example, which
has ’/classes/class’
as its XQuery expression. (Of course, you can come up with much more
interesting examples.) You can incorporate this expression into your
T-SQL query:
Select Speaker_ID, Speaker_NM, Speaker_Country, Speaker_XML.query('
for $b in /classes/class
return ($b)
') As Sessions
From tblSpeakers
Let’s say you want to have more control over the XML output using xml.query() or xml.value() as well as combining the XQuery and a traditional T-SQL WHERESELECT and WHERE clauses of a single query: clause to show only the speakers in the United States. You can use an XML method in both the
Select Speaker_ID, Speaker_NM, Speaker_Country, Speaker_XML.query('/classes/
class') As Sessions
From tblSpeakers
Where Speaker_Country='USA'
and Speaker_XML.exist('/classes/
class[@name="SQL Server and Oracle Working Together"]')=1
The T-SQL WHERE clause restricts on the Speaker_Country column, and our XQuery expression filters only for the <class> element we are interested in.
<class name="Writing Secure Code for ASP .NET " />
<class name="Using XQuery to Query and Manipulate XML Data in SQL Server 2005" />
<class name="SQL Server and Oracle Working Together" />
<class name="Protecting against SQL Injection Attacks " />
xml.nodes()
The xml.nodes() method takes an XQuery expression just like exist(), value(), and query(). xml.nodes() then returns instances of a special XML
data type, each of which has its context set to a different node that
the XQuery expression you supplied evaluates to. The special XML data
type is used for subqueries of XML data in your result set.
SQL Server 2005 XQuery Extensions
Microsoft has made
some extensions to the XQuery specification that are implemented in SQL
Server 2005 to aid the developer in tasks such as referencing SQL data
inside of your XQuery expression to XML DML statements.
Sql:column()
The sql:column()
function, which is specific to SQL Server 2005 XQuery, allows the
XQuery expression to refer to a column in the current row of the
dataset. The syntax is {sql:column(“Column Name”)}, as shown here:
for $b in /classes/class
where $b/@name="SQL Server and Oracle Working Together"
return (<Sessions>{$b}<Speaker id="{sql:column("Speaker_ID")}">
{sql:column("Speaker_NM")}</Speaker></Sessions>)
Bringing it all together, let’s create an attribute called Speaker id
with the value from the Speaker_ID column for the current row. The
<Sessions> node is a node constructor, which means you can create a
node as part of your return in your for expression:
Select Speaker_ID, Speaker_NM, Speaker_Country, Speaker_XML.query('
for $b in /classes/class
where $b/@name="SQL Server and Oracle Working Together"
return (<Sessions>{$b}<Speaker id="{sql:column("Speaker_ID")}">
{sql:column("Speaker_NM")}</Speaker></Sessions>)
') As Sessions
From tblSpeakers
Where Speaker_XML.exist('/classes/
class[@name="SQL Server and Oracle Working Together"]')=1
In the XML result for Stephen’s record, shown next, notice that our xml.query expression in our SELECT is in tune with our T-SQL XQuery WHERE clause, and we limit our results to show the same element as in our WHERE clause:
<Sessions>
<class name="SQL Server and Oracle Working Together" />
<Speaker id="1">Stephen Forte</Speaker>
</Sessions>
Sql:variable()
The sql:variable() function, which is unique to the SQL Server implementation of XQuery, works exactly like sql:column() but refers to a variable, not a value of a column in a row. You only have a single value for the current query.
Declare @Conf varchar(20)
Set @Conf= 'NYC .NET User Group'
Select Speaker_ID, Speaker_NM, Speaker_XML.query('
for $b in /classes/class
where $b/@name="SQL Server and Oracle Working Together"
return (<Sessions
conference="{sql:variable("@Conf")}">
{$b}<Speaker id="{sql:column("Speaker_ID")}">
{sql:column("Speaker_NM")}</Speaker></Sessions>)
') As Sessions
From tblSpeakers
Where Speaker_XML.exist('/classes/
class[@name="SQL Server and Oracle Working Together"]')=1
The results look like this:
<Sessions conference="NYC .NET User Group">
<class name="SQL Server and Oracle Working Together" />
<Speaker id="1">Stephen Forte</Speaker>
</Sessions>