One of the most useful standards
in this family is XML Schema. XML Schema defines the
rules to which a specific XML document should conform, such as the
allowable elements and attributes, the order of elements, and the data
type of each element. You define these requirements in an XML Schema
document (XSD).
When you're creating an XML file on your own, you
don't need to create a corresponding XSD file—instead, you might just
rely on the ability of your code to behave properly. Although this is
sufficient for tightly controlled environments, if you want to open your
application to other programmers or allow it to interoperate with other
applications, you should create an XSD. Think of it this way: XML
allows you to create a custom language for storing data, and XSD allows
you to define the syntax of the language you create.
1. XML Namespaces
Before you can create an XSD, you'll need to understand one other XML standard, called XML namespaces.
The core idea behind XML namespaces is that every XML
markup language has its own namespace, which is used to uniquely
identify all related elements. Technically, namespaces disambiguate
elements by making it clear what markup language they belong to. For
example, you could tell the difference between your SuperProProductList
standard and another organization's product catalog because the two XML
languages would use different namespaces.
Namespaces are particularly useful in compound
documents, which contain separate sections, each with a different type
of XML. In this scenario, namespaces ensure that an element in one
namespace can't be confused with an element in another namespace, even
if it has the same element name. Namespaces are also useful for
applications that support different types of XML documents. By examining
the namespace, your code can determine what type of XML document it's
working with and can then process it accordingly.
NOTE
XML namespaces aren't related to .NET namespaces.
XML namespaces identify different XML languages. .NET namespaces are a
code construct used to organize types.
Before you can place your XML elements in a
namespace, you need to choose an identifying name for that namespace.
Most XML namespaces use Uniform Resource Identifiers (URIs).
Typically, these URIs look like a web page URL. For example, http://www.mycompany.com/mystandard
is a typical name for a namespace. Though the namespace looks like it
points to a valid location on the Web, this isn't required (and
shouldn't be assumed).
URIs are used for XML namespaces
because they are more likely to be unique. Typically, if you create a
new XML markup, you'll use a URI that points to a domain or website you
control. That way, you can be sure that no one else is likely to use
that URI. For example, the namespace http://www.SuperProProducts.com/SuperProProductList is much more likely to be unique than just SuperProProductList if you own the domain www.SuperProProducts.com.
Namespace names must match exactly. If you change the
capitalization in part of a namespace, add a trailing / character, or
modify any other detail, it will be interpreted as a different namespace
by the XML parser.
To specify that an element belongs to a specific
namespace, you simply need to add the xmlns attribute to the start tag
and indicate the namespace. For example, the <Price> element shown
here is part of the http://www.SuperProProducts.com/SuperProProductList namespace:
<Price xmlns="http://www.SuperProProducts.com/SuperProProductList">
49.33
</Price>
If you don't take this step, the element will not be part of any namespace.
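To see how the exact-match rule plays out, consider the following sketch. The three <Price> elements look almost identical, but a parser treats them as belonging to three different namespaces, because the namespace names differ in capitalization or in a trailing / character. The <Prices> wrapper element is hypothetical and is included only so the fragment is a well-formed document.

```xml
<?xml version="1.0"?>
<!-- <Prices> is a hypothetical wrapper element that has no namespace. -->
<Prices>
  <!-- The namespace used throughout this section. -->
  <Price xmlns="http://www.SuperProProducts.com/SuperProProductList">49.33</Price>
  <!-- Different capitalization, so this is a different namespace. -->
  <Price xmlns="http://www.SuperProProducts.com/superproproductlist">49.33</Price>
  <!-- Trailing / character, so this is yet another namespace. -->
  <Price xmlns="http://www.SuperProProducts.com/SuperProProductList/">49.33</Price>
</Prices>
```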
It would be cumbersome if you needed to type the full
namespace URI every time you wrote an element in an XML document.
Fortunately, when you assign a namespace in this fashion, it becomes the
default namespace for all child
elements. For example, in the XML document shown here, the
<SuperProProductList> element and all the elements it contains are
placed in the http://www.SuperProProducts.com/SuperProProductList namespace:
<?xml version="1.0"?>
<SuperProProductList
    xmlns="http://www.SuperProProducts.com/SuperProProductList">
  <Product>
    <ID>1</ID>
    <Name>Chair</Name>
    <Price>49.33</Price>
    <Available>True</Available>
    <Status>3</Status>
  </Product>
  <!-- Other products omitted. -->
</SuperProProductList>
In compound documents, you'll have markup from more
than one XML language, and you'll need to place different sections into
different namespaces. In this situation, you can use namespace prefixes to sort out the different namespaces.
Namespace prefixes are short character sequences that
you can insert in front of a tag name to indicate its namespace. You
define the prefix in the xmlns attribute by inserting a colon (:)
followed by the characters you want to use for the prefix. Here's the
SuperProProductList document rewritten to use the prefix super:
<?xml version="1.0"?>
<super:SuperProProductList
    xmlns:super="http://www.SuperProProducts.com/SuperProProductList">
  <super:Product>
    <super:ID>1</super:ID>
    <super:Name>Chair</super:Name>
    <super:Price>49.33</super:Price>
    <super:Available>True</super:Available>
    <super:Status>3</super:Status>
  </super:Product>
  <!-- Other products omitted. -->
</super:SuperProProductList>
Namespace prefixes are simply used to map an element
to a namespace. The actual prefix you use isn't important as long as it
remains consistent throughout the document. By convention, the
attributes that define XML namespace prefixes are usually added to the
root element of an XML document.
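As a rough illustration of a compound document, the following sketch mixes SuperProProductList markup with markup from a second XML language. The order namespace (http://www.example.com/Orders) and its element names are invented for this example and aren't part of any real standard. Notice that both languages define a Name element, and the prefixes keep the two apart.

```xml
<?xml version="1.0"?>
<!-- The "order" namespace URI and its elements are hypothetical. -->
<order:Order
    xmlns:order="http://www.example.com/Orders"
    xmlns:super="http://www.SuperProProducts.com/SuperProProductList">
  <!-- The customer's name, in the order namespace. -->
  <order:Name>Joe Smith</order:Name>
  <order:Items>
    <super:Product>
      <!-- The product's name, in the SuperProProductList namespace. -->
      <super:Name>Chair</super:Name>
      <super:Price>49.33</super:Price>
    </super:Product>
  </order:Items>
</order:Order>
```

Both prefixes are declared on the root element, following the convention described earlier.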
Although the xmlns attribute looks like an ordinary
XML attribute, it isn't. The XML parser interprets it as a namespace
declaration. (The reason XML namespaces use XML attributes is a
historical one. This design ensured that old XML parsers that didn't
understand namespaces could still read newer XML documents that use
them.)
NOTE
Attributes act a little differently than elements
when it comes to namespaces. You can use namespace prefixes with both
elements and attributes. However, attributes don't pay any attention to
the default namespace of a document. That means if you don't add a
namespace prefix to an attribute, the attribute will not be placed in the default namespace. Instead, it will have no namespace.
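The following sketch shows the difference. It assumes the product ID is stored in an attribute rather than a child element, and it declares the same namespace twice (once as the default namespace and once with the super prefix) purely so both cases can appear side by side.

```xml
<?xml version="1.0"?>
<SuperProProductList
    xmlns="http://www.SuperProProducts.com/SuperProProductList"
    xmlns:super="http://www.SuperProProducts.com/SuperProProductList">
  <!-- <Product> is in the default namespace, but the unprefixed ID
       attribute has no namespace. -->
  <Product ID="1" />
  <!-- Here the prefixed super:ID attribute is in the
       SuperProProductList namespace. -->
  <Product super:ID="2" />
</SuperProProductList>
```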
1.1. Writing XML Content with Namespaces
You can use the XmlTextWriter and XDocument classes you've already learned about to create XML content that uses a namespace.
The XmlTextWriter includes an overloaded version of
the WriteStartElement() method that accepts a namespace URI. Here's how
it works:
Dim ns As String = "http://www.SuperProProducts.com/SuperProProductList"
w.WriteStartDocument()
w.WriteStartElement("SuperProProductList", ns)
' Write the first product.
w.WriteStartElement("Product", ns)
...
The only trick is remembering to use the namespace for every element.
The XDocument class deals with namespaces using a
similar approach. First, you define an XNamespace object. Then, you add
this XNamespace object to the beginning of the element name every time
you create an XElement (or an XAttribute) that you want to place in that
namespace. Here's an example:
Dim ns As XNamespace = "http://www.SuperProProducts.com/SuperProProductList"
Dim doc As New XDocument( _
    New XDeclaration("1.0", Nothing, "yes"), _
    New XComment("Created with the XDocument class."), _
    New XElement(ns + "SuperProProductList", _
        New XElement(ns + "Product", _
            New XAttribute("ID", 1), _
            New XAttribute("Name", "Chair"), _
            New XElement(ns + "Price", "49.33")), _
        ...
You may also need to change your XML reading code. If
you're using the straightforward XmlTextReader, life is simple, and
your code will work without any changes. If necessary, you can use the
XmlTextReader.NamespaceURI property to get the namespace of the current
element (which is important if you have a compound document that fuses
together elements from different namespaces).
If you're using the XDocument class, you need to take
the XML namespace into account when you search the document. For
example, when using the XElement.Element() method, you must supply the
fully qualified element name by adding the appropriate XNamespace
object to the string with the element name:
Dim ns As XNamespace = "http://www.SuperProProducts.com/SuperProProductList"
...
Dim superProProductListElement As XElement = doc.Element(ns + "SuperProProductList")
NOTE
Technically, you don't need to use the XNamespace
class, although it makes your code clearer. When you add the XNamespace
to an element name string, the namespace is simply wrapped in curly
braces. In other words, when you combine the namespace http://www.somecompany.com/DVDList with the element name Title, it's equivalent to the string
{http://www.somecompany.com/DVDList}Title. This syntax works because the curly brace characters aren't
allowed in ordinary element names, so there's no possibility for
confusion.
2. XML Schema Definition
An XSD, or schema,
defines what elements and attributes a document should contain and the
way these nodes are organized (the structure). It can also identify the
appropriate data types for all the content. Schema documents are written
using an XML syntax with specific element names. All the XSD elements
are placed in the http://www.w3.org/2001/XMLSchema namespace. Often, this namespace uses the prefix xsd: or xs:, as in the following example.
The
following is a slightly abbreviated SuperProProductList.xsd file that
defines the rules for SuperProProductList documents:
<?xml version="1.0"?>
<xs:schema
    targetNamespace="http://www.SuperProProducts.com/SuperProProductList"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" >
  <xs:element name="SuperProProductList">
    <xs:complexType>
      <xs:sequence maxOccurs="unbounded">
        <xs:element name="Product">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Price" type="xs:double" />
            </xs:sequence>
            <xs:attribute name="ID" use="required" type="xs:int" />
            <xs:attribute name="Name" use="required" type="xs:string" />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
At first glance, this markup looks a bit
intimidating. However, it's actually not as complicated as it looks.
Essentially, this schema indicates that a SuperProProductList document
consists of a list of <Product> elements. Each <Product>
element is a complex type made up of a string (Name), a floating-point value
(Price), and an integer (ID). This example uses the second version of
the SuperProProductList document to demonstrate how to use attributes in
a schema file.
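For comparison, here's a sketch of a document that should satisfy the schema as it's shown above. The product values are illustrative only. The elements are placed in the target namespace, while the ID and Name attributes carry no prefix, because the schema doesn't ask for attributes to be namespace-qualified.

```xml
<?xml version="1.0"?>
<SuperProProductList
    xmlns="http://www.SuperProProducts.com/SuperProProductList">
  <!-- Each <Product> has the required ID and Name attributes
       and exactly one <Price> element. -->
  <Product ID="1" Name="Chair">
    <Price>49.33</Price>
  </Product>
  <Product ID="2" Name="Car">
    <Price>43399.55</Price>
  </Product>
</SuperProProductList>
```

A document that omitted the Name attribute, or supplied a non-numeric ID, would fail validation against this schema.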
2.1. Dissecting the Code . . .
By examining the SuperProProductList.xsd schema, you can learn a few important points:
Schema documents use their own form of XML
markup. In the previous example, you'll quickly see that all the
elements are placed in the http://www.w3.org/2001/XMLSchema namespace using the xs namespace prefix.
Every schema document starts with a root <schema> element.
The
schema document must specify the namespace of the documents it can
validate. It specifies this detail with the targetNamespace attribute on
the root <schema> element.
The
elements inside the <schema> element describe the structure of the
target document. The <element> element represents an element,
while the <attribute> element represents an attribute. To find out
what the name of an element or attribute is, look at the name
attribute. For example, you can tell quite easily that the first
<element> has the name SuperProProductList. This indicates that
the first element in the validated document must be
<SuperProProductList>.
If an element can contain other elements or has attributes, it's considered a complex type. Complex types are represented in a schema by the <complexType> element. The simplest complex type is a sequence,
which is represented in a schema by the <sequence> element. It
requires that elements are always in the same order—the order that's set
in the schema document.
When defining
elements, you can define the maximum number of times an element can
appear (using the maxOccurs attribute) and the minimum number of times
it must occur (using the minOccurs
attribute). If you leave out these details, the default value of both is
1, which means that every element must appear exactly once in the
target document. Use a maxOccurs value of unbounded
if you want to allow an unlimited list. In the example schema, this setting
allows an unlimited number of <Product> elements in the
SuperProProductList catalog. However, the <Price> element must
occur exactly once in each <Product>.
When defining an attribute, you can use the use attribute with a value of required to make that attribute mandatory.
When
defining elements and attributes, you can specify the data type using
the type attribute. The XSD standard defines 44 data types that map
closely to the basic data types in .NET, including the double, int, and
string data types used in this example.
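To make the occurrence and use settings more concrete, here's a hypothetical variation of the <Product> definition. The Description element and the Status attribute aren't part of the SuperProProductList.xsd file shown earlier. They're assumptions added to show how an optional element and an optional attribute might be declared. The fragment is meant to sit inside a schema that binds the xs prefix, as the full example does.

```xml
<!-- Hypothetical variation of the <Product> definition. -->
<xs:element name="Product">
  <xs:complexType>
    <xs:sequence>
      <!-- No minOccurs or maxOccurs, so Price must appear exactly once. -->
      <xs:element name="Price" type="xs:double" />
      <!-- minOccurs="0" makes this (hypothetical) element optional. -->
      <xs:element name="Description" type="xs:string" minOccurs="0" />
    </xs:sequence>
    <xs:attribute name="ID" use="required" type="xs:int" />
    <xs:attribute name="Name" use="required" type="xs:string" />
    <!-- No use="required", so this (hypothetical) attribute can be omitted. -->
    <xs:attribute name="Status" type="xs:int" />
  </xs:complexType>
</xs:element>
```

Because the elements are inside a <sequence>, a Description that is present would still need to appear after Price.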