1 Character Data as XML
XML, in all its dialects, is stored ultimately as string (character) data. Before the xml data type, XML data could only be stored in SQL Server using ordinary string data types, such as varchar(max) and text,
and doing so raises several challenges. The first issue is validating
the XML that is persisted (and by this we mean validating the XML
against an XSD schema). SQL Server has no means of performing such a
validation using ordinary strings, so the XML data can’t be validated
except by an outside application, which can be a risky proposition (the
true power of a relational database management system, or RDBMS, is
applying rules at the server level).
The
second issue is querying the data. Sure, you could look for data using
character and pattern matching by using functions such as CharIndex or PatIndex, but these functions cannot efficiently or dependably find specific data in a structured XML
document. The developer could also implement full-text search, which
could also index the text data, but this solution would make things only
a little better while adding the overhead of the full-text search
engine. It would still be very difficult to extract data from a specific
attribute in a specific child element in the XML
content, and it certainly wouldn’t be very efficient. You would not be
able to write a query that said “Show me all data where the ‘Author’
attribute is set to ‘Lukas Keller’.”
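To make the fragility concrete, here is a minimal sketch of string-based searching over XML held in a varchar(max) variable. The sample documents and variable names are hypothetical, invented purely for illustration. As far as XML is concerned, both documents carry the same Author attribute, yet a pattern written for one quoting style misses the other, and a plain substring search cannot tell an attribute from element text.

```sql
-- Hypothetical sample data: two XML strings that differ only in attribute quoting
DECLARE @Doc1 varchar(max) = '<Book Author="Lukas Keller"><Note>Edited by Lukas Keller</Note></Book>';
DECLARE @Doc2 varchar(max) = '<Book Author=''Lukas Keller''><Note>Edited by Lukas Keller</Note></Book>';

-- The pattern is tied to double quotes, so it returns 0 (no match) for @Doc2
SELECT
    PATINDEX('%Author="Lukas Keller"%', @Doc1) AS Doc1AttributeMatch,
    PATINDEX('%Author="Lukas Keller"%', @Doc2) AS Doc2AttributeMatch;

-- A plain substring search returns only a character position;
-- it says nothing about whether the hit is an attribute or element text
SELECT CHARINDEX('Lukas Keller', @Doc1) AS FirstOccurrence;
```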
The third issue is modifying the XML data. The developer could simply replace the entire XML contents, which is not at all efficient, or use the UPDATETEXT statement to make in-place changes. However, UPDATETEXT
requires that you know the exact locations and length of data you are
going to replace, which, as we just stated, would be difficult and slow
to do.
SQL Server has supported persisting native XML data in the database
since SQL Server 2005, with powerful T-SQL extensions that
address all three of the aforementioned issues. Not only can SQL Server
persist native XML data in the database, but it can index the data,
query it using XPath and XQuery, and even modify it efficiently.
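As a rough preview of what those extensions look like, the sketch below runs the "Author" query described earlier against an xml variable and then changes a single attribute in place. The document shape, the Title attribute, and the variable name are assumptions made for illustration; exist() and modify() are methods of the xml data type.

```sql
-- Hypothetical sample document held in an xml variable
DECLARE @Doc xml = '
<Books>
  <Book Author="Lukas Keller" Title="First Draft" />
  <Book Author="Someone Else" Title="Other Book" />
</Books>';

-- Query: is there a Book element whose Author attribute is 'Lukas Keller'?
SELECT @Doc.exist('/Books/Book[@Author="Lukas Keller"]') AS HasAuthor;

-- Modify: change one attribute value without replacing the whole document
SET @Doc.modify('
  replace value of (/Books/Book[@Author="Lukas Keller"]/@Title)[1]
  with "Revised Title"');

SELECT @Doc AS ModifiedDoc;
```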
2 The xml Data Type
Using the xml
data type, you can store XML in its native format, query the data
within the XML, efficiently and easily modify data within the XML
without having to replace the entire contents, and index the data in the
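XML. You can use xml as any of the following:
A variable
A parameter in a stored procedure or a user-defined function (UDF)
A return value from a UDF
A column in a table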
There are some limitations of the xml
data type to be aware of. Like other native types, this data type can
contain and be checked for null values. Unlike them, however, you cannot directly
compare one instance of an xml data type to another. Any equality comparison requires first casting the xml type to a character type, such as nvarchar(max). This limitation also means that you cannot use ORDER BY or GROUP BY with an xml data type. There are several other restrictions, which we will discuss in more detail later.
These might seem like fairly severe restrictions, but they don’t really affect the xml data type when it is used appropriately. The xml data type also has a rich feature set that more than compensates for these limitations.
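The comparison workaround can be sketched as follows. The variable names and sample documents are hypothetical. Note that the cast compares the serialized text of the two instances, so this is a textual comparison and not a test of XML equivalence.

```sql
-- Hypothetical sample values
DECLARE @A xml = '<Orders><Order><OrderId>5</OrderId></Order></Orders>';
DECLARE @B xml = '<Orders><Order><OrderId>5</OrderId></Order></Orders>';

-- IF @A = @B ...  -- not allowed: xml instances cannot be compared directly

-- Cast both sides to a character type first, then compare the text
IF CAST(@A AS nvarchar(max)) = CAST(@B AS nvarchar(max))
    PRINT 'Serialized XML text matches';
ELSE
    PRINT 'Serialized XML text differs';

-- Checking for NULL works without any cast
IF @A IS NOT NULL
    PRINT '@A contains a value';
```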
2.1 Working with the xml Data Type as a Variable
Let’s start by writing some code that uses the xml data type as a variable. As with any other T-SQL variable, you simply declare it and assign data to it. Example 1 uses a generic piece of XML to represent basic order information.
Example 1. Creating XML and storing it in an xml variable using T-SQL.
DECLARE @XmlData AS xml = '
<Orders>
<Order>
<OrderId>5</OrderId>
<CustomerId>60</CustomerId>
<OrderDate>2008-10-10T14:22:27.25-05:00</OrderDate>
<OrderAmount>25.90</OrderAmount>
</Order>
</Orders>'
SELECT @XmlData
Example 1 shows an xml variable being declared and assigned like a variable of any other native SQL Server data type by using the DECLARE statement. The XML is then returned to the caller via a SELECT
statement, and the results appear with the XML in a single column in a
single row of data. Another benefit of having the database recognize
that you are working with XML (rather than raw text that happens to be XML) is that XML results in SQL Server Data Tools (SSDT) and SQL
Server Management Studio (SSMS) are rendered as a hyperlink. Clicking
the hyperlink then opens a new window displaying nicely formatted XML
with color-coding and collapsible/expandable nodes.
2.2 Working with XML in Tables
Now you will define an actual column as XML in a new AdventureWorks database table. Execute the code shown in Example 2 to create the new OrdersXML table.
Example 2. Creating a table to store XML in the database.
USE AdventureWorks2012
GO
CREATE TABLE OrdersXML(
OrdersId int PRIMARY KEY,
OrdersDoc xml NOT NULL DEFAULT '<Orders />')
GO
As we stated earlier, the xml data type has a few other restrictions—in this case, when it is used as a column in a table:
It cannot be used as a primary key.
It cannot be used as a foreign key.
It cannot be declared with a UNIQUE constraint.
It cannot be declared with the COLLATE keyword.
We also stated earlier that you can’t compare two instances of the xml
data type. Primary keys, foreign keys, and unique constraints all
require that values of the included data types be comparable;
therefore, XML cannot be used in any of those situations. The SQL Server COLLATE clause is meaningless with the xml
data type because SQL Server does not store the XML as text; rather, it
uses a distinct type of encoding particular to XML. Note, however, that
you can designate a DEFAULT value, as in this case, where an empty <Orders /> element will be assigned by default if no value is supplied for OrdersDoc in an INSERT statement.
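A quick way to see the default at work is to insert a row that omits OrdersDoc. This is only a sketch: it assumes the OrdersXML table from Example 2 already exists, and the OrdersId value of 100 is an arbitrary key chosen for the test.

```sql
-- Assumes the OrdersXML table from Example 2; 100 is an arbitrary test key
INSERT INTO OrdersXML (OrdersId) VALUES (100);

-- OrdersDoc was omitted above, so the DEFAULT supplies the empty Orders element
SELECT OrdersId, OrdersDoc FROM OrdersXML WHERE OrdersId = 100;

-- Remove the test row so the table is left as it was
DELETE FROM OrdersXML WHERE OrdersId = 100;
```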
Now get some data into the column. Example 3 takes some simple static XML and inserts it into the OrdersXML table you just created, using the xml data type as a variable.
Example 3. Storing XML in the database.
DECLARE @XmlData AS xml = '
<Orders>
<Order>
<OrderId>5</OrderId>
<CustomerId>60</CustomerId>
<OrderDate>2008-10-10T14:22:27.25-05:00</OrderDate>
<OrderAmount>25.90</OrderAmount>
</Order>
</Orders>'
INSERT INTO OrdersXML (OrdersId, OrdersDoc) VALUES (1, @XmlData)
You can insert data into xml
columns in a variety of other ways: XML Bulk Load, loading from an XML variable (as shown
here), or loading from a SELECT statement using the FOR XML TYPE
feature, which we will discuss shortly. Only well-formed XML (including
fragments) can be inserted—any attempt to insert malformed XML will
result in an exception, as shown in this fragment where there is a
case-sensitivity problem in the end tag (the word Orders is not capitalized, as it is in the start tag):
INSERT INTO OrdersXML (OrdersId, OrdersDoc) VALUES (2, '<Orders></orders>')
The results produce the following error from SQL Server:
Msg 9436, Level 16, State 1, Line 1
XML parsing: line 1, character 17, end tag does not match start tag
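The following sketch contrasts a fragment, which is accepted, with malformed XML, which is rejected. It also shows one possible way to trap the parsing error with TRY...CATCH. The variable names and sample values are hypothetical, and the malformed text is held in a character variable so that the conversion to xml happens at run time inside the TRY block.

```sql
-- A fragment (no single root element) is still acceptable content for the xml type
DECLARE @Fragment xml = '<Order><OrderId>6</OrderId></Order><Order><OrderId>7</OrderId></Order>';
SELECT @Fragment AS Fragment;

-- Hypothetical input held as text; the end tag does not match the start tag
DECLARE @Text nvarchar(max) = N'<Orders></orders>';
DECLARE @Parsed xml;

BEGIN TRY
    -- The conversion to xml is where the parsing error is raised
    SET @Parsed = CAST(@Text AS xml);
END TRY
BEGIN CATCH
    -- Report the parsing error as a result set instead of surfacing it to the caller
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```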