SQL Server supports an enhancement to the T-SQL syntax that enables
normal relational queries to output their result set as XML, using any
of these four approaches:
FOR XML RAW
FOR XML AUTO
FOR XML EXPLICIT
FOR XML PATH
The first three of these options were introduced with the very first XML support in SQL Server 2000. We’ll start with these options and then cover later XML enhancements added in SQL Server 2005, which include the fourth option (FOR XML PATH).
FOR XML RAW produces attribute-based XML. It essentially creates a flat representation of the data in which each row returned becomes an element and the returned columns become the attributes of that element. FOR XML RAW also doesn’t interpret joins in any special way. (Joins become relevant in FOR XML AUTO.) Example 1 shows a simple query that retrieves customer and order header data.
Example 1. Using FOR XML RAW to produce flat, attribute-based XML.
SELECT TOP 10
Customer.CustomerID, OrderHeader.SalesOrderID, OrderHeader.OrderDate
FROM
Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
ORDER BY
Customer.CustomerID
FOR XML RAW
Both SSDT in Visual Studio and SSMS render the query results as a hyperlink that you can click to see the output rendered as properly formatted XML in a color-coded window that supports expanding and collapsing nodes. The query in Example 1 produces the following output:
<row CustomerID="11000" SalesOrderID="43793" OrderDate="2005-07-22T00:00:00" />
<row CustomerID="11000" SalesOrderID="51522" OrderDate="2007-07-22T00:00:00" />
<row CustomerID="11000" SalesOrderID="57418" OrderDate="2007-11-04T00:00:00" />
<row CustomerID="11001" SalesOrderID="43767" OrderDate="2005-07-18T00:00:00" />
<row CustomerID="11001" SalesOrderID="51493" OrderDate="2007-07-20T00:00:00" />
<row CustomerID="11001" SalesOrderID="72773" OrderDate="2008-06-12T00:00:00" />
<row CustomerID="11002" SalesOrderID="43736" OrderDate="2005-07-10T00:00:00" />
<row CustomerID="11002" SalesOrderID="51238" OrderDate="2007-07-04T00:00:00" />
<row CustomerID="11002" SalesOrderID="53237" OrderDate="2007-08-27T00:00:00" />
<row CustomerID="11003" SalesOrderID="43701" OrderDate="2005-07-01T00:00:00" />
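To experiment with FOR XML RAW without the AdventureWorks sample database, a minimal sketch like the following can be used. The derived table named Orders and its hard-coded rows are assumptions made purely for illustration (the values mirror a few rows of the output above); only the FOR XML RAW clause itself comes from Example 1.

```sql
-- Illustrative only: a hard-coded rowset stands in for the
-- Customer/OrderHeader join used in Example 1.
SELECT
  Orders.CustomerID, Orders.SalesOrderID, Orders.OrderDate
 FROM
  (VALUES
    (11000, 43793, CAST('20050722' AS datetime)),
    (11000, 51522, CAST('20070722' AS datetime)),
    (11001, 43767, CAST('20050718' AS datetime))
  ) AS Orders (CustomerID, SalesOrderID, OrderDate)
 ORDER BY
  Orders.CustomerID, Orders.SalesOrderID
 FOR XML RAW  -- each row becomes one <row> element; each column becomes an attribute
```

Each of the three rows should come back as a single row element carrying CustomerID, SalesOrderID, and OrderDate attributes, regardless of how many tables contributed to the rowset.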
As you can see, you get flat results in which each row returned from the query becomes a single element named row
and all columns are output as attributes of that element. Odds are,
however, that you will want more structured XML output, which leads us
to FOR XML AUTO.
FOR XML AUTO
also produces attribute-based XML (by default), but its output is
hierarchical rather than flat—that is, it can create nested results
based on the tables in the query’s join clause. For example, using the same query just demonstrated, you can simply change the FOR XML clause to FOR XML AUTO, as shown in Example 2.
Example 2. Using FOR XML AUTO to produce hierarchical, attribute-based XML.
SELECT TOP 10 -- limits the result rows for demo purposes
Customer.CustomerID, OrderHeader.SalesOrderID, OrderHeader.OrderDate
FROM
Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
ORDER BY
Customer.CustomerID
FOR XML AUTO
Execute this query, click the XML hyperlink in the results, and you will see the following output:
<Customer CustomerID="11000">
<OrderHeader SalesOrderID="43793" OrderDate="2005-07-22T00:00:00" />
<OrderHeader SalesOrderID="51522" OrderDate="2007-07-22T00:00:00" />
<OrderHeader SalesOrderID="57418" OrderDate="2007-11-04T00:00:00" />
</Customer>
<Customer CustomerID="11001">
<OrderHeader SalesOrderID="43767" OrderDate="2005-07-18T00:00:00" />
<OrderHeader SalesOrderID="51493" OrderDate="2007-07-20T00:00:00" />
<OrderHeader SalesOrderID="72773" OrderDate="2008-06-12T00:00:00" />
</Customer>
<Customer CustomerID="11002">
<OrderHeader SalesOrderID="43736" OrderDate="2005-07-10T00:00:00" />
<OrderHeader SalesOrderID="51238" OrderDate="2007-07-04T00:00:00" />
<OrderHeader SalesOrderID="53237" OrderDate="2007-08-27T00:00:00" />
</Customer>
<Customer CustomerID="11003">
<OrderHeader SalesOrderID="43701" OrderDate="2005-07-01T00:00:00" />
</Customer>
As you can see, the XML data has main elements named Customer (based on the alias assigned in the query) and child elements named OrderHeader (again from the alias). Note that FOR XML AUTO determines the element nesting order based on the order of the columns in the SELECT clause. If you reorder the SELECT list so that an OrderHeader column comes before any Customer column, as shown in Example 3, the hierarchy is inverted.
Example 3. Changing the hierarchy returned by FOR XML AUTO by reordering query columns.
SELECT TOP 10
OrderHeader.SalesOrderID, OrderHeader.OrderDate, Customer.CustomerID
FROM
Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
ORDER BY
Customer.CustomerID
FOR XML AUTO
The output (as viewed in the XML viewer) now looks like this:
<OrderHeader SalesOrderID="43793" OrderDate="2005-07-22T00:00:00">
<Customer CustomerID="11000" />
</OrderHeader>
<OrderHeader SalesOrderID="51522" OrderDate="2007-07-22T00:00:00">
<Customer CustomerID="11000" />
</OrderHeader>
<OrderHeader SalesOrderID="57418" OrderDate="2007-11-04T00:00:00">
<Customer CustomerID="11000" />
</OrderHeader>
<OrderHeader SalesOrderID="43767" OrderDate="2005-07-18T00:00:00">
<Customer CustomerID="11001" />
</OrderHeader>
<OrderHeader SalesOrderID="51493" OrderDate="2007-07-20T00:00:00">
<Customer CustomerID="11001" />
</OrderHeader>
<OrderHeader SalesOrderID="72773" OrderDate="2008-06-12T00:00:00">
<Customer CustomerID="11001" />
</OrderHeader>
<OrderHeader SalesOrderID="43736" OrderDate="2005-07-10T00:00:00">
<Customer CustomerID="11002" />
</OrderHeader>
<OrderHeader SalesOrderID="51238" OrderDate="2007-07-04T00:00:00">
<Customer CustomerID="11002" />
</OrderHeader>
<OrderHeader SalesOrderID="53237" OrderDate="2007-08-27T00:00:00">
<Customer CustomerID="11002" />
</OrderHeader>
<OrderHeader SalesOrderID="43701" OrderDate="2005-07-01T00:00:00">
<Customer CustomerID="11003" />
</OrderHeader>
These results are probably not what you wanted. To keep the XML hierarchy matching the table hierarchy, you must list at least one column from the parent table before any column from a child table. If there are three levels of tables, at least one column from the child table must also come before any column from the grandchild table, and so on.
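The column-ordering rule can be sketched for three levels as follows. This is an illustrative query rather than one of the numbered examples; it assumes the AdventureWorks tables used elsewhere in this section, and the alias OrderDetail and the choice of detail columns are assumptions.

```sql
-- Sketch: columns are listed parent first, then child, then grandchild,
-- so FOR XML AUTO nests Customer > OrderHeader > OrderDetail.
SELECT
  Customer.CustomerID,                              -- parent level
  OrderHeader.SalesOrderID, OrderHeader.OrderDate,  -- child level
  OrderDetail.ProductID, OrderDetail.OrderQty       -- grandchild level
 FROM
  Sales.Customer AS Customer
  INNER JOIN Sales.SalesOrderHeader AS OrderHeader
   ON OrderHeader.CustomerID = Customer.CustomerID
  INNER JOIN Sales.SalesOrderDetail AS OrderDetail
   ON OrderDetail.SalesOrderID = OrderHeader.SalesOrderID
 WHERE
  Customer.CustomerID IN (11077, 11078)
 ORDER BY
  -- keeps all rows for a customer, and for an order, adjacent to each other
  Customer.CustomerID, OrderHeader.SalesOrderID
 FOR XML AUTO
```

The ORDER BY clause matters here as well: FOR XML AUTO compares adjacent rows to decide when to open a new parent element, so rows belonging to the same customer and order should arrive together.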
FOR XML EXPLICIT is the most complex but also the most powerful and flexible of the three original FOR XML options. We cover it now for completeness, but recommend using the simpler FOR XML PATH feature added in SQL Server 2005 (covered shortly). As you’ll see, FOR XML PATH can shape query results into virtually any desired XML with much less effort than using FOR XML EXPLICIT.
With FOR XML EXPLICIT, SQL Server constructs XML based on a UNION query of the various levels of output elements. So, if again you have the Customer and SalesOrderHeader tables and you want to produce XML output, you must have two SELECT statements with a UNION. If you add the SalesOrderDetail table, you must add another UNION statement and SELECT statement.
As we said, FOR XML EXPLICIT is more complex than FOR XML RAW and FOR XML AUTO. For starters, you are responsible for defining two additional columns that establish the hierarchical relationship of the XML: a Tag column that identifies the nesting level of each row and a Parent column that links child rows to the Tag value of their parent level (similar to EmployeeID and ManagerID). You must also alias all data columns in the form ElementName!TagNumber!AttributeName to indicate the element, Tag, and display name for the XML output, as shown in Example 4. Keep in mind that only the first SELECT statement must follow these rules; any aliases in subsequent SELECT statements in a UNION query are ignored.
Example 4. Shaping hierarchical XML using FOR XML EXPLICIT.
SELECT
1 AS Tag, -- Tag this resultset as level 1
NULL AS Parent, -- Level 1 has no parent
CustomerID AS [Customer!1!CustomerID], -- level 1 value
NULL AS [SalesOrder!2!SalesOrderID], -- level 2 value
NULL AS [SalesOrder!2!OrderDate] -- level 2 value
FROM Sales.Customer AS Customer
WHERE Customer.CustomerID IN(11077, 11078)
UNION ALL
SELECT
2, -- Tag this resultset as level 2
1, -- Link to parent at level 1
Customer.CustomerID,
OrderHeader.SalesOrderID,
OrderHeader.OrderDate
FROM Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
WHERE Customer.CustomerID IN(11077, 11078)
ORDER BY
[Customer!1!CustomerID], [SalesOrder!2!SalesOrderID]
FOR XML EXPLICIT
Execute this query and click the XML hyperlink to see the following output:
<Customer CustomerID="11077">
<SalesOrder SalesOrderID="44407" OrderDate="2005-10-16T00:00:00" />
<SalesOrder SalesOrderID="51651" OrderDate="2007-07-29T00:00:00" />
<SalesOrder SalesOrderID="60042" OrderDate="2007-12-14T00:00:00" />
</Customer>
<Customer CustomerID="11078">
<SalesOrder SalesOrderID="52789" OrderDate="2007-08-19T00:00:00" />
<SalesOrder SalesOrderID="53993" OrderDate="2007-09-08T00:00:00" />
<SalesOrder SalesOrderID="54214" OrderDate="2007-09-12T00:00:00" />
<SalesOrder SalesOrderID="54268" OrderDate="2007-09-13T00:00:00" />
<SalesOrder SalesOrderID="56449" OrderDate="2007-10-21T00:00:00" />
<SalesOrder SalesOrderID="57281" OrderDate="2007-11-02T00:00:00" />
<SalesOrder SalesOrderID="57969" OrderDate="2007-11-15T00:00:00" />
<SalesOrder SalesOrderID="58429" OrderDate="2007-11-23T00:00:00" />
<SalesOrder SalesOrderID="58490" OrderDate="2007-11-24T00:00:00" />
<SalesOrder SalesOrderID="61443" OrderDate="2008-01-04T00:00:00" />
<SalesOrder SalesOrderID="62245" OrderDate="2008-01-17T00:00:00" />
<SalesOrder SalesOrderID="62413" OrderDate="2008-01-20T00:00:00" />
<SalesOrder SalesOrderID="67668" OrderDate="2008-04-05T00:00:00" />
<SalesOrder SalesOrderID="68285" OrderDate="2008-04-15T00:00:00" />
<SalesOrder SalesOrderID="68288" OrderDate="2008-04-15T00:00:00" />
<SalesOrder SalesOrderID="73869" OrderDate="2008-06-27T00:00:00" />
<SalesOrder SalesOrderID="75084" OrderDate="2008-07-31T00:00:00" />
</Customer>
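It can help to look at the intermediate rowset that FOR XML EXPLICIT consumes. One way to do that, sketched here, is to run the Example 4 query with the final FOR XML EXPLICIT clause removed; nothing else is changed.

```sql
-- Example 4 without the FOR XML EXPLICIT clause: returns the plain rowset
-- that SQL Server would otherwise shape into XML.
SELECT
  1 AS Tag,                                    -- level 1 rows (Customer)
  NULL AS Parent,                              -- level 1 has no parent
  CustomerID AS [Customer!1!CustomerID],
  NULL AS [SalesOrder!2!SalesOrderID],         -- placeholder for level 2
  NULL AS [SalesOrder!2!OrderDate]             -- placeholder for level 2
 FROM Sales.Customer AS Customer
 WHERE Customer.CustomerID IN(11077, 11078)
UNION ALL
SELECT
  2,                                           -- level 2 rows (SalesOrder)
  1,                                           -- parent is level 1
  Customer.CustomerID,
  OrderHeader.SalesOrderID,
  OrderHeader.OrderDate
 FROM Sales.Customer AS Customer
  INNER JOIN Sales.SalesOrderHeader AS OrderHeader
   ON OrderHeader.CustomerID = Customer.CustomerID
 WHERE Customer.CustomerID IN(11077, 11078)
ORDER BY
  [Customer!1!CustomerID], [SalesOrder!2!SalesOrderID]
```

In the resulting grid, each customer's Tag 1 row has NULL in the order columns and therefore sorts ahead of that customer's Tag 2 rows, because NULL sorts first in an ascending ORDER BY. That row order is what lets each SalesOrder element land inside the correct Customer element. With the FOR XML EXPLICIT clause restored, the query again returns the XML shown above.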
This result resembles the output generated by the FOR XML AUTO sample in Example 2. So what is gained by composing a more complex query with FOR XML EXPLICIT? Well, FOR XML EXPLICIT allows for some alternative outputs that are not achievable using FOR XML AUTO. For example, you can specify that certain values be composed as elements instead of attributes by appending !ELEMENT to the end of the aliased column, as shown in Example 5.
Example 5. Using !ELEMENT to customize the hierarchical XML generated by FOR XML EXPLICIT.
SELECT
1 AS Tag, -- Tag this resultset as level 1
NULL AS Parent, -- Level 1 has no parent
CustomerID AS [Customer!1!CustomerID], -- level 1 value
NULL AS [SalesOrder!2!SalesOrderID], -- level 2 value
NULL AS [SalesOrder!2!OrderDate!ELEMENT] -- level 2 value rendered as an element
FROM Sales.Customer AS Customer
WHERE Customer.CustomerID IN(11077, 11078)
UNION ALL
SELECT
2, -- Tag this resultset as level 2
1, -- Link to parent at level 1
Customer.CustomerID,
OrderHeader.SalesOrderID,
OrderHeader.OrderDate
FROM Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
WHERE Customer.CustomerID IN(11077, 11078)
ORDER BY
[Customer!1!CustomerID], [SalesOrder!2!SalesOrderID]
FOR XML EXPLICIT
Only one minor change was made (the OrderDate column alias has !ELEMENT appended to the end of it). Aliasing a column with !ELEMENT in a FOR XML EXPLICIT query results in that column being rendered as an element instead of an attribute, as shown here:
<Customer CustomerID="11077">
<SalesOrder SalesOrderID="44407">
<OrderDate>2005-10-16T00:00:00</OrderDate>
</SalesOrder>
<SalesOrder SalesOrderID="51651">
<OrderDate>2007-07-29T00:00:00</OrderDate>
</SalesOrder>
<SalesOrder SalesOrderID="60042">
<OrderDate>2007-12-14T00:00:00</OrderDate>
</SalesOrder>
</Customer>
<Customer CustomerID="11078">
<SalesOrder SalesOrderID="52789">
<OrderDate>2007-08-19T00:00:00</OrderDate>
</SalesOrder>
<SalesOrder SalesOrderID="53993">
<OrderDate>2007-09-08T00:00:00</OrderDate>
</SalesOrder>
<SalesOrder SalesOrderID="54214">
<OrderDate>2007-09-12T00:00:00</OrderDate>
</SalesOrder>
:
Notice that the OrderDate is now being rendered as a child element of the SalesOrder element. Thus, FOR XML EXPLICIT
mode enables greater customization, but it also requires creating
complex queries to achieve custom results. For example, to add a few
more fields from OrderHeader and to add some additional fields from OrderDetail (a third hierarchical table), you would have to write the query as shown in Example 6.
Example 6. Using FOR XML EXPLICIT to produce three-level hierarchical XML order data.
SELECT
1 AS Tag,
NULL AS Parent,
CustomerID AS [Customer!1!CustomerID],
NULL AS [SalesOrder!2!SalesOrderID],
NULL AS [SalesOrder!2!TotalDue],
NULL AS [SalesOrder!2!OrderDate!ELEMENT],
NULL AS [SalesOrder!2!ShipDate!ELEMENT],
NULL AS [SalesDetail!3!ProductID],
NULL AS [SalesDetail!3!OrderQty],
NULL AS [SalesDetail!3!LineTotal]
FROM Sales.Customer AS Customer
WHERE Customer.CustomerID IN(11077, 11078)
UNION ALL
SELECT
2,
1,
Customer.CustomerID,
OrderHeader.SalesOrderID,
OrderHeader.TotalDue,
OrderHeader.OrderDate,
OrderHeader.ShipDate,
NULL,
NULL,
NULL
FROM Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
WHERE Customer.CustomerID IN(11077, 11078)
UNION ALL
SELECT
3,
2,
Customer.CustomerID,
OrderHeader.SalesOrderID,
OrderHeader.TotalDue,
OrderHeader.OrderDate,
OrderHeader.ShipDate,
OrderDetail.ProductID,
OrderDetail.OrderQty,
OrderDetail.LineTotal
FROM Sales.Customer AS Customer
INNER JOIN Sales.SalesOrderHeader AS OrderHeader
ON OrderHeader.CustomerID = Customer.CustomerID
INNER JOIN Sales.SalesOrderDetail AS OrderDetail
ON OrderDetail.SalesOrderID = OrderHeader.SalesOrderID
WHERE Customer.CustomerID IN(11077, 11078)
ORDER BY [Customer!1!CustomerID], [SalesOrder!2!SalesOrderID], Tag -- Tag keeps each order row ahead of its detail rows
FOR XML EXPLICIT
This query produces the following XML:
<Customer CustomerID="11077">
<SalesOrder SalesOrderID="44407" TotalDue="3729.3640">
<OrderDate>2005-10-16T00:00:00</OrderDate>
<ShipDate>2005-10-23T00:00:00</ShipDate>
<SalesDetail ProductID="778" OrderQty="1" LineTotal="3374.990000" />
</SalesOrder>
<SalesOrder SalesOrderID="51651" TotalDue="2624.3529">
<OrderDate>2007-07-29T00:00:00</OrderDate>
<ShipDate>2007-08-05T00:00:00</ShipDate>
<SalesDetail ProductID="781" OrderQty="1" LineTotal="2319.990000" />
<SalesDetail ProductID="880" OrderQty="1" LineTotal="54.990000" />
</SalesOrder>
<SalesOrder SalesOrderID="60042" TotalDue="2673.0613">
<OrderDate>2007-12-14T00:00:00</OrderDate>
<ShipDate>2007-12-21T00:00:00</ShipDate>
<SalesDetail ProductID="969" OrderQty="1" LineTotal="2384.070000" />
<SalesDetail ProductID="707" OrderQty="1" LineTotal="34.990000" />
</SalesOrder>
</Customer>
<Customer CustomerID="11078">
<SalesOrder SalesOrderID="52789" TotalDue="71.2394">
<OrderDate>2007-08-19T00:00:00</OrderDate>
<ShipDate>2007-08-26T00:00:00</ShipDate>
<SalesDetail ProductID="923" OrderQty="1" LineTotal="4.990000" />
<SalesDetail ProductID="707" OrderQty="1" LineTotal="34.990000" />
<SalesDetail ProductID="860" OrderQty="1" LineTotal="24.490000" />
</SalesOrder>
:
As you can see, the code has become quite complex, and it will become even more complex as you add more data to the output. Although this query is perfectly valid, the same result can be achieved with far less effort using FOR XML PATH.