Shredding XML Into Relational Rows
By Yuli Vasiliev
This article reveals how you might use SQL/XML functions in your SQL statements to access XML data, shredding it into relational rows and then joining them with regular relational data stored in an Oracle database. In particular, it explains how you might use SQL/XML function XMLTable to derive relational data from XML. The examples provided in this article assume you have Oracle database demonstration schemas installed. In particular, you’ll need the OE and HR demonstration database schemas.
Let’s start with an example in which the XMLTable SQL/XML function is used to query the PurchaseOrder XML schema-based XMLType table from the OE demonstration database schema, referring to an XMLTable construct without the COLUMNS clause. In the XQuery expression passed to XMLTable in the example below, you apply XPath expression /PurchaseOrder/Reference to the XML document processed, retrieving only those Reference elements that belong to the /PurchaseOrder elements whose /PurchaseOrder/User element’s value is AWALSH
SELECT ttab.COLUMN_VALUE AS PO FROM purchaseorder,
XMLTable(
‘for $i in /PurchaseOrder
where $i/User = “AWALSH”
return
<PO>
{$i/Reference}
</PO>’
PASSING OBJECT_VALUE
) ttab;
Not using the COLUMNS clause in XMLTable is the key point here. As a result, the XQuery expression simply outputs a <PO><Reference> XML element for each purchase order requested by user AWALSH.
Another interesting point to note here is the use of the OBJECT_VALUE pseudocolumn in the PASSING clause of the XMLTable function to pass the purchaseorder table as context item to the XQuery expression. To access the produced XML, you use the COLUMN_VALUE pseudocolumn in the SELECT list. As a result, you should have the following output comprised of XML fragments:
PO
——————————————————————————–
<PO><Reference>AWALSH-2002100912333570PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123337203PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123337303PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123337954PDT</Reference></PO>
<PO><Reference>AWALSH-2002100912333844PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123337483PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123336642PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123335871PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123335911PDT</Reference></PO>
<PO><Reference>AWALSH-20021009123336101PDT</Reference></PO>
10 rows selected.
In practice however, you most likely will want to have the data you need extracted from XML tags, looking more like relational data. This is where the COLUMNS clause comes into play. So, the following example illustrates how you can use SQL/XML function XMLTable to query XMLType data, decomposing the resulting XML into relational data:
SELECT ttab.reference AS PO FROM purchaseorder,
XMLTable(‘for $i in /PurchaseOrder
where $i/User = “AWALSH”
return
$i’ PASSING OBJECT_VALUE
COLUMNS reference VARCHAR2(32) PATH ‘/PurchaseOrder/Reference’
) ttab;
Now the output comes in a more relational friendly format:
PO
—————————-
AWALSH-2002100912333570PDT
AWALSH-20021009123337203PDT
AWALSH-20021009123337303PDT
AWALSH-20021009123337954PDT
AWALSH-2002100912333844PDT
AWALSH-20021009123337483PDT
AWALSH-20021009123336642PDT
AWALSH-20021009123335871PDT
AWALSH-20021009123335911PDT
AWALSH-20021009123336101PDT
10 rows selected.
Now that you have XML data shredded into relational rows, you can join it with any other relational data. Suppose you want to output each purchase order you had in the preceding query, along with the full name of the user. This can be done by joining the hr.employees table with the virtual table generated by XMLTable, as follows:
SELECT emp.first_name, emp.last_name,ttab.reference FROM oe.purchaseorder,
XMLTable(‘for $i in /PurchaseOrder
where $i/User = “AWALSH”
return
$i’ PASSING OBJECT_VALUE
COLUMNS reference VARCHAR2(32) PATH ‘/PurchaseOrder/Reference’
) ttab, hr.employees emp
WHERE ttab.reference LIKE emp.email ||’%’;
The output should look like this:
FIRST_NAME LAST_NAME REFERENCE
——————– ————————- —————————-
Alana Walsh AWALSH-2002100912333570PDT
Alana Walsh AWALSH-20021009123337203PDT
Alana Walsh AWALSH-20021009123337303PDT
Alana Walsh AWALSH-20021009123337954PDT
Alana Walsh AWALSH-2002100912333844PDT
Alana Walsh AWALSH-20021009123337483PDT
Alana Walsh AWALSH-20021009123336642PDT
Alana Walsh AWALSH-20021009123335871PDT
Alana Walsh AWALSH-20021009123335911PDT
Alana Walsh AWALSH-20021009123336101PDT
10 rows selected.
About the Author
Yuli Vasiliev is a software developer, freelance author, and consultant currently specializing in open source development, Java technologies, business intelligence (BI), databases, and service-oriented architecture (SOA). He is the author of Oracle Business Intelligence: The Condensed Guide to Analysis and Reporting (Packt, 2010) and PHP Oracle Web Development (Packt, 2007) as well as a series of other books on the Oracle technology.