Parsing XML Stored as Character Data



If your database does not support the XML type and you store XML documents as character data, you must parse that character data into XML before you can query it. One way to do this is to call a Java external function from within the query. The following Java external function parses its input string and returns the resulting DOM tree, or null if parsing fails:

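 // Imports required by the function
 import java.io.IOException;
 import java.io.StringReader;
 import javax.xml.parsers.DocumentBuilder;
 import javax.xml.parsers.DocumentBuilderFactory;
 import javax.xml.parsers.ParserConfigurationException;
 import org.w3c.dom.Document;
 import org.xml.sax.InputSource;
 import org.xml.sax.SAXException;

 // Parses XML text into a DOM tree; returns null if parsing fails.
 public static Document txt2xml(String txt) {
   DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
   DocumentBuilder builder;
   try {
     builder = factory.newDocumentBuilder();
   }
   catch (ParserConfigurationException e) {
     // The parser could not be configured
     e.printStackTrace();
     return null;
   }
   try {
     // Wrap the string in a reader so that the parser can consume it
     return builder.parse(new InputSource(new StringReader(txt)));
   }
   catch (SAXException e1) {
     // The input is not well-formed XML
     e1.printStackTrace();
     return null;
   }
   catch (IOException e1) {
     e1.printStackTrace();
     return null;
   }
 }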

Using DataDirect XQuery®, you can call this function by declaring it in the prolog and using it in a query. For example:

declare namespace p='ddtekjava:txt2xml';
declare function p:txt2xml($inp as xs:string) as document-node() external;
for $row in collection('HOLDINGSXML')/HOLDINGSXML
return 
 p:txt2xml($row/XMLCOL)/HOLDINGS/SHARE[@COMPANY='Amazon.com, Inc.']
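
Before calling the function from a query, it can be useful to exercise it on its own. The following is a minimal sketch, not part of DataDirect XQuery: the class name Txt2XmlCheck, the method countAmazonShares, and the sample document are hypothetical, and the standard javax.xml.xpath API stands in for the query's path expression so that the check runs without a database.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical test harness; the class and method names are assumptions.
class Txt2XmlCheck {

    // Same parsing approach as the external function shown earlier:
    // returns a DOM tree, or null if the text cannot be parsed.
    static Document txt2xml(String txt) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(txt)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
            return null;
        }
    }

    // Counts the SHARE elements selected by the same path expression
    // that the query applies to each parsed column value.
    static int countAmazonShares(String txt) {
        Document doc = txt2xml(txt);
        if (doc == null) {
            return 0;
        }
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList shares = (NodeList) xpath.evaluate(
                "/HOLDINGS/SHARE[@COMPANY='Amazon.com, Inc.']",
                doc, XPathConstants.NODESET);
            return shares.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample standing in for one XMLCOL value.
        String sample =
            "<HOLDINGS>"
            + "<SHARE COMPANY='Amazon.com, Inc.'>100</SHARE>"
            + "<SHARE COMPANY='Other Corp.'>5</SHARE>"
            + "</HOLDINGS>";
        System.out.println(countAmazonShares(sample));
    }
}
```

Text that is not well-formed XML makes txt2xml return null, so the sketch reports zero matches for it rather than failing. How a null result is handled when the function is called from a query is not covered by this sketch.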