XQJ Part V - Serializing query results


August 23, 2007

The XQuery 1.0 specification consists of multiple documents, one of which is XSLT 2.0 and XQuery 1.0 Serialization. Given a data model instance, it defines how to serialize it into a sequence of octets. As a typical use case, it provides guidelines on how to write query results to a file using XML syntax.

Serialization defines a number of parameters that influence this process. The specification includes a detailed description of each of these parameters, and we'll explain some of them through examples later on. The parameters are:

  • byte-order-mark
  • cdata-section-elements
  • doctype-public
  • doctype-system
  • encoding
  • escape-uri-attributes
  • include-content-type
  • indent
  • media-type
  • method
  • normalization-form
  • omit-xml-declaration
  • standalone
  • undeclare-prefixes
  • use-character-maps
  • version
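Because the parameter names are plain strings, a misspelled name is easy to miss. The following is a minimal sketch of a sanity check, not part of XQJ: the class `SerializationParameterNames` and the helper `unknownNames` are hypothetical names introduced here, and an implementation may accept additional vendor-specific parameters that this check would flag.

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class SerializationParameterNames {
    // The sixteen parameter names defined by XSLT 2.0 and XQuery 1.0 Serialization.
    public static final Set<String> KNOWN = Set.of(
        "byte-order-mark", "cdata-section-elements", "doctype-public", "doctype-system",
        "encoding", "escape-uri-attributes", "include-content-type", "indent",
        "media-type", "method", "normalization-form", "omit-xml-declaration",
        "standalone", "undeclare-prefixes", "use-character-maps", "version");

    // Hypothetical helper: returns the keys that are not standard parameter names, sorted.
    public static Set<String> unknownNames(Properties props) {
        Set<String> unknown = new TreeSet<>(props.stringPropertyNames());
        unknown.removeAll(KNOWN);
        return unknown;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("indent", "yes");
        props.setProperty("omit-xml-decl", "no"); // misspelled on purpose
        System.out.println(unknownNames(props)); // prints [omit-xml-decl]
    }
}
```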

Note that serialization is an optional feature in XQuery. XQJ, however, is stricter and requires every implementation to support serialization. XQJ does not require every parameter defined in the XQuery Serialization specification to be supported to its full extent, but an implementation must at least document a default value for each parameter and behave in conformance with the specification. For DataDirect XQuery all parameters are documented here.

Suppose you want to serialize your query results to a file. This is fairly simple, as the next example shows:

[cc lang="java"]...
XQExpression xqe;
XQSequence xqs;
xqe = xqc.createExpression();
xqs = xqe.executeQuery(
    "doc('orders.xml')/*/ORDERS[O_ORDERKEY = '39']");
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.xml"),
    new Properties());
...
[/cc]

Note that the second argument of writeSequence() is an empty Properties object. You can also specify null. Both an empty Properties object and null imply that the XQJ driver uses the default value for each of the serialization parameters.

You might get something like the following (assume it is all on one line):

[cc lang="xquery"]398177O 307811.89 1996-09-20T00:00:003-MEDIUM Clerk#000000659 0furiously unusual pinto beans above the furiously ironic asymptot [/cc]

That is not really readable; some indentation would help. It's also good practice to add an XML declaration that includes the encoding. Suppose we want to encode the XML file as UTF-16:

[cc lang="java"]...
Properties serializationProps = new java.util.Properties();
// make sure we output xml
serializationProps.setProperty("method", "xml");
// pretty printing
serializationProps.setProperty("indent", "yes");
// serialize as UTF-16
serializationProps.setProperty("encoding", "UTF-16");
// want an XML declaration
serializationProps.setProperty("omit-xml-declaration", "no");
XQExpression xqe;
XQSequence xqs;
xqe = xqc.createExpression();
xqs = xqe.executeQuery(
    "doc('orders.xml')/*/ORDERS[O_ORDERKEY = '39']");
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.xml"),
    serializationProps);
...
[/cc]

What we get now is much better:

[cc lang="xquery"]

39 8177 O 307811.89 1996-09-20T00:00:00 3-MEDIUM Clerk#000000659 0 furiously unusual pinto beans above the furiously ironic asymptot [/cc]
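To try these four parameters without an XQJ driver at hand, here is a sketch that uses the JDK's built-in JAXP identity Transformer instead of XQJ. It is a swapped-in technique, chosen only because JAXP accepts the same parameter names through `OutputKeys`. The class name, the helper `serializeUtf16`, and the input element names are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class JaxpSerializationSketch {
    // Serializes XML text to bytes using method, indent, encoding and omit-xml-declaration.
    public static byte[] serializeUtf16(String xml) {
        try {
            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder input; these element names are not taken from orders.xml.
        byte[] bytes = serializeUtf16("<order><key>39</key><status>O</status></order>");
        // The bytes are UTF-16, so decode them with the same charset before printing.
        System.out.println(new String(bytes, StandardCharsets.UTF_16));
    }
}
```

The exact indentation depends on the JDK version, so the sketch makes no claim about how many spaces are used.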

Note that during serialization, characters are escaped as needed for the specified encoding. Suppose a query returns a document containing a registered trademark character, and the specified encoding is US-ASCII:

[cc lang="java"]...
Properties serializationProps = new java.util.Properties();
serializationProps.setProperty("method", "xml");
serializationProps.setProperty("encoding", "ASCII");
XQExpression xqe;
XQSequence xqs;
xqe = xqc.createExpression();
xqs = xqe.executeQuery(
    "DataDirect XQuery®");
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.xml"),
    serializationProps);
...
[/cc]

You'll get the following. Note that the ® character is serialized as a character reference because it is not defined in the ASCII character set:

[cc lang="java"]DataDirect XQuery&#xae;[/cc]
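The escaping rule can be illustrated in plain Java. This is only a sketch of the idea, not how any XQJ driver is implemented: the hypothetical helper `escapeUnencodable` asks a `CharsetEncoder` whether each code point is representable and writes a hexadecimal character reference when it is not. A real serializer additionally escapes markup characters such as < and &, which this sketch leaves alone.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class CharRefEscaper {
    // Replaces every code point the charset cannot encode with a character reference.
    public static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        // Iterate by code point so that surrogate pairs are treated as one character.
        text.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            if (encoder.canEncode(s)) {
                out.append(s);
            } else {
                out.append("&#x").append(Integer.toHexString(cp)).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00AE is the registered trademark sign.
        System.out.println(escapeUnencodable("DataDirect XQuery\u00AE", StandardCharsets.US_ASCII));
        // prints DataDirect XQuery&#xae;
    }
}
```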

In some use cases, the cdata-section-elements parameter is useful. Suppose you're serializing XML elements whose content includes ampersand characters. By default the & characters are escaped; using CDATA sections instead might be preferable to make the XML file more readable for humans.

[cc lang="java"]...
Properties serializationProps = new java.util.Properties();
serializationProps.setProperty("method", "xml");
serializationProps.setProperty("cdata-section-elements", "product");
XQExpression xqe;
XQSequence xqs;
xqe = xqc.createExpression();
xqs = xqe.executeQuery(
    "<product>DataDirect XQuery &amp; XML Converters</product>");
// pass the properties; null would fall back to the defaults
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.xml"),
    serializationProps);
...
[/cc]

The result is serialized as follows:

[cc lang="java"]<product><![CDATA[DataDirect XQuery & XML Converters]]></product>[/cc]
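The effect of cdata-section-elements can also be tried with the JDK alone. The sketch below swaps in the JAXP identity Transformer for XQJ, since `OutputKeys.CDATA_SECTION_ELEMENTS` corresponds to the same serialization parameter. Note one difference: JAXP expects a whitespace-separated list of names. The class name, the helper `serializeWithCdata`, and the product element in the input are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CdataSectionSketch {
    // Serializes XML text, writing the text content of the named elements as CDATA.
    public static String serializeWithCdata(String xml, String cdataElements) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, cdataElements);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<product>DataDirect XQuery &amp; XML Converters</product>";
        // With "product" listed, the ampersand should appear unescaped inside a CDATA section.
        System.out.println(serializeWithCdata(input, "product"));
    }
}
```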

Note that multiple elements can be specified through the cdata-section-elements parameter, separated by semicolons. If an element is in a namespace, add the namespace URI using James Clark's notation, that is, "{" + namespace URI + "}" + local name, for example {uri}product:

[cc lang="java"]...
Properties serializationProps = new java.util.Properties();
serializationProps.setProperty("method", "xml");
serializationProps.setProperty("encoding", "UTF-8");
serializationProps.setProperty("omit-xml-declaration", "no");
serializationProps.setProperty("cdata-section-elements", "product;{uri}product");
XQExpression xqe;
XQSequence xqs;
xqe = xqc.createExpression();
xqs = xqe.executeQuery(
    " " +
    " DataDirect XQuery & XML Converters" +
    " DataDirect XQuery & XML Converters" +
    "");
// pass the properties; null would fall back to the defaults
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.xml"),
    serializationProps);
...
[/cc]

This yields the following result:

[cc lang="java"]

[/cc]
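Building the parameter value by hand invites typos in the braces. As a small sketch, `javax.xml.namespace.QName` from the JDK already prints names in this notation, so the value can be assembled from QName objects. The class `CdataElementList` and the helper `toParameterValue` are hypothetical names introduced for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import javax.xml.namespace.QName;

class CdataElementList {
    // Joins the names with semicolons; QName.toString() yields {uri}localname,
    // or just localname when the element is in no namespace.
    public static String toParameterValue(List<QName> names) {
        return names.stream()
                    .map(QName::toString)
                    .collect(Collectors.joining(";"));
    }

    public static void main(String[] args) {
        List<QName> names = List.of(
            new QName("product"),          // no namespace
            new QName("uri", "product"));  // namespace URI "uri"
        System.out.println(toParameterValue(names)); // prints product;{uri}product
    }
}
```

Going the other way, `QName.valueOf("{uri}product")` parses a name written in this notation.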

In addition to the XML output method, XQuery serialization defines other output methods such as HTML and XHTML. Note that these serialization methods will not "automagically" produce (X)HTML; it is still the query's responsibility to produce results that conform to (X)HTML. The serializer will, however, apply the (X)HTML rules when writing the results. For example, with the HTML output method an empty element such as br is serialized without a closing tag. Note the difference between the following result.xml and result.html:

[cc lang="java"]...
Properties serializationProps = new java.util.Properties();
XQPreparedExpression xqpe = xqc.createPreparedExpression(
    "line1 line2");
// first run: serialize with the xml method
XQSequence xqs = xqpe.executeQuery();
serializationProps.setProperty("method", "xml");
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.xml"),
    serializationProps);
// second run: reuse the variable and serialize with the html method
xqs = xqpe.executeQuery();
serializationProps.setProperty("method", "html");
xqs.writeSequence(
    new FileOutputStream("/home/jimmy/result.html"),
    serializationProps);
...
[/cc]

result.xml looks as follows:

[cc lang="java"]line1 line2[/cc]

whereas result.html looks as follows:

[cc lang="java"]line1 line2[/cc]
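The difference between the two output methods is easy to reproduce with the JDK's JAXP Transformer, used here in place of XQJ because it honors the same method parameter. This is a sketch under assumptions: the input markup (a p element containing a br element) is made up for illustration, as are the class name and the helper `serialize`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputMethodSketch {
    // Serializes the same input with the given output method ("xml" or "html").
    public static String serialize(String xml, String method) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, method);
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<p>line1<br/>line2</p>"; // assumed markup
        System.out.println(serialize(input, "xml"));  // br is written as an empty-element tag
        System.out.println(serialize(input, "html")); // br is written as a start tag only
    }
}
```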

If you're interested in all the details about (X)HTML serialization, look here for HTML and here for XHTML.

In all previous examples, we've serialized the query results to a FileOutputStream. An XQSequence can also be serialized to a java.io.Writer using the writeSequence() method, and getSequenceAsString() serializes it to a java.lang.String.

Similar to serializing the complete XQSequence, there are methods to serialize the current item in the XQSequence. In the following example, the items in the query result are saved into individual files, result1.xml, result2.xml, and so on.

[cc lang="java"]...
Properties serializationProps = new java.util.Properties();
serializationProps.setProperty("method", "xml");
serializationProps.setProperty("indent", "yes");
serializationProps.setProperty("encoding", "UTF-8");
serializationProps.setProperty("omit-xml-declaration", "no");
XQExpression xqe;
XQSequence xqs;
xqe = xqc.createExpression();
xqs = xqe.executeQuery("doc('orders.xml')/*/ORDERS");
int i = 1;
while (xqs.next()) {
    FileOutputStream file;
    file = new FileOutputStream("/home/jimmy/result" + i + ".xml");
    xqs.writeItem(file, serializationProps);
    file.close();
    // move on to the next file name
    i++;
}
...
[/cc]
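The file handling in such a loop is worth getting right independently of XQJ. The sketch below shows the same one-file-per-item pattern with try-with-resources, so that each stream is closed even if a write fails. Plain strings stand in for the items, and the class `PerItemFiles` and the helper `writeEach` are hypothetical names; with XQJ, the write call would be replaced by writeItem().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class PerItemFiles {
    // Writes the first item to result1.xml, the second to result2.xml, and so on.
    // Returns the number of files written.
    public static int writeEach(Path dir, List<String> items) {
        int i = 1;
        for (String item : items) {
            Path target = dir.resolve("result" + i + ".xml");
            try (OutputStream out = Files.newOutputStream(target)) {
                // Stand-in for xqs.writeItem(out, serializationProps).
                out.write(item.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            i++;
        }
        return i - 1;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("xqj-items");
        int count = writeEach(dir, List.of("<a/>", "<b/>", "<c/>"));
        System.out.println(count + " files written to " + dir);
    }
}
```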

To conclude this post, note that XML serialization doesn’t always result in a well-formed XML document. More precisely it is either a well-formed XML document or a well-formed XML external general parsed entity. This is further explained in the serialization specification.
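A quick way to see the difference is to feed serialized output to an XML parser. In this sketch, which uses only the JDK's DOM parser, a sequence of two elements is rejected as a document, while the same content wrapped in a single root element parses fine. The class name and the helper `isWellFormedDocument` are hypothetical, introduced for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormednessCheck {
    // Returns true if the text parses as a complete, well-formed XML document.
    public static boolean isWellFormedDocument(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed as a document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormedDocument("<a/><b/>"));        // prints false
        System.out.println(isWellFormedDocument("<r><a/><b/></r>")); // prints true
    }
}
```

If the result has to be a document, one option is to have the query itself wrap its results in a single root element.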

In the next post, we'll talk about manipulating the XQuery static context through the XQJ API.


Marc Van Cappellen
