Controlling Output Options

January 05, 2012 Data Platform

In the XML world, MarkLogic natively supports both XQuery and XSLT, and both languages use the same data model (the XPath data model, or XDM). In this model, XML is represented as an abstract tree of nodes. Because the model is abstract, certain details of the XML are not included, such as a document’s encoding, its DOCTYPE declaration, and whether attribute values are delimited with quotes or apostrophes. But when it comes time to output the result of your program (XML or, sometimes, HTML), the resulting tree needs to be serialized, and all those gritty details about how to represent the XPath tree as a stream of bytes need to be resolved. How do you control (at least some of) these details?

In XSLT, you have some control right at the language level (assuming your processor is the one responsible for serializing the result, as MarkLogic is). You can use the <xsl:output> element to control the output. The following output declaration tells the XSLT processor to output its result in ASCII encoding, with extra indentation for better readability, using the HTML “output method” (e.g., br elements appear as <br> instead of <br/> or <br></br>):

<xsl:output encoding="us-ascii" method="html" indent="yes"/>
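To show where this declaration lives, here is a minimal sketch of a complete stylesheet. Only the xsl:output line comes from the example above; the template and the sample HTML content are hypothetical and just there to give the serializer something to work on.

```xslt
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialization settings: a top-level declaration, outside any template -->
  <xsl:output encoding="us-ascii" method="html" indent="yes"/>

  <!-- Hypothetical template: builds a small HTML page for any input document -->
  <xsl:template match="/">
    <html>
      <head><title>Output options</title></head>
      <body>
        <p>First line<br/>Second line</p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

With the HTML output method, the br element in the result should be serialized as <br>, and because the encoding is us-ascii, any non-ASCII characters in the result are written as character or entity references rather than as raw bytes.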

The XSLT 2.0 spec lists the full range of output options. But what about XQuery? As it turns out, even though they use the same data model and serialization concerns are equally well-defined for both languages, XQuery doesn’t include built-in language support for controlling these options. Fortunately, XQuery provides a generic extension mechanism for declaring processor-specific options, and even more fortunately, MarkLogic provides the exact options you need. Here’s how you’d make the same explicit determination in your XQuery code:

declare option xdmp:output "encoding = us-ascii";
declare option xdmp:output "method = html";
declare option xdmp:output "indent = yes";

Oftentimes, the default output options will serve you just fine. For example, unless you specify otherwise, XSLT will automatically use the HTML output method when the document element of the result is <html> or <HTML>.
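As a minimal sketch of that default, assume a hypothetical stylesheet with no xsl:output declaration at all. The template content is invented for illustration.

```xslt
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No xsl:output here: the processor chooses the output method itself -->

  <xsl:template match="/">
    <!-- The result's document element is html (in no namespace),
         so the default output method is html -->
    <html>
      <body><p>Hello<br/>world</p></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Adding <xsl:output method="xml"/> at the top level would override that default and keep XML syntax in the serialized result.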

Another way you can control the output options in MarkLogic without having to make code edits is to define the defaults at the app server level.

You can find the output options screen in the left-hand menu for your app server in the Admin Interface.

Hope you enjoyed this random tip!

Evan Lenz