XQJ Tutorial Part VII: XQuery Type System



This chapter describes how XQJ interacts with the XQuery type system. XQuery is a strongly typed language, and its type system is based on XML Schema. Because the type system is an inherent part of XQuery, you need some familiarity with XML Schema to understand it properly. Covering XML Schema in detail, however, is outside the scope of this XQJ tutorial.

Sequence and Item Types

XQuery defines a sequence type as a type that can be expressed using the SequenceType syntax. It consists of an item type that constrains the type of each item in the sequence, and a cardinality that constrains the number of items in the sequence. Because the XQuery type system distinguishes sequences and items, XQJ defines two corresponding interfaces: XQSequenceType and XQItemType.

XQSequenceType is a rather simple interface with only 3 methods:

  • getItemType() retrieves the item type of the sequence type.
  • getItemOccurrence() retrieves the cardinality that constrains the number of items.
  • toString() yields a string representation of the sequence type.
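
To make the split between item type and cardinality concrete, here is a minimal, self-contained sketch that takes a sequence type written in SequenceType syntax and separates the two parts. It does not use XQJ at all: the class SequenceTypeSketch, its Occurrence enum, and both helper methods are hypothetical names introduced for illustration, and the string handling only covers the trailing occurrence indicators (?, *, +) and empty-sequence().

```java
class SequenceTypeSketch {

    // Hypothetical stand-in for the cardinalities a sequence type can express.
    enum Occurrence { EMPTY, EXACTLY_ONE, ZERO_OR_ONE, ZERO_OR_MORE, ONE_OR_MORE }

    // Derives the cardinality from the trailing occurrence indicator, if any.
    static Occurrence occurrenceOf(String sequenceType) {
        String t = sequenceType.trim();
        if (t.equals("empty-sequence()")) return Occurrence.EMPTY;
        if (t.endsWith("?")) return Occurrence.ZERO_OR_ONE;
        if (t.endsWith("*")) return Occurrence.ZERO_OR_MORE;
        if (t.endsWith("+")) return Occurrence.ONE_OR_MORE;
        return Occurrence.EXACTLY_ONE;
    }

    // Returns the item type part; empty-sequence() has no item type, so "" is returned.
    static String itemTypeOf(String sequenceType) {
        String t = sequenceType.trim();
        switch (occurrenceOf(t)) {
            case EMPTY:
                return "";
            case EXACTLY_ONE:
                return t;
            default:
                // Drop the one-character occurrence indicator.
                return t.substring(0, t.length() - 1).trim();
        }
    }

    public static void main(String[] args) {
        String[] samples = { "item()", "node()+", "xs:anyAtomicType?", "element(*)*" };
        for (String s : samples) {
            System.out.println(s + " -> item type " + itemTypeOf(s)
                    + ", occurrence " + occurrenceOf(s));
        }
    }
}
```

With XQJ itself no such parsing is needed: getItemType() and getItemOccurrence() return the two parts directly.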

XQItemType encapsulates more information:

  • getItemKind() returns whether it is an element, attribute, atomic type, and so on.
  • getBaseType() specifies the built-in schema type that most closely matches this item type, for example xs:anyType or xs:string.
  • getNodeName() yields the name of the node, which is a QName.
  • getPIName() yields the name of a processing instruction, which is a String.
  • getTypeName() specifies a QName identifying the XML Schema type of the item type. This can be either a built-in XML Schema type or user defined.
  • toString() yields a string representation of the item type.

XQItemType defines a few more methods related to user-defined schema types, but covering them would take us too far afield in this introductory tutorial.

XQSequenceType and XQItemType objects are used in two different contexts:

  • To represent the static type of a query result or of an external variable declared in a query. In this context, the type may be abstract, such as item(), node()+, or xs:anyAtomicType?.
  • To represent the concrete type of an item. Abstract types do not apply here.

XQItemType Usage

Let's have a closer look at XQItemType, which specifies the item kind and base type:

...
XQSequenceType xqtype = ...
XQItemType xqitype = xqtype.getItemType();
int itemKind = xqitype.getItemKind();
int schemaType = xqitype.getBaseType();
...

XQJ defines constants for each of the item kinds representable in XQuery SequenceType syntax:

Sequence Type                  XQJ Definition
QName                          XQITEMKIND_ATOMIC
element(...)                   XQITEMKIND_ELEMENT
attribute(...)                 XQITEMKIND_ATTRIBUTE
comment()                      XQITEMKIND_COMMENT
document-node()                XQITEMKIND_DOCUMENT
document-node(element(...))    XQITEMKIND_DOCUMENT_ELEMENT
processing-instruction(...)    XQITEMKIND_PI
text()                         XQITEMKIND_TEXT
item()                         XQITEMKIND_ITEM
node()                         XQITEMKIND_NODE


getBaseType() determines the type more precisely. For example, when getItemKind() returns XQITEMKIND_ATOMIC, getBaseType() tells you whether the atomic type is xs:string, xs:integer, and so on. XQJ defines constants for all the built-in XML Schema and XQuery types. It's a long list; here is a small excerpt:

XML Schema Type     XQJ Definition
xs:string           XQBASETYPE_STRING
xs:integer          XQBASETYPE_INTEGER
xs:untypedAtomic    XQBASETYPE_UNTYPEDATOMIC

Working with Dynamic Types

While iterating over query results, you can ask XQJ for precise type information about each item. Suppose you want to use a different getXXX() method, depending on the item type:

XQSequence xqs = ...
while (xqs.next()) {
    XQItemType xqtype = xqs.getItemType();
    if (xqtype.getItemKind() == XQItemType.XQITEMKIND_ATOMIC) {
        // We have an atomic type
        switch (xqtype.getBaseType()) {
            case XQItemType.XQBASETYPE_STRING:
            case XQItemType.XQBASETYPE_UNTYPEDATOMIC: {
                String s = (String)xqs.getObject();
                ...
                break;
            }
            case XQItemType.XQBASETYPE_INTEGER: {
                long l = xqs.getLong();
                ...
                break;
            }
            ...
        }
    } else {
        // We have a node, retrieve it as a DOM node
        org.w3c.dom.Node node = xqs.getNode();
        ...
    }
}

This can make your code rather long and complex. Sometimes that can't be avoided, but most of the time shortcuts are available. As explained in XQJ Tutorial Part IV: Processing Query Results, you can use some of the more general-purpose methods.

Suppose you need a DOM node in case the query returns a node, and the string value for all atomic values. The next simple example shows how to do this:

XQSequence xqs = ...
while (xqs.next()) {
    XQItemType xqtype = xqs.getItemType();
    if (xqtype.getItemKind() == XQItemType.XQITEMKIND_ATOMIC) {
        // We have an atomic type
        String s = xqs.getAtomicValue();
        ...
    } else {
        // We have a node, retrieve it as a DOM node
        org.w3c.dom.Node node = xqs.getNode();
        ...
    }
}

Working with Static Types

That covers the dynamic type of items. The next example shows how to retrieve the static type of a query (for JDBC, ODBC, and SQL users, this is somewhat similar to "describe information"):

...
XQPreparedExpression xqe = xqc.prepareExpression("1+2");
XQSequenceType xqtype = xqe.getStaticResultType();
System.out.println(xqtype.toString());
...

With DataDirect XQuery, this example outputs xs:integer to stdout. Similarly, you can use the prepared expression to retrieve information about the external variables. As the next example shows, we first determine the external variables declared in the query and then retrieve the static type of each one:

...
XQPreparedExpression xqe = xqc.prepareExpression(
    "declare variable $i as xs:integer external; $i+1");
QName[] variables = xqe.getAllExternalVariables();
for (int i = 0; i < variables.length; i++) {
    XQSequenceType xqtype = xqe.getStaticVariableType(variables[i]);
    System.out.println(variables[i] + ": " + xqtype.toString());
}
...

Why is this useful at all? Let's have a quick look at a use case.

The idea of exposing XQueries as web services is not new; see, for example, the paper XQuery at Your Web Service. A fully functional example of such an "XQuery Web Service" is available on www.xquery.com. It is basically a servlet that reads XQueries from a specific directory (or from the classpath), and makes each of the queries available as functions accessible through SOAP or REST. The servlet needs to determine the external variables in each of the queries in order to generate the Web Services Description Language (WSDL), which contains an XML Schema definition describing the parameters for each operation — something like this (assuming an XQuery with two external variables, $employeeName and $hiringDate, declared as xs:string and xs:date):

<xs:element name="XXX">
<xs:complexType>
<xs:sequence>
<xs:element name="employeeName" type="xs:string"/>
<xs:element name="hiringDate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>

All the information required to generate such an XML schema definition is available in the sequence type of each declared variable. And XQJ makes this information immediately accessible. We could write a piece of code translating the relevant item kinds and base types to an XML Schema definition as shown above — it's only a matter of writing a number of Java switch statements. But is there an easier way?
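
The switch-based translation described above might look roughly like this. To stay runnable without an XQJ implementation on the classpath, the sketch uses two small local enums, ItemKind and BaseType, as stand-ins for the XQITEMKIND_* and XQBASETYPE_* constants. The class name, the method name, and the choice of xs:anyType and xs:anySimpleType as fallbacks are assumptions of this sketch, not part of XQJ.

```java
class SwitchMappingSketch {

    // Local stand-ins for a few XQITEMKIND_* constants (hypothetical, not the XQJ values).
    enum ItemKind { ATOMIC, ELEMENT, ATTRIBUTE, DOCUMENT }

    // Local stand-ins for a few XQBASETYPE_* constants; OTHER covers everything not modeled.
    enum BaseType { STRING, INTEGER, DATE, OTHER }

    // Translates an item kind and base type into the type name used in the schema.
    static String schemaTypeFor(ItemKind kind, BaseType base) {
        if (kind != ItemKind.ATOMIC) {
            // Assumption: anything that is not atomic is described as xs:anyType.
            return "xs:anyType";
        }
        switch (base) {
            case STRING:
                return "xs:string";
            case INTEGER:
                return "xs:integer";
            case DATE:
                return "xs:date";
            default:
                // Assumption: unmodeled atomic types fall back to xs:anySimpleType.
                return "xs:anySimpleType";
        }
    }

    public static void main(String[] args) {
        System.out.println(schemaTypeFor(ItemKind.ATOMIC, BaseType.STRING));
        System.out.println(schemaTypeFor(ItemKind.ATOMIC, BaseType.DATE));
        System.out.println(schemaTypeFor(ItemKind.ELEMENT, BaseType.OTHER));
    }
}
```

A real version would need one case per built-in type, which is why the list of switch branches grows quickly.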

Using the toString() Method

XQJ leaves the result of toString() on XQItemType implementation dependent; the only requirement is that it returns a human-readable string. With DataDirect XQuery, the string representation is based on the XQuery SequenceType syntax, with QName prefixes chosen as follows:

  • For QNames representing built-in XML schema types, the xs prefix is always used.
  • For QNames representing element or attribute names, the prefixes as defined in the query are used. In case of duplicates, one is chosen in an implementation dependent manner.

Returning to our XQuery Web Service use case, the strategy to map the external variable declaration to the WSDL becomes rather simple using toString():

  • If the XQItemType is an atomic type, use the string representation.
  • If the XQItemType is anything else, use xs:anyType.
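
The two rules above can be sketched as follows. This is a minimal illustration that runs without XQJ: the VariableType class is a hypothetical holder for the two facts the strategy needs (whether the item kind is atomic, and the string returned by toString()), and the class name, method names, and the third variable, resume, are made up for this example. The sketch also assumes that names and type strings need no XML escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WsdlParameterSketch {

    // Hypothetical holder for the two facts read from an item type:
    // whether it is atomic, and its string representation.
    static final class VariableType {
        final boolean atomic;
        final String text;

        VariableType(boolean atomic, String text) {
            this.atomic = atomic;
            this.text = text;
        }
    }

    // Atomic types keep their string representation; everything else becomes xs:anyType.
    static String schemaTypeFor(VariableType type) {
        return type.atomic ? type.text : "xs:anyType";
    }

    // Builds the xs:element declaration describing the parameters of one operation.
    static String describeOperation(String operationName, Map<String, VariableType> variables) {
        StringBuilder sb = new StringBuilder();
        sb.append("<xs:element name=\"").append(operationName).append("\">\n");
        sb.append("  <xs:complexType>\n");
        sb.append("    <xs:sequence>\n");
        for (Map.Entry<String, VariableType> e : variables.entrySet()) {
            sb.append("      <xs:element name=\"").append(e.getKey())
              .append("\" type=\"").append(schemaTypeFor(e.getValue())).append("\"/>\n");
        }
        sb.append("    </xs:sequence>\n");
        sb.append("  </xs:complexType>\n");
        sb.append("</xs:element>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the variables in declaration order.
        Map<String, VariableType> variables = new LinkedHashMap<String, VariableType>();
        variables.put("employeeName", new VariableType(true, "xs:string"));
        variables.put("hiringDate", new VariableType(true, "xs:date"));
        variables.put("resume", new VariableType(false, "element(resume)"));
        System.out.println(describeOperation("XXX", variables));
    }
}
```

In an XQJ application, the two fields of VariableType would be filled from getItemKind() and toString() on the item type of each external variable.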

Applications can also create XQItemType objects themselves. XQJ Tutorial Part IX: Creating XDM Instances shows how this can be done and describes other use cases.