Nux 1.6

nux.xom.xquery
Class XQuery

java.lang.Object
  extended by nux.xom.xquery.XQuery

public class XQuery
extends Object

Compiled representation of a W3C XQuery (thread-safe). Since XQuery can be seen as a superset of XPath 2.0, this class can also be used with plain XPath expressions as queries.

Instances are considered immutable and thread-safe; the same compiled query may be executed (evaluated) many times in series or in parallel, just like XSLTransform objects. A compiled query is conceptually similar to a JDBC PreparedStatement.

Example usage

     Document doc = new Builder().build(new File("samples/data/periodic.xml"));
  
     // find the atom named 'Zinc' in the periodic table:
     Node result = XQueryUtil.xquery(doc, "/PERIODIC_TABLE/ATOM[NAME = 'Zinc']").get(0);
     System.out.println("result=" + result.toXML());
 
     // equivalent via the more powerful underlying API:
     XQuery xquery = new XQuery("/PERIODIC_TABLE/ATOM[NAME = 'Zinc']", null);
     Node result2 = xquery.execute(doc).next();
 
     // count the number of elements in a document tree:
     int count = XQueryUtil.xquery(doc, "//*").size();
     System.out.println("count=" + count);
 
     // equivalent via the XPath count() function:
     int count2 = Integer.parseInt(XQueryUtil.xquery(doc, "count(//*)").get(0).getValue());
     System.out.println("count=" + count2);
 
A query to find the links of all images (or all JPG images) in an XHTML-like document:
     Document doc = new Builder().build(new File("/tmp/test.xml"));
     Nodes results = XQueryUtil.xquery(doc, "//*:img/@src");
     // Nodes results = XQueryUtil.xquery(doc, "//*:img/@src[matches(., '\\.jpg$')]");
 
     for (int i=0; i < results.size(); i++) {
         System.out.println("node "+i+": "+results.get(i).toXML());
         //System.out.println("node "+i+": "+ XOMUtil.toPrettyXML(results.get(i)));
     }
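
The compile-once, execute-many pattern mentioned at the top of this page can be illustrated without Nux on the classpath. The following is only a sketch under stated assumptions: it uses the JDK's built-in javax.xml.xpath API (XPath 1.0) instead of this class, the SYMBOL child element and the inline documents are hypothetical, and the compiled expression is reused in series only, because the JDK's XPathExpression is not thread-safe (unlike XQuery instances).

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class CompileOnceSketch {
    // Hypothetical query, modelled on the periodic table example above.
    static final String QUERY = "/PERIODIC_TABLE/ATOM[NAME = 'Zinc']/SYMBOL";

    // Compiles QUERY once, then evaluates it against each document in turn.
    static String[] symbols(String... xmlDocs) {
        try {
            XPathExpression compiled =
                XPathFactory.newInstance().newXPath().compile(QUERY);
            String[] results = new String[xmlDocs.length];
            for (int i = 0; i < xmlDocs.length; i++) {
                // evaluate(InputSource) returns the string value of the first match,
                // or the empty string if nothing matches
                results[i] = compiled.evaluate(
                    new InputSource(new StringReader(xmlDocs[i])));
            }
            return results;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String withZinc = "<PERIODIC_TABLE><ATOM><NAME>Zinc</NAME>"
            + "<SYMBOL>Zn</SYMBOL></ATOM></PERIODIC_TABLE>";
        String withoutZinc = "<PERIODIC_TABLE><ATOM><NAME>Iron</NAME>"
            + "<SYMBOL>Fe</SYMBOL></ATOM></PERIODIC_TABLE>";
        for (String symbol : symbols(withZinc, withoutZinc)) {
            System.out.println("symbol=" + symbol);
        }
    }
}
```

With this class, the equivalent is to construct one XQuery object and call execute() on it for each document.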
 

Namespaces

A query can use namespaces. Here is an example that lists the titles of Tim Bray's blog articles via the Atom feed:
 declare namespace atom = "http://www.w3.org/2005/Atom"; 
 declare namespace xsd = "http://www.w3.org/2001/XMLSchema";
 doc("http://www.tbray.org/ongoing/ongoing.atom")/atom:feed/atom:entry/atom:title
 
Namespace declarations can be defined inline within the query prolog via declare namespace and declare default element namespace directives, as shown above. They can also be defined via the declareNamespace() and setDefaultElementNamespace() methods of a StaticQueryContext.

Passing variables to a query

A query can declare local variables, for example:
     declare variable $i := 7;
     declare variable $j as xs:integer := 7;
     declare variable $math:pi as xs:double := 3.14159E0;
     declare variable $bookName := 'War and Peace';
 
A query can access variable values via the standard $varName syntax, as in return ($x, $math:pi) or /books/book[@name = $bookName]/author[@name = $authorName].

A query can declare external global variables, for example:

     declare variable $foo     as xs:string external; 
     declare variable $size    as xs:integer external; 
     declare variable $myuri   as xs:anyURI external;
     declare variable $mydoc   as document-node() external;
     declare variable $myelem  as element() external;
     declare variable $mynodes as node()* external;
 
External global variables can be bound and passed to the query as follows:
     Map vars = new HashMap();
     vars.put("foo", "hello world");
     vars.put("size", new Integer(99));
     vars.put("myuri", "http://www.w3.org/2001/XMLSchema");
     vars.put("mydoc", new Document(new Element("xyz")));
     vars.put("myelem", new Element("abc"));
     vars.put("mynodes", new Node[] {new Document(new Element("elem1")), new Element("elem2")});
     vars.put("mydocs", new Node[] {
         new Builder().build(new File("samples/data/articles.xml")), 
         new Builder().build(new File("samples/data/p2pio.xml")) });
     
     String query = "for $d in $mydocs return $size * count($d)";
     Nodes results = new XQuery(query, null).execute(doc, null, vars).toNodes();
     new ResultSequenceSerializer().write(results, System.out);
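
The idea of binding external variables through a name-to-value map can be tried out with the JDK alone. The sketch below is an analogy, not the Nux mechanism: it assumes the JDK's javax.xml.xpath API (XPath 1.0) and its XPathVariableResolver hook, and the class name, method name and sample document are hypothetical. It resolves $bookName in a path similar to the one shown earlier in this section.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class VariableBindingSketch {
    // Evaluates a path that references $bookName, taking its value from vars.
    static String authorOf(String xml, Map<String, ?> vars) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve each $name reference by looking up its local name in the map.
        xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));
        try {
            return xpath.evaluate("/books/book[@name = $bookName]/author",
                new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books>"
            + "<book name='War and Peace'><author>Tolstoy</author></book>"
            + "<book name='Emma'><author>Austen</author></book>"
            + "</books>";
        Map<String, Object> vars = new HashMap<>();
        vars.put("bookName", "War and Peace");
        System.out.println("author=" + authorOf(xml, vars));
    }
}
```

As with the variables argument of execute(), the map is supplied per evaluation, so the same query text can be run with different bindings.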
 

Standard functions, user defined functions and extension functions

The standard XQuery functions can be used directly. For example:
 string-length('hello world')
 
Also note that XPath 2.0 supports regular expressions via the standard fn:matches, fn:replace, and fn:tokenize functions.
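
For readers who know java.util.regex better than the XPath 2.0 regular expression functions, the following sketch shows rough Java analogues of fn:matches, fn:replace and fn:tokenize. The helper names are hypothetical, and the regex dialects and some edge cases differ (for example, fn:tokenize on an empty string yields an empty sequence), so treat this as an approximation. One point worth noting is that fn:matches is unanchored: it is true if any substring matches, which corresponds to Matcher.find() rather than String.matches().

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class RegexAnalogues {
    // Rough analogue of fn:matches: true if ANY substring matches the pattern.
    static boolean matches(String input, String pattern) {
        return Pattern.compile(pattern).matcher(input).find();
    }

    // Rough analogue of fn:replace: replaces every match.
    static String replace(String input, String pattern, String replacement) {
        return Pattern.compile(pattern).matcher(input).replaceAll(replacement);
    }

    // Rough analogue of fn:tokenize: splits on every match, keeping trailing empties.
    static List<String> tokenize(String input, String pattern) {
        return Arrays.asList(Pattern.compile(pattern).split(input, -1));
    }

    public static void main(String[] args) {
        System.out.println(matches("photo.jpg.bak", "\\.jpg"));   // unanchored match
        System.out.println(matches("photo.jpg.bak", "\\.jpg$"));  // anchored at the end
        System.out.println(replace("hello world", "o", "0"));
        System.out.println(tokenize("a,b,c", ","));
    }
}
```

Because matching is unanchored, a pattern that is meant to test a file extension usually needs an escaped dot and a trailing $ anchor.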
 

A query can employ user defined functions, for example:

     declare namespace ipo = "http://www.example.com/IPO";
     
     declare function local:total-price( $i as element(item)* ) as xs:double {
         let $subtotals := for $s in $i return $s/quantity * $s/USPrice
         return sum($subtotals)
     }; 
     
     for $p in doc("ipo.xml")/ipo:purchaseOrder
     where $p/shipTo/name="Helen Zoe" and $p/@orderDate = xs:date("1999-12-01")
     return local:total-price($p//item) 
 
Custom extension functions written in Java can be defined and used as explained in the Saxon Extensibility Functions documentation. For example, here is a query that outputs the square root of a number via a method in java.lang.Math, and also calls static methods and constructors of java.lang.String and java.util.Date, as well as other extension functions.
     declare namespace exslt-math = "http://exslt.org/math";
     declare namespace math   = "java:java.lang.Math";
     declare namespace date   = "java:java.util.Date"; 
     declare namespace string = "java:java.lang.String"; 
     declare namespace saxon  = "http://saxon.sf.net/";
 
     declare variable $query := string(doc("query.xml")/queries/query[1]);
 
     (
     exslt-math:sin(3.14),
     math:sqrt(16),
     math:pow(2,16),
     string:toUpperCase("hello"),
     date:new(),                    (: print current date :)
     date:getTime(date:new()),      (: print current date in milliseconds :)
 
     saxon:eval(saxon:expression($query))  (: run a dynamically constructed query :)
     )
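
To make the java: namespace bindings in the query above more concrete, the sketch below spells out the plain Java calls that the math:, string: and date: functions correspond to. It assumes the usual Saxon convention that static methods are called directly, that new invokes a constructor, and that an instance method receives its target object as the first argument. The wrapper class and method names are hypothetical, and the exslt-math and saxon functions are not covered.

```java
import java.util.Date;

final class JavaCallsBehindQuery {
    // math:sqrt(16) binds to the static method java.lang.Math.sqrt
    static double mathSqrt(double x) {
        return Math.sqrt(x);
    }

    // math:pow(2,16) binds to the static method java.lang.Math.pow
    static double mathPow(double base, double exponent) {
        return Math.pow(base, exponent);
    }

    // string:toUpperCase("hello"): the first argument is the target instance
    static String stringToUpperCase(String target) {
        return target.toUpperCase();
    }

    // date:getTime(date:new()): constructor call, then an instance method on it
    static long dateGetTime() {
        return new Date().getTime();
    }

    public static void main(String[] args) {
        System.out.println(mathSqrt(16));
        System.out.println(mathPow(2, 16));
        System.out.println(stringToUpperCase("hello"));
        System.out.println(dateGetTime());
    }
}
```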
 

Modules

An XQuery module is a file containing a set of variable and function declarations. For decomposition and reuse of functionality, a module can import declarations from other modules. Here are two example modules that compute the factorial of a number:
     (: file modules/factorial.xq :)
     module namespace factorial = "http://example.com/factorial";
     declare function factorial:fact($i as xs:integer) as xs:integer {
         if ($i <= 1)
             then 1
             else $i * factorial:fact($i - 1)
     };

     (: file main.xq :)
     import module namespace factorial = "http://example.com/factorial" at "modules/factorial.xq";
     factorial:fact(4)
 
     [hoschek /Users/hoschek/unix/devel/nux] fire-xquery main.xq
     <atomic-value xsi:type="xs:integer" xmlns="http://dsd.lbl.gov/nux" 
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">24</atomic-value>
 

Customizing the doc() function

A custom document URI resolver for the XQuery/XPath doc() function can be defined in the constructor of this class. Other miscellaneous options can be made available to the query by calling configuration methods on a DynamicQueryContext (per execution), or on the configuration object of a StaticQueryContext (per query).

Performance

For simple XPath expressions you can get a throughput of up to 1000-20000 executions/sec over 200 KB input documents, or up to 100000 executions/sec over 0.5 KB input documents, served from memory (commodity PC 2004, JDK 1.5, server VM). Be aware that this is a ballpark figure at best, because use cases, documents and the complexity of queries vary wildly in practice. In any case, it is safe to assume that this XQuery/XPath implementation is one of the fastest available. For details, see XQueryBenchmark.

Author:
whoschek.AT.lbl.DOT.gov, $Author: hoschek3 $

Constructor Summary
XQuery(String query, URI baseURI)
          Constructs a new compiled XQuery from the given query.
XQuery(String query, URI baseURI, StaticQueryContext staticContext, DocumentURIResolver resolver)
          Constructs a new compiled XQuery from the given query, base URI, static context and resolver.
 
Method Summary
 ResultSequence execute(Node contextNode)
          Executes (evaluates) the query against the given node.
 ResultSequence execute(Node contextNode, DynamicQueryContext dynamicContext, Map variables)
          Executes (evaluates) the query against the given node, using the given dynamic context and external variables.
 String explain()
          Returns a description of the compiled and optimized expression tree; useful for advanced performance diagnostics only.
protected  ResultSequence newResultSequence(XQueryExpression expression, DynamicQueryContext dynamicContext)
          Callback that returns a result sequence for the current query execution.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XQuery

public XQuery(String query,
              URI baseURI)
       throws XQueryException
Constructs a new compiled XQuery from the given query.

Parameters:
query - the query to compile
baseURI - an absolute URI, used when necessary in the resolution of relative URIs found in the query. Used by the XQuery doc function. May be null, in which case it defaults to the current working directory.
Throws:
XQueryException - if the query has a syntax error, or if it references namespaces, variables, or functions that have not been declared, or contains other static errors such as type mismatches.

XQuery

public XQuery(String query,
              URI baseURI,
              StaticQueryContext staticContext,
              DocumentURIResolver resolver)
       throws XQueryException
Constructs a new compiled XQuery from the given query, base URI, static context and resolver.

Parameters:
query - the query to compile
baseURI - an absolute URI, used when necessary in the resolution of relative URIs found in the query. Used by the XQuery doc function, and hence the resolver. May be null in which case it defaults to the current working directory.
staticContext - the context and configuration to use; per query (may be null).
resolver - an object that is called by the XQuery processor to turn a URI passed to the XQuery doc() function into a XOM Document. May be null in which case non-validating non-pooled default resolution is used.
Throws:
XQueryException - if the query has a syntax error, or if it references namespaces, variables, or functions that have not been declared, or contains other static errors such as type mismatches.
Method Detail

execute

public ResultSequence execute(Node contextNode)
                       throws XQueryException
Executes (evaluates) the query against the given node. Results are returned in document order, unless specified otherwise by the query.

Parameters:
contextNode - the context node to execute the query against. The context node is available to the query as the value of the query expression ".". If this parameter is null, the context node will be undefined.
Returns:
a result sequence iterator producing zero or more results
Throws:
XQueryException - if an error occurs during execution, for example division overflow, or a type error caused by type mismatch, or an error raised by the XQuery function fn:error().

execute

public ResultSequence execute(Node contextNode,
                              DynamicQueryContext dynamicContext,
                              Map variables)
                       throws XQueryException
Executes (evaluates) the query against the given node, using the given dynamic context and external variables. Results are returned in document order, unless specified otherwise by the query.

Argument variables specifies external global variables in the form of zero or more variableName --> variableValue map associations. Each map entry's key and value are interpreted as follows:

Parameters:
contextNode - the context node to execute the query against. The context node is available to the query as the value of the query expression ".". If this parameter is null, the context node will be undefined.
dynamicContext - optional dynamic context of this execution (may be null). If not null, the Configuration object of the dynamic context must be the same as the Configuration object that was used when creating the StaticQueryContext.
variables - optional external global variables to be bound on the dynamic context; per execution (may be null).
Returns:
a result sequence iterator producing zero or more results
Throws:
XQueryException - if an error occurs during execution, for example division overflow, or a type error caused by type mismatch, or an error raised by the XQuery function fn:error().

explain

public String explain()
Returns a description of the compiled and optimized expression tree; useful for advanced performance diagnostics only.

Returns:
a string description

newResultSequence

protected ResultSequence newResultSequence(XQueryExpression expression,
                                           DynamicQueryContext dynamicContext)
                                    throws XQueryException
Callback that returns a result sequence for the current query execution.

An XQuery result sequence may, apart from "normal" nodes, also contain top-level values of atomic types such as xs:string, xs:integer, xs:double, xs:boolean, etc., none of which are XML nodes. Hence, a way to convert atomic values to normal XML nodes is needed.

This method's result sequence implementation converts each top-level atomic value to an Element named "atomic-value" with a child Text node holding the atomic value's standard XPath 2.0 string representation. An "atomic-value" element is decorated with a namespace and a W3C XML Schema type attribute. "Normal" nodes and anything not at top-level is returned "as is", without conversion.

Override this default implementation if you need custom conversions in result sequence implementations. Note, however, that conversions of atomic values should rarely be needed. It is often more desirable to avoid such atomic conversions altogether. This can almost always be achieved easily by formulating the query string such that it wraps atomic values via standard XQuery constructors (rather than via Java methods) into an element, attribute, text or document node. That way, the query always produces a "normal" XML node sequence as output and never produces a sequence of atomic values, and thus custom conversions by this class are never needed or invoked.

For example, by default the following query

     for $i in (1, 2, 3) return $i
 
yields this (perhaps somewhat unexpected) output:
     <item:atomic-value xsi:type="xs:integer" xmlns:item="http://dsd.lbl.gov/nux" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">1</item:atomic-value>
     <item:atomic-value xsi:type="xs:integer" xmlns:item="http://dsd.lbl.gov/nux" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">2</item:atomic-value>
     <item:atomic-value xsi:type="xs:integer" xmlns:item="http://dsd.lbl.gov/nux" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">3</item:atomic-value>
 

This format is good for software processing, but not very human readable. Hence, most likely you will want to rewrite the query along the following lines:

     for $i in (1, 2, 3) return <item> {$i} </item>
 
now yielding as output three Element nodes:
     <item>1</item> 
     <item>2</item>
     <item>3</item>
 
Or you might want to rewrite the query along the following lines:
     for $i in (1, 2, 3) return text {$i}
 
now yielding as output three Text nodes:
     1 
     2
     3
 
Observe that the rewritten query converts top-level atomic values precisely as desired by the user, without needing any Java-level conversion.
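
For reference, the default wrapping described above (an "atomic-value" element in the Nux namespace, carrying an xsi:type attribute and a text child) can be sketched as follows. This is not the implementation used by this class: it assumes the JDK's W3C DOM API instead of XOM so that it runs without extra libraries, and the class and method names are hypothetical. The namespace URI and the "item" prefix are taken from the sample output above.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class AtomicValueWrapperSketch {
    // Namespace URI as it appears in the sample output above.
    static final String NUX_NS = "http://dsd.lbl.gov/nux";

    // Wraps one atomic value (given as its string form) into an element node.
    static Element wrap(String xsType, String lexicalValue) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element elem = doc.createElementNS(NUX_NS, "item:atomic-value");
            // Decorate with the W3C XML Schema type of the value, e.g. xs:integer.
            elem.setAttributeNS(XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI,
                "xsi:type", xsType);
            // The value's string representation becomes the single text child.
            elem.setTextContent(lexicalValue);
            return elem;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element wrapped = wrap("xs:integer", "1");
        System.out.println(wrapped.getTagName() + " -> " + wrapped.getTextContent());
    }
}
```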

Parameters:
expression - the compiled query expression
dynamicContext - the dynamic context of this execution
Returns:
a result sequence implementation
Throws:
XQueryException - if an error occurs during execution
