Nux 1.6

nux.xom.io
Class StaxParser

java.lang.Object
  extended by nux.xom.io.StaxParser

public class StaxParser
extends Object

Similar to the XOM Builder except that it builds a XOM document using an underlying StAX pull parser rather than a SAX push parser, inverting control flow.

StAX lets an application explicitly iterate over the nodes of a document, in document order, via streaming methods such as next() and hasNext(). Processing can be stopped and resumed, and parts of a document can easily be skipped or filtered. In particular, individual nodes or fragments (i.e. subtrees) can be pulled and converted to XOM via the methods buildNode() and buildFragment(), respectively.
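The iteration and skipping described above can be sketched with nothing but the JDK's javax.xml.stream API. This is an illustrative sketch, not part of Nux: the class PullIterationSketch and its methods are hypothetical names, and the sketch only counts elements rather than building XOM nodes.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class for illustration; it is not part of Nux or XOM.
final class PullIterationSketch {

    /** Counts START_ELEMENT events, skipping every subtree rooted at an element named skipName. */
    static int countElementsSkipping(String xml, String skipName) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    if (reader.getLocalName().equals(skipName)) {
                        skipSubtree(reader); // the application decides what to skip
                    } else {
                        count++;
                    }
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    /** Advances the cursor from a START_ELEMENT to its matching END_ELEMENT. */
    static void skipSubtree(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int ev = reader.next();
            if (ev == XMLStreamConstants.START_ELEMENT) depth++;
            else if (ev == XMLStreamConstants.END_ELEMENT) depth--;
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<articles><article><title>a</title></article>"
                + "<draft><title>b</title></draft></articles>";
        // Counts articles, article and title; the draft subtree is skipped.
        System.out.println(countElementsSkipping(xml, "draft")); // prints 3
    }
}
```

Because the application owns the loop, skipping a subtree is just a matter of advancing the cursor without looking at the events in between.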

Perhaps more importantly, control flow, data flow, and state and resource management can often be controlled more tightly with a pull iterator API than with a callback-driven push API such as SAX. For example, database query execution subsystems are typically based on (distributed) pull operator trees. Similarly, modular SOAP stacks typically prefer StAX, as outlined in the AXIOM StAX introduction. Requiring an application to convert a push API to a pull API is both complex and inefficient (whereas the reverse is not true).
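To illustrate why converting pull to push is the easy direction, the following sketch drives a callback interface from a pull reader with a single loop. It uses only the JDK; ElementHandler is a hypothetical stand-in for a push API such as SAX, and PullToPushSketch is not part of Nux. The opposite conversion would typically need buffering or a second thread, which is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration; not part of Nux or XOM.
final class PullToPushSketch {

    /** Hypothetical callback interface standing in for a push API such as SAX. */
    interface ElementHandler {
        void startElement(String localName);
        void endElement(String localName);
    }

    /** Drives the callbacks from a pull reader; the caller owns the loop. */
    static void pump(XMLStreamReader reader, ElementHandler handler) throws XMLStreamException {
        while (reader.hasNext()) {
            int ev = reader.next();
            if (ev == XMLStreamConstants.START_ELEMENT) {
                handler.startElement(reader.getLocalName());
            } else if (ev == XMLStreamConstants.END_ELEMENT) {
                handler.endElement(reader.getLocalName());
            }
        }
    }

    /** Records the callbacks fired for the given document, in order. */
    static List<String> trace(String xml) throws XMLStreamException {
        final List<String> calls = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            pump(reader, new ElementHandler() {
                @Override public void startElement(String localName) { calls.add("+" + localName); }
                @Override public void endElement(String localName) { calls.add("-" + localName); }
            });
        } finally {
            reader.close();
        }
        return calls;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(trace("<a><b/></a>")); // prints [+a, +b, -b, -a]
    }
}
```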

This class requires the StAX interfaces and a StAX parser implementation to be on the classpath, for example Woodstox (recommended) or Sun's sjsxp. Woodstox is the only StAX parser known to be exceptionally conformant, reliable, complete and efficient. At this time, other underlying StAX parsers may not perform full wellformedness checking, tend to have incomplete or buggy support for DTDs, entities and external references, and are in general not as mature as underlying SAX parsers such as Xerces.
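Since behavior depends on which StAX implementation is found on the classpath, it can help to check which one is actually selected. The sketch below uses only the standard javax.xml.stream.XMLInputFactory lookup; StaxProviderSketch is a hypothetical name, and the reported class name depends entirely on the runtime environment.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical diagnostic helper; not part of Nux.
final class StaxProviderSketch {

    /** Returns the class name of the XMLInputFactory selected by the standard lookup. */
    static String providerName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        System.out.println("StAX implementation: " + factory.getClass().getName());
        // Standard property name; whether it is supported is implementation specific.
        System.out.println("supports DTD property: "
                + factory.isPropertySupported(XMLInputFactory.SUPPORT_DTD));
    }
}
```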

An instance of this class is not thread-safe.

Example Usage: Print each article in a list of millions of articles via buildFragment():

 InputStream in = new FileInputStream("samples/data/articles.xml");
 XMLStreamReader reader = StaxUtil.createXMLStreamReader(in, null);
 reader.require(XMLStreamConstants.START_DOCUMENT, null, null);
 reader.nextTag(); // move to "articles" root element
 reader.require(XMLStreamConstants.START_ELEMENT, null, "articles");
 
 while (reader.nextTag() == XMLStreamConstants.START_ELEMENT) { // yet another article
     reader.require(XMLStreamConstants.START_ELEMENT, null, "article");
     
     Document fragment = new StaxParser(reader, new NodeFactory()).buildFragment();
      
     // do something useful with the fragment...
     System.out.println("fragment = "+ fragment.getRootElement().toXML());    
 }      
 
 reader.close();
 in.close();
 
Example: Print all events in document order via buildNode():
 InputStream in = new FileInputStream("samples/data/articles.xml");
 XMLStreamReader reader = StaxUtil.createXMLStreamReader(in, null);
 StaxParser parser = new StaxParser(reader, new NodeFactory()); 
 int depth = 0;
 int ev;
 while ((ev = reader.getEventType()) != XMLStreamConstants.END_DOCUMENT) {
     if (ev == XMLStreamConstants.START_ELEMENT) depth++;
     
     // do something useful with the node...
     Node node = parser.buildNode();
     System.out.println(depth + ":" + StaxUtil.toString(ev) + ":" + node.toXML());
     
     if (ev == XMLStreamConstants.END_ELEMENT) depth--;
     reader.next();
 }
 
 reader.close();
 in.close();
 
Using JDBC 4's SQLXML data type, you could retrieve a user's blog entries from a database as follows:
 Connection conn = myDataSource.getConnection();
 PreparedStatement st = conn.prepareStatement("select userid, blog_entry from user_has_blog");
 ResultSet rs = st.executeQuery();
 while (rs.next()) {
     SQLXML blog = rs.getSQLXML("blog_entry"); // read the column from the ResultSet, not the statement
     // JDBC 4.0 exposes StAX access via getSource(StAXSource.class)
     XMLStreamReader reader = blog.getSource(javax.xml.transform.stax.StAXSource.class).getXMLStreamReader();
     Document doc = new StaxParser(reader, new NodeFactory()).build();
     System.out.println(doc.toXML());
     blog.free();
 }
 rs.close();
 st.close();
 conn.close();
 

Author:
whoschek.AT.lbl.DOT.gov

Constructor Summary
StaxParser(XMLStreamReader reader, NodeFactory factory)
          Constructs a new instance that pushes into the given node factory.
 
Method Summary
 Document build()
          Builds the current document until the corresponding END_DOCUMENT event is seen.
 Document buildFragment()
          Builds the current element subtree until the corresponding END_ELEMENT event is seen; returns a document rooted at that element.
 Node buildNode()
          Creates and returns a new shallow XOM Node for the current StAX event the cursor is positioned over.
 XMLStreamReader getXMLStreamReader()
          Returns the StAX pull parser previously given on instance construction.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StaxParser

public StaxParser(XMLStreamReader reader,
                  NodeFactory factory)
Constructs a new instance that pushes into the given node factory.

Parameters:
reader - the underlying StAX pull parser to read from
factory - the node factory to stream into. May be null, in which case the default XOM NodeFactory is used, building the full XML tree.

Method Detail

getXMLStreamReader

public XMLStreamReader getXMLStreamReader()
Returns the StAX pull parser previously given on instance construction.

Returns:
the underlying StAX pull parser.

build

public Document build()
               throws ParsingException
Builds the current document until the corresponding END_DOCUMENT event is seen. Requires that the reader is positioned over a START_DOCUMENT event.

Example usage:

 InputStream in = new FileInputStream("samples/data/articles.xml");
 XMLStreamReader reader = StaxUtil.createXMLStreamReader(in, null);
 Document doc = new StaxParser(reader, new NodeFactory()).build();
 System.out.println(doc.toXML());
 reader.close(); // release parser resources; this does not close the underlying stream
 in.close();
 

Returns:
the parsed XOM document
Throws:
IllegalStateException - if reader.getEventType() != XMLStreamConstants.START_DOCUMENT
ParsingException - if there is an error processing the underlying XML source

buildFragment

public Document buildFragment()
                       throws ParsingException
Builds the current element subtree until the corresponding END_ELEMENT event is seen; returns a document rooted at that element. Requires that the reader is positioned over a START_ELEMENT event.

If this method returns successfully the cursor will be positioned over the corresponding END_ELEMENT.

Returns:
the parsed XOM document
Throws:
IllegalStateException - if reader.getEventType() != XMLStreamConstants.START_ELEMENT
ParsingException - if there is an error processing the underlying XML source
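The cursor contract of buildFragment() (start on START_ELEMENT, finish on the matching END_ELEMENT) is what makes a nextTag() loop over sibling elements work. The sketch below mimics that contract with the JDK alone: consumeFragment is a hypothetical stand-in that only counts elements instead of building a XOM document, and it is not Nux's implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration of the cursor contract; not part of Nux.
final class FragmentCursorSketch {

    /**
     * Consumes the subtree at the cursor and returns its element count.
     * Requires START_ELEMENT on entry; leaves the cursor on the matching END_ELEMENT.
     */
    static int consumeFragment(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("expected START_ELEMENT");
        }
        int depth = 1;
        int elements = 1;
        while (depth > 0) {
            int ev = reader.next();
            if (ev == XMLStreamConstants.START_ELEMENT) {
                depth++;
                elements++;
            } else if (ev == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        return elements;
    }

    /** Returns the element count of each child subtree of the root element. */
    static List<Integer> sizesOfChildren(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<Integer> sizes = new ArrayList<>();
        try {
            reader.nextTag(); // move to the root element
            // Each nextTag() steps from the previous child's END_ELEMENT to the
            // next sibling's START_ELEMENT, or to the root's END_ELEMENT.
            while (reader.nextTag() == XMLStreamConstants.START_ELEMENT) {
                sizes.add(consumeFragment(reader));
            }
        } finally {
            reader.close();
        }
        return sizes;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<articles><article><title>x</title></article><article/></articles>";
        System.out.println(sizesOfChildren(xml)); // prints [2, 1]
    }
}
```

If the fragment builder stopped one event later, the loop's nextTag() call would skip every other sibling, which is why the documented end position matters.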

buildNode

public Node buildNode()
               throws ParsingException
Creates and returns a new shallow XOM Node for the current StAX event the cursor is positioned over.

If the current event is a START_ELEMENT, defined attributes and namespaces are added to the returned element. If the current event is an END_ELEMENT, only defined namespaces are added to the returned element.

This method does not advance the cursor/iterator, and it does not use a NodeFactory. Currently ignores XMLStreamConstants.ENTITY_DECLARATION and XMLStreamConstants.NOTATION_DECLARATION, returning null for these cases.

Returns:
a shallow XOM Node corresponding to the current StAX event.
Throws:
ParsingException - if there is an error processing the underlying XML source
