Nux 1.6

nux.xom.pool
Class XOMUtil

java.lang.Object
  extended by nux.xom.pool.XOMUtil

public class XOMUtil
extends Object

Various utilities avoiding redundant code in several classes.

Author:
whoschek.AT.lbl.DOT.gov, $Author: hoschek3 $

Nested Class Summary
static class XOMUtil.Normalizer
          Standard XML algorithms for text and whitespace normalization (but not for Unicode normalization); type safe enum.
 
Method Summary
static DOMImplementation getDOMImplementation()
          Returns a namespace-aware DOMImplementation via the default JAXP lookup mechanism.
static NodeFactory getIgnoreWhitespaceOnlyTextNodeFactory()
          Returns a node factory that removes each Text node that is empty or consists of whitespace characters only (boundary whitespace).
static NodeFactory getLoggingNodeFactory(NodeFactory child, PrintStream log, String logName)
          Returns a factory that delegates all calls to the given child factory, logging each call to the given log stream (typically System.err) for simple debugging purposes.
static NodeFactory getNullNodeFactory()
          Returns a node factory for pure document validation.
static NodeFactory getRedirectingNodeFactory(StreamingSerializer serializer)
          Returns a node factory that redirects its input onto the output of a streaming serializer.
static NodeFactory getTextTrimmingNodeFactory()
          Returns a node factory that removes leading and trailing whitespace in each Text node, removing entirely any Text node that becomes empty after trimming (as with String.trim()).
static Document jaxbMarshal(Marshaller marshaller, Object jaxbObj)
          Marshals (serializes) the given JAXB object via the given marshaller into a new XOM Document (convenience method).
static Object jaxbUnmarshal(Unmarshaller unmarshaller, ParentNode node)
          Unmarshals (deserializes) the given XOM node via the given unmarshaller into a new JAXB object (convenience method).
static byte[] toCanonicalXML(Document doc)
          Returns the W3C Canonical XML representation of the given document.
static String toDebugString(Node node)
          Returns a properly indented debug level string representation of the entire given XML node subtree, decorated with node types, node names, children, etc.
static Document toDocument(String xml)
          Returns the XOM document obtained by parsing from the content of the given XML string.
static String toPrettyXML(Node node)
          Returns a pretty-printed String representation of the given node (subtree).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getDOMImplementation

public static DOMImplementation getDOMImplementation()
Returns a namespace-aware DOMImplementation via the default JAXP lookup mechanism.

Returns:
a namespace-aware DOMImplementation
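
A minimal sketch of how a namespace-aware DOMImplementation can be obtained through plain JAXP. The class and method names here are hypothetical, and the code is an assumption about what the default lookup amounts to, not a copy of the Nux source.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;

final class DomImplLookupSketch {

    /** Looks up a namespace-aware DOMImplementation via JAXP (hypothetical helper). */
    static DOMImplementation namespaceAwareDomImplementation() {
        try {
            // newInstance() applies the standard JAXP provider lookup.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().getDOMImplementation();
        } catch (ParserConfigurationException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new IllegalStateException("No usable JAXP parser configuration", e);
        }
    }

    public static void main(String[] args) {
        DOMImplementation impl = namespaceAwareDomImplementation();
        // Create an empty document whose root element lives in a namespace.
        Document doc = impl.createDocument("http://example.org/ns", "ex:root", null);
        System.out.println(doc.getDocumentElement().getNamespaceURI());
    }
}
```

The resulting DOMImplementation can create the empty DOM documents that JAXB or other DOM-based APIs write into.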

toPrettyXML

public static String toPrettyXML(Node node)
Returns a pretty-printed String representation of the given node (subtree).

Parameters:
node - the node (subtree) to convert.
Returns:
a pretty-printed String representation
See Also:
Serializer

toCanonicalXML

public static byte[] toCanonicalXML(Document doc)
Returns the W3C Canonical XML representation of the given document.

Parameters:
doc - the document to convert.
Returns:
the bytes representing canonical XML
See Also:
Canonicalizer

toDebugString

public static String toDebugString(Node node)
Returns a properly indented debug-level string representation of the entire given XML node subtree, decorated with node types, node names, children, etc. For instance, this can be used to find structural diffs or to detect anomalies with respect to empty texts, whitespace-only texts, etc. Applications could use this in combination with a XOMUtil.Normalizer.

Parameters:
node - the subtree to display
Returns:
a string representation for debugging purposes.

toDocument

public static Document toDocument(String xml)
Returns the XOM document obtained by parsing the content of the given XML string. Useful for quick-and-dirty inline examples and tests. The document is parsed with a non-validating Builder, and the baseURI of the document will be the empty string.

Example usage:

 String xml = 
     "<foo>" +
         "<bar size='123'>" +
             "hello world" +
         "</bar>" +
     "</foo>";
 Document doc = XOMUtil.toDocument(xml);
 System.out.println(doc.toXML());
 

Parameters:
xml - the string to parse from
Returns:
the corresponding XOM document
Throws:
XMLException - if the content of the string to parse is not well-formed XML.
See Also:
Builder.build(String, String)

getIgnoreWhitespaceOnlyTextNodeFactory

public static NodeFactory getIgnoreWhitespaceOnlyTextNodeFactory()
Returns a node factory that removes each Text node that is empty or consists of whitespace characters only (boundary whitespace). This method fully preserves narrative Text containing whitespace along with other characters.

Otherwise this factory behaves just like the standard NodeFactory.

Ignoring whitespace-only nodes reduces memory footprint for documents that are heavily pretty printed and indented, i.e. human-readable. Remember that without such a factory, every whitespace sequence occurring between element tags generates a mostly useless Text node.

Finally, note that this method's whitespace pruning is appropriate for many, but not all, XML use cases (round-tripping). For example, the blank between the strong and em elements in <p><strong>Hello</strong> <em>World!</em></p> will be removed, which might not be what you want. This is because the factory inspects each Text node in isolation and does not look across multiple Text nodes.
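
The pruning rule and its caveat can be illustrated with a minimal sketch that uses the JDK's DOM parser rather than XOM. The class and method names are hypothetical, the whitespace test assumes the four XML whitespace characters, and the sketch prunes after parsing, whereas the factory described here acts while the document is being built.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class WhitespaceOnlyPruningSketch {

    /** True if the text is empty or contains only space, tab, CR or LF. */
    static boolean isWhitespaceOnly(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true;
    }

    /** Removes whitespace-only Text nodes below the given node, one node at a time. */
    static void prune(Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE && isWhitespaceOnly(child.getNodeValue())) {
                parent.removeChild(child);
            } else {
                prune(child);
            }
            child = next;
        }
    }

    /** Parses the XML, prunes it, and returns the remaining character data. */
    static String textAfterPruning(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            prune(doc);
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The lone blank between the two inline elements is its own Text node and is dropped.
        System.out.println(textAfterPruning("<p><strong>Hello</strong> <em>World!</em></p>")); // HelloWorld!
        // Narrative text keeps its whitespace because the node also holds other characters.
        System.out.println("[" + textAfterPruning("<p> hello world </p>") + "]"); // [ hello world ]
    }
}
```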

Returns:
a node factory

getLoggingNodeFactory

public static NodeFactory getLoggingNodeFactory(NodeFactory child,
                                                PrintStream log,
                                                String logName)
Returns a factory that delegates all calls to the given child factory, logging each call to the given log stream (typically System.err) for simple debugging purposes.

Parameters:
child - the factory to delegate to
log - the print stream to log to (typically System.err)
logName - a name for this logger (typically "log" or similar)
Returns:
a logging node factory

getTextTrimmingNodeFactory

public static NodeFactory getTextTrimmingNodeFactory()
Returns a node factory that removes leading and trailing whitespace in each Text node, removing entirely any Text node that becomes empty after trimming (as with String.trim()). For example, a text node of " hello world " becomes "hello world", and a text node of " " is removed.

Otherwise this factory behaves just like the standard NodeFactory.

Finally, note that this method's whitespace pruning is appropriate for many, but not all, XML use cases (round-tripping).
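
The per-node trimming rule can be sketched in plain Java as follows. The class and method names are hypothetical, and the sketch models Text nodes as strings rather than using XOM. It also shows why trimming is unsuitable for mixed content: whitespace at the edge of a Text node that borders an inline element is lost.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class TextTrimmingSketch {

    /** Returns the trimmed text, or empty if nothing is left and the node would be dropped. */
    static Optional<String> trimText(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? Optional.empty() : Optional.of(trimmed);
    }

    /** Concatenates the surviving text of consecutive Text nodes in document order. */
    static String joinTrimmed(List<String> textNodes) {
        StringBuilder out = new StringBuilder();
        for (String text : textNodes) {
            trimText(text).ifPresent(out::append);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(trimText(" hello world ")); // Optional[hello world]
        System.out.println(trimText(" "));             // Optional.empty
        // Text nodes of <p>Hello <em>World</em></p>: the space after "Hello" is lost.
        System.out.println(joinTrimmed(Arrays.asList("Hello ", "World"))); // HelloWorld
    }
}
```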

Returns:
a node factory

getNullNodeFactory

public static NodeFactory getNullNodeFactory()
Returns a node factory for pure document validation. This factory ignores all input and builds an empty document on Builder.build(...) instead of a full tree, since the tree is not needed for pure validation. Skipping tree construction improves validation performance.

Returns:
a node factory

getRedirectingNodeFactory

public static NodeFactory getRedirectingNodeFactory(StreamingSerializer serializer)
Returns a node factory that redirects its input onto the output of a streaming serializer. For example, it can be used to convert standard textual XML to and from bnux binary XML. It works in a fully streaming fashion, that is, without building a complete temporary XOM main memory tree.

The document returned on finishMakingDocument will be empty.

Parameters:
serializer - the streaming serializer to write to
Returns:
a redirecting node factory

jaxbMarshal

public static Document jaxbMarshal(Marshaller marshaller,
                                   Object jaxbObj)
                            throws JAXBException
Marshals (serializes) the given JAXB object via the given marshaller into a new XOM Document (convenience method).

This implementation is somewhat inefficient but correctly does the job. There is no connection between the JAXB object tree and the XOM object tree; they are completely independent object trees without any cross-references. Hence, updates in one tree are not automatically reflected in the other tree.

Parameters:
marshaller - a JAXB serializer (note that a marshaller is typically not thread-safe and expensive to construct; hence the recommendation is to use a ThreadLocal to make it thread-safe and efficient)
jaxbObj - the JAXB object to serialize
Returns:
the new XOM document
Throws:
JAXBException - If an unexpected problem occurred in the conversion.
MarshalException - If an error occurred while performing the marshal operation. Wherever possible, one should prefer the MarshalException over the JAXBException.
See Also:
Marshaller.marshal(java.lang.Object, org.w3c.dom.Node)
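
The ThreadLocal recommendation for marshallers and unmarshallers can be sketched generically as below. PerThreadInstance is a hypothetical helper, not part of Nux or JAXB, and the demo uses a StringBuilder as a stand-in for the expensive object so that the sketch runs on a plain JDK.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical holder that lazily creates one instance per thread and reuses it. */
final class PerThreadInstance<T> {

    private final ThreadLocal<T> holder;

    PerThreadInstance(Supplier<? extends T> factory) {
        // The factory runs at most once per thread, on that thread's first get().
        this.holder = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's own instance. */
    T get() {
        return holder.get();
    }

    /** Demo helper: returns the instance that a freshly started thread sees. */
    T getFromNewThread() {
        AtomicReference<T> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> seen.set(get()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        // Stand-in for an expensive, non-thread-safe object. In a JAXB application the
        // supplier would instead create a Marshaller from a shared JAXBContext (assumption).
        PerThreadInstance<StringBuilder> perThread = new PerThreadInstance<>(StringBuilder::new);
        System.out.println(perThread.get() == perThread.get());              // true
        System.out.println(perThread.get() == perThread.getFromNewThread()); // false
    }
}
```

Each thread pays the construction cost once and never shares its instance, which addresses both the thread-safety and the efficiency concern.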

jaxbUnmarshal

public static Object jaxbUnmarshal(Unmarshaller unmarshaller,
                                   ParentNode node)
                            throws JAXBException
Unmarshals (deserializes) the given XOM node via the given unmarshaller into a new JAXB object (convenience method).

This implementation is somewhat inefficient but correctly does the job. There is no connection between the JAXB object tree and the XOM object tree; they are completely independent object trees without any cross-references. Hence, updates in one tree are not automatically reflected in the other tree.

Parameters:
unmarshaller - a JAXB deserializer (note that an unmarshaller is typically not thread-safe and expensive to construct; hence the recommendation is to use a ThreadLocal to make it thread-safe and efficient)
node - the XOM node to deserialize
Returns:
the new JAXB object
Throws:
JAXBException - If an unexpected problem occurred in the conversion.
UnmarshalException - If an error occurred while performing the unmarshal operation. Wherever possible, one should prefer the UnmarshalException over the JAXBException.
See Also:
Unmarshaller.unmarshal(org.w3c.dom.Node)
