XML documents on the run, Part 2

By Dennis M. Sosnoski
Better SAX2 handling and the pull-parser alternative
Is event-driven programming for SAX2 (Simple API for XML) endangering your sanity? After Part 1 of this three-part series introduced SAX2 parsing, you should feel more in touch with reality! In that article, I supplied basic handler techniques, which we'll build on in this article to keep your code manageable.
Read the whole "XML Documents on the Run" series:
Part 1: SAX speeds through XML documents with parse-event streams
Part 2: Better SAX2 handling and the pull-parser alternative
Part 3: How do SAX2 parsers perform compared to new XMLPull parsers?
In this article, I extend the SAX2 handling approach suggested in Part 1 to cope with multiple nested-structure levels within an XML document. Using that approach, you can implement a class for each structure type you need to handle, keeping your code clean and eliminating event-driven programming's messiness.
Our quest for improved XML event-stream processing doesn't end with SAX2, though. I also introduce the pull-parser approach that's increasingly gaining attention as a SAX2 alternative. With pull parsing, your program keeps control, rather than relinquishing it to the parser -- letting you avoid the event-driven hassles completely!
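To make the contrast concrete, here's a minimal sketch of the pull style. It uses StAX (javax.xml.stream), which ships with the JDK, as a stand-in; it is not the XMLPull API this series discusses. The element name symbol and the sample document are assumptions made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullSketch {
    // Collects the text of every <symbol> element (hypothetical element name).
    static List<String> symbols(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            // The application owns this loop and asks for each event in turn.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "symbol".equals(reader.getLocalName())) {
                    // getElementText() reads through to the matching end tag.
                    result.add(reader.getElementText().trim());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed document", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<trades><trade><symbol>SUNW</symbol></trade>"
                   + "<trade><symbol>IBM</symbol></trade></trades>";
        System.out.println(symbols(xml));
    }
}
```

Notice that no callbacks are involved: the parsing state lives in ordinary local variables and loop structure, which is the main attraction of the pull approach.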
Note: You can download this article's example source code from Resources.
Handling SAX2
I ended Part 1 with ways to extend the event-driven handling model we'd started to develop. I mentioned that you could enhance the interface to include start elements and nesting, and promised I'd address that in Part 2. So, let's get to it!
First, here's a more complicated version of the trade history documents from Part 1:



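(Listing: revised trade-history document. The element tags were lost from this copy; only the text values survive, shown here in their original order.)
First trade: SUNW, XA, call, 100, 9, 13.47, 500
Second trade: SUNW, XA, 86.24, 500
...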

The new version uses a new tracking element present in both the stock-trade and option-trade elements. The tracking element provides information applicable to all trade types. That information includes the trade time, which we'd previously included directly in the stock and option trade information, as well as additional items to track the parties involved and the trade exchange.
In Part 1, to keep things simple, I stuck to the basics of using element content for our information. Now that you've seen the basics, in this article I extend the coverage to attributes. Using attributes for information rather than element content depends mainly on your style, and you'll often need to work with documents that combine the two approaches. I set up the trade-history document's new format with that in mind, and I included attribute values for useful information in the added tracking-element substructure. We'll look at how to handle such information in the following code examples.
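As a rough illustration of the difference, the following sketch reads both kinds of information with a plain SAX2 handler: attribute values arrive with the start tag, while element content is only complete at the end tag. The tracking and exchange element names come from this article's description, but the time attribute and the sample document are assumptions, not the article's actual format.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeSketch extends DefaultHandler {
    final Map<String, String> values = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    // qname is used throughout because the JDK's default SAXParserFactory
    // is not namespace-aware, so local names may be empty.
    @Override
    public void startElement(String uri, String lname, String qname, Attributes attrs) {
        text.setLength(0);
        // Attribute values are available as soon as the start tag is seen.
        for (int i = 0; i < attrs.getLength(); i++) {
            values.put(qname + "@" + attrs.getQName(i), attrs.getValue(i));
        }
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        // Content may arrive in several chunks, so accumulate it.
        text.append(chars, start, length);
    }

    @Override
    public void endElement(String uri, String lname, String qname) {
        // Element content is only known to be complete at the end tag.
        String content = text.toString().trim();
        if (!content.isEmpty()) {
            values.put(qname, content);
        }
        text.setLength(0);
    }

    static Map<String, String> read(String xml) {
        try {
            AttributeSketch handler = new AttributeSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical fragment mixing an attribute with element content.
        System.out.println(read("<tracking time='10:30'><exchange>XA</exchange></tracking>"));
    }
}
```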
A better interface
The last example from Part 1 employed a simple interface for our own handler classes:
public interface EndElementHandler
{
    public void endElement(String lname, String content);
}
Since that code set the handler directly and used only element content, just one simple method was necessary. Now we want to handle attributes as well as content. We also want to nest handlers -- have one handler pass off control to a separate handler for processing a substructure, like the tracking information in our revised document. Figure 1 shows how this should work when we're processing an option-trade element, for example.
Figure 1. Stackable handlers in action: tracking element within the stock-trade element
We'll need a more complex interface to handle these requirements; it'll need support for attributes and some way to set a nested handler. Here's the definition of the StructureHandler interface we'll use for this purpose:
import org.xml.sax.Attributes;

public interface StructureHandler
{
    // Start of the root element in the structure being handled.
    public void startElement(String lname, Attributes attrs);

    // End of the root element in the structure being handled.
    public void endElement(String lname, String content);

    // Start of child element in the structure being handled -- this can
    // invoke a nested handler, by passing back a non-null value.
    public StructureHandler startChild(String lname, Attributes attrs);

    // End of child element handled directly.
    public void endDirectChild(String lname, String content);

    // End of child element handled by nested handler.
    public void endStructureChild(String lname, StructureHandler handler);
}
The above interface gives us the information necessary for handling our new, more complicated document format. The startElement() method call informs us that we're beginning our handling and gives a convenient hook for any initialization code. The endElement() method call then informs us when we're finished.
The startChild() method supplies the information for a child element start tag, and gives us the choice of handling it directly (by returning null) or invoking a nested handler (by returning the handler instance). If we handle the child directly, we'll get a call to endDirectChild() on the end tag; if we invoke a nested handler, we'll get a call to endStructureChild() on the end tag. If we use the nested handler, we won't be called for anything between the start and end of the child element -- the nested handler will instead be used for any contained children.
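The call sequence is easier to follow with a concrete pair of handlers. The sketch below is only an illustration: TradeHandler, TrackingHandler, the time attribute, and the simulated call sequence in demo() are all assumptions, and the interface is repeated so the block compiles on its own. The demo() method makes by hand the calls a driver class would make while parsing.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Repeated from the article so this sketch is self-contained.
interface StructureHandler {
    void startElement(String lname, Attributes attrs);
    void endElement(String lname, String content);
    StructureHandler startChild(String lname, Attributes attrs);
    void endDirectChild(String lname, String content);
    void endStructureChild(String lname, StructureHandler handler);
}

// Hypothetical handler for the tracking substructure.
class TrackingHandler implements StructureHandler {
    String time;
    String exchange;
    public void startElement(String lname, Attributes attrs) {
        time = attrs.getValue("time"); // assumed attribute name
    }
    public void endElement(String lname, String content) {}
    public StructureHandler startChild(String lname, Attributes attrs) {
        return null; // every child is handled directly
    }
    public void endDirectChild(String lname, String content) {
        if ("exchange".equals(lname)) exchange = content;
    }
    public void endStructureChild(String lname, StructureHandler handler) {}
}

// Hypothetical handler for one trade element.
class TradeHandler implements StructureHandler {
    String symbol;
    TrackingHandler tracking;
    public void startElement(String lname, Attributes attrs) {}
    public void endElement(String lname, String content) {}
    public StructureHandler startChild(String lname, Attributes attrs) {
        // Non-null hands the whole substructure to the nested handler.
        return "tracking".equals(lname) ? new TrackingHandler() : null;
    }
    public void endDirectChild(String lname, String content) {
        if ("symbol".equals(lname)) symbol = content;
    }
    public void endStructureChild(String lname, StructureHandler handler) {
        if (handler instanceof TrackingHandler) tracking = (TrackingHandler) handler;
    }
}

class NestedHandlerSketch {
    // Simulates the calls a driver would make for one trade element.
    static TradeHandler demo() {
        TradeHandler trade = new TradeHandler();
        AttributesImpl none = new AttributesImpl();
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "time", "time", "CDATA", "10:30");

        trade.startChild("symbol", none);            // returns null: direct child
        trade.endDirectChild("symbol", "SUNW");

        StructureHandler nested = trade.startChild("tracking", attrs);
        nested.startElement("tracking", attrs);      // nested handler takes over
        nested.startChild("exchange", none);
        nested.endDirectChild("exchange", "XA");
        nested.endElement("tracking", "");
        trade.endStructureChild("tracking", nested); // parent gets the result
        return trade;
    }

    public static void main(String[] args) {
        TradeHandler trade = demo();
        System.out.println(trade.symbol + " " + trade.tracking.time
            + " " + trade.tracking.exchange);
    }
}
```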
Build a base
We want several classes to implement the StructureHandler interface; most won't actually use all the methods, though. To simplify our later code, we can define a simple base class with dummy interface-method implementations. Our other classes can then subclass that base and override only those methods they actually need to use.
Here's the base class implementation:
import org.xml.sax.Attributes;

public class StructureHandlerBase implements StructureHandler
{
    public void startElement(String lname, Attributes attrs) {}

    public void endElement(String lname, String content) {}

    // Returning null means child elements are handled directly by default.
    public StructureHandler startChild(String lname, Attributes attrs) {
        return null;
    }

    public void endDirectChild(String lname, String content) {}

    public void endStructureChild(String lname, StructureHandler handler) {}
}
Not the most sophisticated code in the world, but it saves us duplicating these dummy methods in classes that don't need them.
Drive the interface
To use the spiffy new interface, we must modify our SAX2 handler class from the examples in Part 1. The following class extends the SAX2 DefaultHandler base class and overrides the methods we use for our application. Here's the new version:
import java.util.Stack;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StructuredDocumentHandler extends DefaultHandler
{
    /** Structure handler context stack. */
    protected Stack<HandlerContext> m_contextStack;

    /** Current nested element depth. */
    protected int m_nestingDepth;

    /** Depth at which to pop handler context. */
    protected int m_contextDepth;

    /** Active structure handler. */
    protected StructureHandler m_handler;

    /** Character data collection buffer. */
    protected StringBuffer m_contentBuffer = new StringBuffer();

    public StructuredDocumentHandler(StructureHandler handler) {
        // set base handler for document
        m_handler = handler;
        m_contextDepth = -1;
        m_contextStack = new Stack<HandlerContext>();
    }

    public void startElement(String uri, String lname, String qname,
        Attributes attributes) {

        // Initialize content and check handler.
        m_contentBuffer.setLength(0);
        StructureHandler next = m_handler.startChild(lname, attributes);
        if (next != null) {

            // Save current handler context.
            HandlerContext context =
                new HandlerContext(m_contextDepth, m_handler);
            m_contextStack.push(context);

            // Change to new nested handler.
            m_handler = next;
            m_contextDepth = m_nestingDepth;
            next.startElement(lname, attributes);
        }

        // Bump the nested element count.
        m_nestingDepth++;
    }

    public void characters(char[] chars, int start, int length) {
        m_contentBuffer.append(chars, start, length);
    }

    public void endElement(String uri, String lname, String qname) {

        // Clean up content and check if context end.
        String content = m_contentBuffer.toString().trim();
        m_nestingDepth--;
        if (m_nestingDepth == m_contextDepth) {

            // Report end element for current handler.
            m_handler.endElement(lname, content);

            // Restore higher level handler context.
            HandlerContext context = m_contextStack.pop();
            StructureHandler last = m_handler;
            m_handler = context.getHandler();
            m_contextDepth = context.getDepth();

            // Report child structure end to higher level handler.
            m_handler.endStructureChild(lname, last);

        } else {

            // Report end of child element.
            m_handler.endDirectChild(lname, content);
        }

        // Discard the collected text so a child's content can't be
        // reported again as part of its parent's content.
        m_contentBuffer.setLength(0);
    }

    protected class HandlerContext
    {
        private final int m_depth;
        private final StructureHandler m_handler;

        protected HandlerContext(int depth, StructureHandler handler) {
            m_depth = depth;
            m_handler = handler;
        }

        protected int getDepth() {
            return m_depth;
        }

        protected StructureHandler getHandler() {
            return m_handler;
        }
    }
}
StructuredDocumentHandler works with a stack of StructureHandler instances. m_handler always references this stack's current instance, which stays valid as long as we stay within the top-level element where it was first supplied. m_contextDepth gives the top-level element's depth for the current handler, and m_nestingDepth tracks our current depth within the document structure -- how many start tags we've seen without seeing the corresponding end tags.
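The depth bookkeeping can be seen in isolation with a stripped-down sketch. This is not the article's driver class: it keeps only the two counters and a stack of saved depths, opens a new context whenever it sees an element named tracking (an assumption chosen for illustration), and logs where each context begins and ends.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DepthSketch extends DefaultHandler {
    final List<String> log = new ArrayList<>();
    private final Deque<Integer> savedDepths = new ArrayDeque<>();
    private int nestingDepth;      // start tags seen without matching end tags
    private int contextDepth = -1; // depth of the element that opened the current context

    // qname is used because the JDK's default SAXParserFactory
    // is not namespace-aware, so local names may be empty.
    @Override
    public void startElement(String uri, String lname, String qname, Attributes attrs) {
        if ("tracking".equals(qname)) {
            // Save the outer context, then record where the new one starts.
            savedDepths.push(contextDepth);
            contextDepth = nestingDepth;
            log.add("enter " + qname + " at depth " + nestingDepth);
        }
        nestingDepth++;
    }

    @Override
    public void endElement(String uri, String lname, String qname) {
        nestingDepth--;
        // Back at the depth where the context was opened: restore the outer one.
        if (nestingDepth == contextDepth) {
            log.add("leave " + qname + " at depth " + nestingDepth);
            contextDepth = savedDepths.pop();
        }
    }

    static List<String> trace(String xml) {
        try {
            DepthSketch handler = new DepthSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.log;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical fragment with one nested structure.
        String xml = "<trade><tracking><exchange>XA</exchange></tracking></trade>";
        trace(xml).forEach(System.out::println);
    }
}
```

The initial context depth of -1 can never equal the nesting depth, so the outermost context is never popped, which matches the constructor of StructuredDocumentHandler above.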


 
