Java makes the most of XML's extensibility - JavaWorld, July 1999
XML and Java: A potent partnership, Part 2
By Todd Sundsted
Learn how to use Java to build applications that handle XML's extensibility
Last month I presented my case for the place of XML in the enterprise (for last month's column, see Resources). I intentionally tried to look beyond the publishing aspects of XML in order to focus on application integration and data exchange issues. I demonstrated how easy it is to parse and validate XML using commonly available Java tools, and I compared those methods to more traditional ad hoc methods.
How-To XML & Java: Read the whole series!
Part 1. Why the XML-Java combo has captured the minds of enterprise application developers
Part 2. Use Java to build applications that handle XML's extensibility
Part 3. Integrate Java and JavaScript, two popular programming languages
Part 4. Use Java, laced with JavaScript, to push XML's flexibility into new dimensions
This month, I wish to carry the thread further -- parsing and validating are fine as far as they go, but they don't go very far. The problem at hand typically involves doing something with the parsed information. But what if you don't understand the tags used to generate the information? Come walk with me a bit farther along the border between Java and XML, and I'll show you how to use Java to solve that problem, too.
XML tags: What to do?
Let's proceed straight to the heart of the matter. The feature of XML we need to address is its ability to define new tags. A tag in XML says something about the meaning of the content (and about the other tags) it associates with. Because the set of tags in XML is open (unlike HTML, where the set is closed), it's impossible to build an application that handles the entire tag set right out of the box. This introduces a bit of uncertainty into the process. What exactly do you do with tags you don't understand?
Applications can ignore novel tags. This was the approach typically taken by browser vendors during the height of the browser wars. Leading browser vendors merrily defined new tags with each release of their product, and each browser distribution quietly ignored those tags it didn't understand. This approach is safe but not very satisfactory.
Organizations can standardize on a set of tags. This approach cleverly sidesteps the entire problem. You define a set of standard tags and a document type definition (DTD), and then reject any XML that doesn't fit the mold. This is actually the right solution for many problems. Sales orders, for example, fit a well-defined pattern. Nothing is gained by allowing e-commerce partners to define new tags (at least without constraint -- the case could be made for the applicability of certain well-constrained tag definitions, such as macros). Unfortunately, not all applications -- XML browsers, for example -- fit within this box.
Applications can try to figure out what to do with novel tags. Browsers and content-presentation tools as well as other general-purpose XML tools must behave correctly in the presence of novel but valid tags. There are several ways to solve this problem. The Extensible Stylesheet Language (XSL) is one such attempt. XSL provides a translation toolkit, which allows you to define a mapping or translation from a tag set you don't understand to a tag set you do understand (for example, XML to HTML). This solution, however, has its own limitations.
You can build a new framework. While each of the solutions above has its place, we'll explore another solution altogether. Our solution calls for enabling the browser or XML tool to look for and download code designed to handle the novel tags and then integrate that code into the application. To do this, we'll build a new framework.
Before we can get down to the business of building our solution, we need to understand a little more about XML. In particular, we need to understand how to manipulate XML within an application. We need to understand the Document Object Model (DOM).
The DOM
The DOM is a platform-independent, programming-language-neutral API that allows programs to access and modify the content and the structure of XML documents from within applications.
At its core, the DOM defines a family of types that represent all the objects that make up an XML document: elements, attributes, entity references, comments, textual data, processing instructions, and the rest. (I use the word object throughout this article to refer loosely to the building blocks of an XML document.) The DOM, originally envisioned as living inside a browser, has turned out to have a much broader impact. It is also worth noting that the DOM isn't specific to XML. It applies equally well to HTML.
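To make that family of types concrete, here is a minimal sketch that parses a small document and labels each node it finds. It assumes the JAXP parser bundled with current JDKs (javax.xml.parsers), which postdates this article; the class name DomNodeTypes and the sample markup are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomNodeTypes {

    // Parses an XML string into a DOM Document (assumption: the JDK's JAXP parser).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Maps a DOM node-type constant to a readable label.
    static String label(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: return "document";
            case Node.ELEMENT_NODE: return "element";
            case Node.ATTRIBUTE_NODE: return "attribute";
            case Node.TEXT_NODE: return "text";
            case Node.COMMENT_NODE: return "comment";
            case Node.PROCESSING_INSTRUCTION_NODE: return "processing-instruction";
            default: return "other";
        }
    }

    // Walks the tree depth first, recording "label:name" for every node.
    static void walk(Node node, List<String> out) {
        out.add(label(node) + ":" + node.getNodeName());
        NamedNodeMap attributes = node.getAttributes(); // null unless node is an element
        if (attributes != null) {
            for (int i = 0; i < attributes.getLength(); i++) {
                Node attribute = attributes.item(i);
                out.add(label(attribute) + ":" + attribute.getNodeName());
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample containing several kinds of DOM object.
        String xml = "<note id=\"1\"><!-- a comment --><?target data?>Hello</note>";
        List<String> seen = new ArrayList<>();
        walk(parse(xml), seen);
        for (String line : seen) {
            System.out.println(line);
        }
    }
}
```

Every object in the document, not only the elements, shows up as a Node, which is what lets a general-purpose tool traverse markup it has never seen before.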
To comprehend the DOM you need to remember that a key characteristic of XML is the notion that many documents can be represented as a hierarchical structure of content and markup. The code below, for example, represents a valid XML document:
This is the title.
This is a headline.
This is the body.
And this is more of the body.
I don't want to provide you with an XML primer, but I do want to make one point clear: A key requirement of XML (and HTML) is that tags must nest -- they may not overlap. Therefore, markup such as <b><i>text</i></b> is well-formed XML, but <b><i>text</b></i> is not. As a consequence, well-formed XML documents map cleanly to a tree-like data structure. We can therefore transform the document above into the tree in Figure 1, below.
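The nesting rule is easy to check with a parser. The sketch below assumes the JAXP parser that ships with current JDKs; the class name NestingCheck and the tags doc, b, and i are illustrative choices, not taken from the article's listing. It shows properly nested markup being parsed into a tree and overlapping markup being rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NestingCheck {

    // Returns the parsed document, or null if the markup is not well formed.
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler throws on fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return null; // overlapping or unclosed tags end up here
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Renders the element tree as an indented outline, one element per line.
    static void outline(Node node, int depth, StringBuilder out) {
        int childDepth = depth;
        if (node instanceof Element) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
            childDepth = depth + 1;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            outline(children.item(i), childDepth, out);
        }
    }

    public static void main(String[] args) {
        String nested = "<doc><b><i>text</i></b></doc>";      // tags nest
        String overlapping = "<doc><b><i>text</b></i></doc>"; // tags overlap
        System.out.println("nested parses: " + (parseOrNull(nested) != null));
        System.out.println("overlapping parses: " + (parseOrNull(overlapping) != null));
        StringBuilder tree = new StringBuilder();
        outline(parseOrNull(nested), 0, tree);
        System.out.print(tree);
    }
}
```

Because the parser only ever hands back documents whose tags nest, code that consumes the result can treat it as a tree without any further checks.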
Figure 1. An XML tree
The DOM provides the mechanism we need to dynamically interact with the elements and content in an XML document. Consider the tree in Figure 1. I have mapped the tags (elements, in DOM parlance) that make up our initial XML document to the nodes of the tree in Figure 1.
Each tag has meaning within the context of the enclosing tags and the document as a whole. Consider the tags again. These tags clearly define presentation-related elements within a browser. As such, they have well-understood behavior associated with them. We expect the browser to know how to draw them within the browser window. The code that implements the behavior is present within the browser.
Now consider the code below, which represents an XML document with novel tags:
friend
friend
GET RICH QUICK!!!
Dear Friend,
PLEASE READ THIS!!! It's easy to make money on the Internet. Just
follow this proven three-step plan.
This is another small piece of XML. We clearly don't expect a browser to know what to do with these tags. In order to deal with them, the browser (or other general-purpose XML appliance) must be modified.
Figure 2, below, illustrates the general framework we'll employ to pull this off.
Figure 2. Framework for modifying the browser
In this example, each element of the DOM hierarchy on the left side maps to an element of the hierarchy on the right side. The DOM elements on the left represent the structure of the document. The elements on the right side represent the behavior of the structure elements. The behavior elements are arranged in a hierarchy as well, so that they can interact with each other in a manner that reflects the organization of the DOM model.
The Hook class
The building block of the behavior hierarchy is the Hook class. This class provides a behavioral "hook" into a behaviorless DOM tree. Here's the code for the Hook class:
import java.util.Enumeration;
import java.util.Vector;

import org.w3c.dom.Element;

// Base class for behavior nodes. Each Hook mirrors one DOM element.
public class Hook {

    private Hook _hookParent = null;
    private Vector _vectorChildren = new Vector();
    private Element _element = null;

    public void setElement(Element element) {
        _element = element;
    }

    protected Element getElement() {
        return _element;
    }

    public void setParent(Hook hookParent) {
        _hookParent = hookParent;
    }

    protected Hook getParent() {
        return _hookParent;
    }

    public void addChild(Hook hookChild) {
        _vectorChildren.addElement(hookChild);
    }

    protected Enumeration getChildren() {
        return _vectorChildren.elements();
    }

    // Visits this node and then its children, in document order. Each child
    // receives the object returned by doOnNodeStart(); the values the
    // children return are ignored.
    public Object build(Object object) {
        object = doOnNodeStart(object);
        Enumeration enumeration = _vectorChildren.elements();
        while (enumeration.hasMoreElements()) {
            Hook hook = (Hook)enumeration.nextElement();
            hook.build(object);
        }
        object = doOnNodeEnd(object);
        return object;
    }

    // Called before the children are visited. Subclasses override this.
    public Object doOnNodeStart(Object object) {
        return object;
    }

    // Called after the children are visited. Subclasses override this.
    public Object doOnNodeEnd(Object object) {
        return object;
    }
}
Hook is meant to be the supertype of a family of related subtypes. The Hook class itself doesn't implement any behavior. In fact, it doesn't implement any methods other than those needed to link parents and children. Families of subtypes should build on these primitives and define a collection of classes based around a common behavioral architecture.
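As a sketch of what such a subtype might look like, the example below defines a hypothetical TraceHook that records when each node starts and ends. TraceHook, HookDemo, and the tag names are invented for illustration and are not part of the article's code. A condensed copy of Hook (with generics added) is repeated so the example compiles on its own, and the empty DOM document comes from the JDK's JAXP factory, which is an assumption about the environment.

```java
import java.util.Enumeration;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class HookDemo {

    // Creates a TraceHook bound to a new element with the given tag name.
    static TraceHook hookFor(Document document, String tagName) {
        TraceHook hook = new TraceHook();
        hook.setElement(document.createElement(tagName));
        return hook;
    }

    // Wires up a three-node behavior tree by hand and runs build() over it.
    static String trace() {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        TraceHook letter = hookFor(document, "letter");
        TraceHook greeting = hookFor(document, "greeting");
        TraceHook body = hookFor(document, "body");
        greeting.setParent(letter);
        letter.addChild(greeting);
        body.setParent(letter);
        letter.addChild(body);
        return letter.build(new StringBuilder()).toString();
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}

// Hypothetical subtype: appends a start and an end marker for its element.
class TraceHook extends Hook {
    @Override
    public Object doOnNodeStart(Object object) {
        ((StringBuilder) object).append('<').append(getElement().getTagName()).append('>');
        return object;
    }

    @Override
    public Object doOnNodeEnd(Object object) {
        ((StringBuilder) object).append("</").append(getElement().getTagName()).append('>');
        return object;
    }
}

// Condensed copy of the article's Hook class, with generics added.
class Hook {
    private Hook _hookParent = null;
    private final Vector<Hook> _vectorChildren = new Vector<>();
    private Element _element = null;

    public void setElement(Element element) { _element = element; }
    protected Element getElement() { return _element; }
    public void setParent(Hook hookParent) { _hookParent = hookParent; }
    protected Hook getParent() { return _hookParent; }
    public void addChild(Hook hookChild) { _vectorChildren.addElement(hookChild); }

    public Object build(Object object) {
        object = doOnNodeStart(object);
        Enumeration<Hook> children = _vectorChildren.elements();
        while (children.hasMoreElements()) {
            children.nextElement().build(object);
        }
        return doOnNodeEnd(object);
    }

    public Object doOnNodeStart(Object object) { return object; }
    public Object doOnNodeEnd(Object object) { return object; }
}
```

The subtype touches only the two callback methods. The traversal order, start, then children, then end, comes entirely from build() in the base class.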
The Filter class
The Filter class's sole method recursively builds the behavior tree from the DOM tree.
Let's take a look at the Filter class:
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Filter {

    // Walks the node list recursively and attaches one Hook per Element
    // to hookParent. Non-element nodes contribute no Hook of their own.
    public void filter(Hook hookParent, NodeList nodelist) {
        if (nodelist != null) {
            for (int i = 0; i < nodelist.getLength(); i++) {
                Node node = nodelist.item(i);
                // This handles what appears to be a bug in at least one
                // vendor's implementation (IBM's xml4j v. 2.0.6) of the
                // DOM. This bug seems to affect text nodes: the reported
                // length is nonzero but the returned list contains no valid
                // elements.
                if (node == null) break;
                if (node instanceof Element) {
                    Element element = (Element)node;
                    Hook hook = HookManager.createHook(element);
                    filter(hook, element.getChildNodes());
                    hook.setParent(hookParent);
                    hookParent.addChild(hook);
                } else {
                    filter(hookParent, node.getChildNodes());
                }
            }
        }
    }
}
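To see the filter at work end to end, the sketch below parses a small document, lets Filter mirror its elements as hooks, and then runs build() over the result. The HookManager shown here is a hypothetical stand-in that always returns a tracing hook and calls setElement() itself; it is not the article's HookManager. TraceHook, FilterDemo, and the sample markup are likewise invented, the Hook and Filter classes are condensed copies so the example compiles on its own, and the parser is assumed to be the JDK's JAXP implementation.

```java
import java.io.StringReader;
import java.util.Enumeration;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FilterDemo {

    // Parses the XML, mirrors its elements as hooks, and returns the build trace.
    static String trace(String xml) {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        Hook root = new Hook(); // behaviorless root that anchors the hook tree
        new Filter().filter(root, document.getChildNodes());
        return root.build(new StringBuilder()).toString();
    }

    public static void main(String[] args) {
        System.out.println(trace("<letter><greeting>Dear Friend,</greeting>"
                + "<body>Read <b>this</b>.</body></letter>"));
    }
}

// Hypothetical stand-in for the article's HookManager: always a TraceHook.
class HookManager {
    static Hook createHook(Element element) {
        Hook hook = new TraceHook();
        hook.setElement(element); // assumption: the manager binds the element
        return hook;
    }
}

// Hypothetical hook that records start and end markers for its element.
class TraceHook extends Hook {
    @Override
    public Object doOnNodeStart(Object object) {
        ((StringBuilder) object).append('<').append(getElement().getTagName()).append('>');
        return object;
    }

    @Override
    public Object doOnNodeEnd(Object object) {
        ((StringBuilder) object).append("</").append(getElement().getTagName()).append('>');
        return object;
    }
}

// Condensed copy of the article's Filter class.
class Filter {
    public void filter(Hook hookParent, NodeList nodelist) {
        if (nodelist == null) return;
        for (int i = 0; i < nodelist.getLength(); i++) {
            Node node = nodelist.item(i);
            if (node == null) break;
            if (node instanceof Element) {
                Element element = (Element) node;
                Hook hook = HookManager.createHook(element);
                filter(hook, element.getChildNodes());
                hook.setParent(hookParent);
                hookParent.addChild(hook);
            } else {
                filter(hookParent, node.getChildNodes());
            }
        }
    }
}

// Condensed copy of the article's Hook class, with generics added.
class Hook {
    private Hook _hookParent = null;
    private final Vector<Hook> _vectorChildren = new Vector<>();
    private Element _element = null;

    public void setElement(Element element) { _element = element; }
    protected Element getElement() { return _element; }
    public void setParent(Hook hookParent) { _hookParent = hookParent; }
    public void addChild(Hook hookChild) { _vectorChildren.addElement(hookChild); }

    public Object build(Object object) {
        object = doOnNodeStart(object);
        Enumeration<Hook> children = _vectorChildren.elements();
        while (children.hasMoreElements()) {
            children.nextElement().build(object);
        }
        return doOnNodeEnd(object);
    }

    public Object doOnNodeStart(Object object) { return object; }
    public Object doOnNodeEnd(Object object) { return object; }
}
```

Text nodes produce no hooks of their own, so the trace contains only the element structure. In a real application the manager would choose a different Hook subtype for each tag rather than one tracing class.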
The HookManager class
The HookManager c