Transparently cache XSL transformations with JAXP

By Alexey Valikov
Boost performance and retain usability by implementing implicit caching inside transformer factories
No doubt, XSLT (Extensible Stylesheet Language Transformations) is a powerful technology that has many applications in the XML world. In particular, numerous Web developers can take advantage of XSLT at the presentation layer to gain convenience and flexibility. However, the price of these advantages is higher memory and CPU load, which makes developers more attentive to optimization and caching techniques when using XSLT. Caching is even more important in Web environments, where numerous threads share stylesheets.
In these cases, proper transformation caching proves vital for performance. The usual recommendation when using the Java API for XML Processing (JAXP) is to load a transformation into a Templates object and then use this object to produce a Transformer, rather than instantiate a Transformer object directly from the factory. That way, the Templates object can be reused to produce more transformers later, saving the time spent on stylesheet parsing and compilation. In "Top Ten Java and XSLT Tips," Eric Burke gives the following code in Tip 1:
Source xsltSource = new StreamSource(xsltFile);
TransformerFactory transFact = TransformerFactory.newInstance();
Templates cachedXSLT = transFact.newTemplates(xsltSource);
Transformer trans = cachedXSLT.newTransformer();
In this example, the transformation from xsltFile is first loaded into the cachedXSLT Templates object, which is then used to create a new transformer object, trans. The advantage is that later, when we need yet another transformer object, the parsing and compilation phases can be skipped:
Transformer anotherTrans = cachedXSLT.newTransformer();
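The reuse pattern above can be tried end to end with the following minimal sketch. It assumes the XSLT processor bundled with the JDK; the class name TemplatesReuseDemo, the inline stylesheet, and the helper methods compile() and greet() are hypothetical names chosen for illustration and are not part of the original article.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TemplatesReuseDemo {
    // Hypothetical stylesheet: prints a greeting for the <name> document element.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>Hello, <xsl:value-of select='/name'/>!</xsl:template>"
        + "</xsl:stylesheet>";

    // Parse and compile the stylesheet once; the Templates object is reusable.
    static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSLT)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("Stylesheet failed to compile", e);
        }
    }

    // Each call creates a fresh Transformer from the already compiled Templates.
    static String greet(Templates templates, String name) {
        try {
            Transformer transformer = templates.newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader("<name>" + name + "</name>")),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Calling compile() once and then greet(...) several times with the same Templates object exercises the point made above: only the first step parses the stylesheet, while every later call just instantiates a lightweight Transformer.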
Although this technique improves performance (especially when the same stylesheets are used repeatedly, as in Web applications), it is, honestly, not convenient for the developer. The reason is that, apart from the Templates-based transformer instantiation, you must also track the date of the last stylesheet modification, reload outdated transformations, provide safe and efficient multithreaded access to the stylesheet cache, and take care of many other small details. Even a natural move, encapsulating all the required functionality into a standalone transformer cache implementation, will not save a developer from third-party modules that use standard JAXP routines without any caching. A good example of such a module is the JSTL x:transform tag: its current implementation in the org.apache.taglibs.standard.tag.common.xml.TransformSupport and org.apache.taglibs.standard.tag.el.xml.TransformTag classes directly uses the TransformerFactory's newTransformer(...) method. Obviously, x:transform cannot take advantage of any external caching implementation.
There is, however, a simple and elegant solution to this problem. Since JAXP allows us to replace the TransformerFactory implementation in use, why not simply write a factory with intrinsic caching capabilities?
This idea is not difficult to implement. We can extend any suitable TransformerFactory implementation (I use Michael Kay's Saxon 7.3) and override the parent's newTransformer(...) method so that transformations loaded from file-based stream sources are cached, and are returned from the cache as long as the file has not been modified since the last load. The new version of the newTransformer(...) method looks like the following:
public Transformer newTransformer(final Source source)
    throws TransformerConfigurationException
{
    // Check that the source is a StreamSource with a system ID
    // (sources built from a Reader or InputStream may have none)
    if (source instanceof StreamSource && source.getSystemId() != null)
    {
        try
        {
            // Create URI of the source
            final URI uri = new URI(source.getSystemId());
            // If URI points to a file, load transformer from the file
            // (or from the cache)
            if ("file".equalsIgnoreCase(uri.getScheme()))
                return newTransformer(new File(uri));
        }
        catch (URISyntaxException urise)
        {
            throw new TransformerConfigurationException(urise);
        }
    }
    return super.newTransformer(source);
}
As you can see, if the transformer's source is not a stream source or does not point to a file, the parent implementation of newTransformer(...) creates the transformer. But if the source is a file-based stream source, we have the opportunity to implement more intelligent transformation loading with the help of a cache.
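To see which sources take the cached path, the check can be isolated in a small helper. The sketch below is illustrative only: the class SourceKindDemo and the method isFileBased() are hypothetical names, and the null check on the system ID is an assumption added here because a StreamSource built from a Reader or InputStream may carry no system ID at all.

```java
import java.net.URI;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

final class SourceKindDemo {
    // Returns true when the source is a StreamSource whose system ID is a file: URI.
    static boolean isFileBased(Source source) {
        if (!(source instanceof StreamSource) || source.getSystemId() == null) {
            return false;
        }
        try {
            // A StreamSource created from a java.io.File reports a file: URI here.
            return "file".equalsIgnoreCase(
                URI.create(source.getSystemId()).getScheme());
        } catch (IllegalArgumentException malformed) {
            // A system ID that is not a valid URI cannot be a file URI.
            return false;
        }
    }
}
```

A StreamSource constructed from a java.io.File qualifies, whereas one constructed from a Reader or from an http: system ID falls through to the parent factory.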
The caching algorithm for file-based stylesheets is quite simple: for a given file, we check whether a Templates object for the same absolute file name is already stored in the cache. If it is not, we create and cache a new Templates object for this file. If an entry is already in the cache, we check whether the file was updated since the Templates object was last loaded by comparing the file's last modification time with the timestamp stored in the cache entry. If the file was updated, the Templates object must be reloaded; otherwise, it can be taken from the cache. Finally, with the Templates object (loaded from the cache or from the disk, depending on the situation), we simply produce a new transformer. The following method implements this algorithm:
protected Transformer newTransformer(final File file)
    throws TransformerConfigurationException
{
    // Search the cache for the templates entry
    TemplatesCacheEntry templatesCacheEntry = read(file.getAbsolutePath());
    // If entry is found
    if (templatesCacheEntry != null)
    {
        // Check timestamp of modification
        if (templatesCacheEntry.lastModified
            < templatesCacheEntry.templatesFile.lastModified())
            // Clear entry, if it is obsolete
            templatesCacheEntry = null;
    }
    // If no entry is found or the entry was obsolete
    if (templatesCacheEntry == null)
    {
        logger.debug("Loading transformation [" + file.getAbsolutePath() + "].");
        // If this file does not exist, throw the exception
        if (!file.exists())
        {
            throw new TransformerConfigurationException(
                "Requested transformation ["
                + file.getAbsolutePath()
                + "] does not exist.");
        }
        // Create new cache entry
        templatesCacheEntry =
            new TemplatesCacheEntry(newTemplates(new StreamSource(file)), file);
        // Save this entry to the cache
        write(file.getAbsolutePath(), templatesCacheEntry);
    }
    else
    {
        logger.debug("Using cached transformation [" + file.getAbsolutePath() + "].");
    }
    return templatesCacheEntry.templates.newTransformer();
}
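The method above relies on a TemplatesCacheEntry class that this excerpt does not show. A plausible shape, inferred only from how the method uses it (a two-argument constructor and the fields templates, templatesFile, and lastModified), might look like the sketch below. This is a hypothetical reconstruction, not the author's class, and the isStale() helper is an addition for illustration.

```java
import java.io.File;
import javax.xml.transform.Templates;

// Hypothetical reconstruction of the cache entry used by newTransformer(File).
final class TemplatesCacheEntry {
    // Compiled stylesheet held in the cache.
    final Templates templates;
    // File the stylesheet was loaded from.
    final File templatesFile;
    // Modification time of the file when the entry was created (ms since epoch).
    final long lastModified;

    TemplatesCacheEntry(final Templates templates, final File templatesFile) {
        this.templates = templates;
        this.templatesFile = templatesFile;
        // Snapshot the timestamp so later changes on disk can be detected.
        this.lastModified = templatesFile.lastModified();
    }

    // Illustrative helper (an assumption): true when the file on disk
    // is newer than the snapshot taken at load time.
    boolean isStale() {
        return lastModified < templatesFile.lastModified();
    }
}
```

The staleness test mirrors the comparison in newTransformer(File): the entry is obsolete as soon as the file's current modification time exceeds the stored one.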
However, we must consider another issue: thread safety. Since many concurrent threads share the cache, we must take certain precautions to make read (retrieving cache entries from the cache) and write (saving newly loaded stylesheets into the cache) operations safe. In the code above, read(...) and write(...) must not cause conflicts, even when running in several threads in parallel.
Although Java offers advanced synchronization services, the problem here is not synchronization as such, but the balance between synchronization and performance. The simplest solution is full synchronization: we declare the whole newTransformer(...) method synchronized and either use a synchronized container to store the cache entries or access the cache in synchronized blocks. All of this, however, proves inefficient. As long as only a limited number of stylesheets exist and they do not change often, the transformation cache will be read far more frequently than it is written. Full synchronization blocks concurrent readers, which, first, is not always necessary and, second, may lead to a bottleneck.
On the other hand, using unsynchronized containers, like HashMap, to store cache entries is dangerous. Without precautions, simultaneous reading and writing can corrupt the map or return inconsistent results, leading to system instability.
What we basically have here is a classic readers/writers problem: for a given resource, there may be either one writer or several readers at any moment in time. This classic problem has a classic solution, which we will take from Doug Lea's Concurrent Programming in Java. The idea is to track execution state by counting active or waiting reading and writing threads, and to allow reading only when no active writers exist and writing only when neither active readers nor active writers exist.
To do that, we extract access to the cache into two methods, read() and write():
protected TemplatesCacheEntry read(final String absolutePath)
{
    beforeRead();
    try
    {
        return (TemplatesCacheEntry) templatesCache.get(absolutePath);
    }
    finally
    {
        // Always release the read slot, even if the lookup throws
        afterRead();
    }
}
protected void write(final String absolutePath,
    final TemplatesCacheEntry templatesCacheEntry)
{
    beforeWrite();
    try
    {
        templatesCache.put(absolutePath, templatesCacheEntry);
    }
    finally
    {
        // Always release the write slot, even if the update throws
        afterWrite();
    }
}
Two pairs of before/after methods, one for reading and one for writing, perform thread synchronization, ensuring safe but efficient access to the cache:
protected synchronized void beforeRead()
{
    // Readers wait only while a writer is active
    while (activeWriters > 0)
        try
        {
            wait();
        }
        catch (InterruptedException iex)
        {
            // Ignored: the loop re-checks the condition
        }
    ++activeReaders;
}
protected synchronized void afterRead()
{
    --activeReaders;
    notifyAll();
}
protected synchronized void beforeWrite()
{
    // A writer waits until there are no active readers or writers
    while (activeReaders > 0 || activeWriters > 0)
        try
        {
            wait();
        }
        catch (InterruptedException iex)
        {
            // Ignored: the loop re-checks the condition
        }
    ++activeWriters;
}
protected synchronized void afterWrite()
{
    --activeWriters;
    notifyAll();
}
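On Java 5 and later, the same readers/writers policy is also available from the standard library through java.util.concurrent.locks.ReentrantReadWriteLock. The following sketch swaps that lock in for the hand-written counters; it is an alternative shown for comparison, not the article's implementation, and the generic class ReadWriteLockedCache is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical cache wrapper: many concurrent readers, one exclusive writer.
final class ReadWriteLockedCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Several threads may hold the read lock at the same time.
    V read(K key) {
        lock.readLock().lock();
        try {
            return entries.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for active readers and writers.
    void write(K key, V value) {
        lock.writeLock().lock();
        try {
            entries.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The read() and write() methods play the same roles as in the listing above, so a factory could keep its caching logic unchanged and delegate only the locking to the library class.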
With cache access guarded by the before/after methods, we finally have a transformer factory that transparently implements efficient caching of file-based stylesheets (you can download the full source code from Resources). The only thing left is to make our factory available through standard JAXP routines.
Several approaches are available for making the TransformerFactory.newInstance() method return an instance of a custom transformer factory implementation. The most straightforward way is to specify the factory's class name in the javax.xml.transform.Tran


 


Copyright © 2006. All rights reserved.