
  Tutorial: Java decompilers compared - JavaWorld - July 1997

By Dave Dyer
Our detailed examples of how 3 top decompilers handle an extensive test suite will help you determine which, if any, meet your needs
The object of a Java decompiler is to convert Java class files into Java source code. In the chaotic world of software development there are many reasons, legitimate and otherwise, to wish for such a tool. Decompilers can save the day when you have the binary for your own code, but have misplaced or otherwise lost the corresponding source code. On the other hand, decompilers are the prized components of any good software piracy kit. Most often, however, decompilers help programmers clarify poor documentation (one decompiled function is worth a thousand words) or provide a means for creating not-yet-written documentation. When was the last time you thought the documentation for any software was complete and correct?
In any case, the transparent and information-rich structure of Java class files -- a feature that makes Java's dynamic linking much better than previous models -- also makes such tools particularly easy to build. In fact, there is an arms race brewing between decompilers and so-called obfuscators, which profess to provide Java code some measure of protection from decompilers. In essence, obfuscators remove all non-essential symbolic information from your class files and, optionally, replace it with fake symbolic information designed to confuse the decompiler. Crema, the companion obfuscator to the Mocha decompiler, was examined in detail in the December issue of JavaWorld. (See the Resources section at the end of this column for a link to this article and to several obfuscator products.)
Product overview
I'll be reviewing three Java decompilers in this article: DejaVu, Mocha, and WingDis. These products are the only commercial decompilers I'm aware of, but surely there are more to come.
DejaVu, distributed as part of Innovative Software's OEW for Java development environment, appears to be completely independent of it. DejaVu is available on a trial basis for free.
Mocha, the first and most widely known decompiler, is free. Although Mocha's creator, Hanpeter van Vliet, met with an untimely demise, you can still obtain a copy of the program free of charge on the Web. An official descendant of Mocha will probably be commercially available before long.
WingDis version 2.06, a product from WingSoft, is available free as a crippled demo version and as a time-limited fully capable trial version. The full version costs $29.95.
See the Resources section at the end of this article for more information on where to find each of these products.
Each of these tools is 100% Pure Java, so the essential distribution consists of a Java class library and instructions to invoke it. They're all a little quirky to set up and use, a characteristic shared by many standalone Java applications.
These are all command-line-oriented tools, so the most practical way to invoke them is to embed the detailed class path and other invocation instructions in a command file. Unfortunately, there is no standardized way to do this; the details vary depending on your choice of operating system. However, once you've conquered the setup, the decompilers easily produce output that is virtually compiler-ready.
Testing method
I chose a small utility library, consisting of about 15 classes, as my standard test set. I compiled the library using JDK 1.02, with optimization (with the -O switch) and without debugger information (without the -g switch). These settings correspond to how most Java code would actually be delivered. I decompiled the class files with each of the three decompilers, then manually edited the decompiled sources until they could be successfully recompiled. I then decompiled these three sets of "second-generation" binaries with each of the three decompilers, yielding nine sets of "third-generation" sources. Once I had my data, I manually compared various pairs of sources, looking for inconsistencies that might indicate incorrectly decompiled code.
Keep in mind that in performing this set of tests I had the luxury of referring to the original sources at any time, and the double luxury of having written these sources myself -- two advantages not generally available to anyone using a decompiler in earnest.
I organized decompilation errors into the categories described below. I've based the class error types 1 through 6 (class 1 being the least offensive) on my assumption that easy-to-spot and easy-to-fix errors are less significant than hidden or hard-to-fix errors. In the last portion of this article I'll examine detailed code examples of these error types.
Class 1 errors
Description: Errors flagged by the compiler that are easily fixed
Examples: Boolean variable incorrectly identified as an int; missing, but trivial, type cast
Class 2 errors
Description: Errors flagged by the compiler that are not easily fixed
Example: Generating code containing goto
Class 3 errors
Description: Errors that create ugly and incomprehensible, but correct code
Examples: Unreconstructed flow control; unreconstructed use of + for string appends
Class 4 errors
Description: Errors that cause subtle misprints and create subtly incorrect code
Examples: Failing to use \ to escape characters in string constants; misprinting character constants
Class 5 errors
Description: Errors that cause total failure
Example: Crashing without producing output
Class 6 errors
Description: Errors not flagged by the compiler that result in severely damaged semantics
Example: Misuse or non-use of this, and other patently incorrect code
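To make the class 4 example concrete, here is a minimal sketch of the escaping step a decompiler needs when it prints a string constant back out as Java source. The class and method names (LiteralPrinter, toJavaLiteral) are hypothetical and are not taken from any of the reviewed products; the sketch only illustrates the kind of logic that, when missing, produces subtly wrong output.

```java
// Hypothetical helper, not code from DejaVu, Mocha, or WingDis.
final class LiteralPrinter {
    // Renders a string value as a Java source literal, escaping the
    // characters that would otherwise change or break the literal.
    static String toJavaLiteral(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20 || c > 0x7e) {
                        // Other control and non-ASCII characters become Unicode escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        // The value contains a backslash, a tab, and double quotes.
        System.out.println(toJavaLiteral("C:\\temp\t\"x\""));
    }
}
```

A decompiler that skips this step can emit source that still compiles but contains a different string than the original class file did, which is why such misprints are ranked above the compiler-flagged errors.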
The following table shows you which decompiler is guilty of which type of error.
Decompiler errors by type
DejaVu version 1.0
Class 1 errors: Several
Class 2 errors: No
Class 3 errors: Major problem with flow analysis
Class 4 errors: Yes
Class 5 errors: No
Class 6 errors: No
Mocha version beta 1
Class 1 errors: Several
Class 2 errors: No
Class 3 errors: No
Class 4 errors: No
Class 5 errors: Crashes on some class files
Class 6 errors: No
WingDis version 2.06
Class 1 errors: One
Class 2 errors: No
Class 3 errors: Overuse of if(x!=false) and similar constructions
Class 4 errors: No
Class 5 errors: No
Class 6 errors: Misuse or non-use of super; mistranslation of x=a++ to a++; x=a;
Caveat emptor: The test set was not specifically designed to validate or torture the decompilers, and it is impossible to know if the results here are representative of all classes, or if the list of problems encountered is complete.
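The class 6 entry for WingDis is easy to underestimate, so here is a small illustration of why rewriting x=a++ as a++; x=a; changes a program's meaning. The class and method names are made up for this sketch; only the two statement forms come from the results above.

```java
final class PostIncrementDemo {
    // Original form: x receives the value of a from before the increment.
    static int[] original(int a) {
        int x = a++;
        return new int[] { x, a };
    }

    // Mistranslated form: a is incremented first, so x receives the new value.
    static int[] mistranslated(int a) {
        a++;
        int x = a;
        return new int[] { x, a };
    }

    public static void main(String[] args) {
        int[] good = original(5);
        int[] bad = mistranslated(5);
        System.out.println("original:      x=" + good[0] + " a=" + good[1]);
        System.out.println("mistranslated: x=" + bad[0] + " a=" + bad[1]);
    }
}
```

Both versions compile cleanly and leave a with the same value, but x differs by one, which is exactly the kind of damage the compiler cannot flag.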
Let's get to the heart of the matter and see some of my testing in action. The remainder of this article provides the actual code examples of the tests, which will allow you to see how the individual decompilers fared on each class of error.
Class 1 errors: Errors flagged by the compiler that are easily fixed
All three decompilers sometimes failed to infer a Boolean type for integer operations, although it is interesting to note that they failed in different places.
Example 1: Missed inference to Boolean
PrintStream PrintStream()
{
return new PrintStream(outputstream, 1); // 1 should be true
}
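The PrintStream constructor called here declares its second parameter as a boolean, and that declaration is recorded in the class file as a method descriptor. The following is a minimal sketch, under that assumption, of how a decompiler might consult the descriptor before printing an integer constant. DescriptorTypes and its methods are hypothetical names for illustration, not part of any reviewed product.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class DescriptorTypes {
    // Splits the parameter part of a JVM method descriptor such as
    // "(Ljava/io/OutputStream;Z)V" into one entry per parameter.
    static List<String> parameterTypes(String descriptor) {
        List<String> types = new ArrayList<>();
        int i = descriptor.indexOf('(') + 1;
        int end = descriptor.indexOf(')');
        while (i < end) {
            int start = i;
            while (descriptor.charAt(i) == '[') {
                i++;                                  // skip array dimensions
            }
            if (descriptor.charAt(i) == 'L') {
                i = descriptor.indexOf(';', i);       // object type ends at ';'
            }
            i++;                                      // step past the type's last character
            types.add(descriptor.substring(start, i));
        }
        return types;
    }

    // Prints 0 or 1 as false or true when the receiving parameter is a boolean ("Z").
    static String renderIntConstant(int value, String paramType) {
        if (paramType.equals("Z") && (value == 0 || value == 1)) {
            return value == 1 ? "true" : "false";
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        List<String> params = parameterTypes("(Ljava/io/OutputStream;Z)V");
        System.out.println(renderIntConstant(1, params.get(1)));
    }
}
```

The same lookup does not help for local variables, which carry no declared type once debugging information is stripped; those cases need inference from how the value is used.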
At the level of bytecodes, Boolean does not exist as a distinct type; Boolean values are handled as integers, so the Boolean nature of a variable has to be deduced. In the case shown above, 1 should have been true, which could have been deduced by examining the definition of PrintStream.
Example 2: Beautiful, but it's not Java
Mocha transformed a static initializer into an elegant, but illegal, construction:
public ConsoleWindow(String string, int i1)
{
dead = false;
styles = { "Plain", "Bold", "Italic" };
sizes = { "8", "9", "10", "12", "14", "16", "18", "24" };
...
Bracketed initializer lists for arrays are valid only as initializers for variable declarations (either class or local), not for other assignments. The reason for this differentiation is obscure to me, but I'm sure Sun must have had a reason. In any case, it's apparent that these initializers are actually implemented by inline code inside constructors, generated by the compiler.
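For readers who have not run into this rule, the following sketch shows the legal and illegal forms side by side. The class name ConsoleStyles is hypothetical; the array contents are taken from the Mocha output above. The array creation expression used in the constructor is, to my knowledge, accepted by Java 1.1 and later compilers, so it may not be available under the JDK 1.02 compiler used for these tests.

```java
// Hypothetical class for illustration only.
final class ConsoleStyles {
    // Legal: a bracketed list used as the initializer of a declaration.
    static final String[] SIZES = { "8", "9", "10", "12", "14", "16", "18", "24" };

    String[] styles;

    ConsoleStyles() {
        // Illegal, and the form Mocha emitted; the compiler rejects it:
        // styles = { "Plain", "Bold", "Italic" };

        // Legal repair: an array creation expression with an initializer.
        styles = new String[] { "Plain", "Bold", "Italic" };
    }

    public static void main(String[] args) {
        System.out.println(new ConsoleStyles().styles.length + " styles, "
                + SIZES.length + " sizes");
    }
}
```

Repairing Mocha's output this way is a mechanical edit, which is why the problem counts as a class 1 error.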
When decompiling this same static initializer, WingDis produced equally beautiful and syntactically correct code. Unfortunately, the code was not semantically correct, which results in a class 6 error.
Using this same static initializer, DejaVu emitted perfectly legal (but ugly) code, as shown in this snippet:
public ConsoleWindow(String arg1, int arg2) {
...
String[] Har1;
Har1 = new String[3];
Har1[0] = "Plain";
Har1[1] = "Bold";
Har1[2] = "Italic";
this.styles = Har1;
...
Class 2 errors: Errors flagged by the compiler that are not easily fixed
The ability to reverse-engineer code and reproduce the same for, while, or if statements as the original code is the most surprising (approaching magical) capability of Java decompilers. Java is "decompiler friendly" in several ways:
At the level of bytecodes, the much-maligned goto statement is the workhorse within any function, so the task of inferring the original structure from raw gotos is daunting indeed. In Java, however, there are no explicit goto statements added by the programmer. If any gotos do exist in the code to be decompiled, they must be part of some higher-level construction.
The set of control structures in Java is small, and compilers compile them in fairly stylized ways.
The Java compiler technology is immature. Highly optimizing compilers (which will eventually appear) will be able to transform code much more significantly than do current compilers.
There is a close semantic match between Java source code and Java bytecode.
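To give a feel for the inference task, here is a hand-written illustration (not real compiler or decompiler output) of one loop in two shapes: the structured form a programmer writes, and a goto-like form in which a program counter variable stands in for branch instructions. All names are made up for this sketch, and actual bytecode layouts vary by compiler.

```java
// Hypothetical illustration only.
final class LoopShapes {
    // Structured form, as it would appear in the original source.
    static int sumStructured(int n) {
        int sum = 0;
        int i = 0;
        while (i < n) {
            sum += i;
            i++;
        }
        return sum;
    }

    // The same loop in a goto-like shape: each case is a jump target,
    // and assigning pc plays the role of a branch.
    static int sumWithJumps(int n) {
        int sum = 0;
        int i = 0;
        int pc = 0;
        while (true) {
            switch (pc) {
                case 0: // loop test: conditional branch to the body or the exit
                    pc = (i < n) ? 1 : 2;
                    break;
                case 1: // loop body, then an unconditional jump back to the test
                    sum += i;
                    i++;
                    pc = 0;
                    break;
                default: // exit
                    return sum;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sumStructured(5) + " " + sumWithJumps(5));
    }
}
```

A decompiler has to go from something like the second shape back to the first. Because every backward jump in compiled Java must come from one of a handful of loop constructs, recognizing the pattern is far more tractable than it would be for hand-written gotos.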
Earlier versions of WingDis sometimes produced code containing incorrect goto statements. These erroneous statements were nearly impossible to understand and a royal pain to recode correctly. I'm pleased to report that this class of error seems to be extinct. The reviewed version of WingDis seems to handle flow analysis flawlessly, as does Mocha. However, despite their success on my test cases, I'm su