Tweak your IO performance
for faster runtime - JavaWorld November
2000
Tutorial Details:
Tweak your IO performance for faster runtime
Tweak your IO performance for faster runtime
By: By Brian Goetz
Increase the speed of Java programs by tuning IO performance
ost of you are probably tired of hearing the standard claim that Java programs are slower than C programs. In reality, the situation is more complex than that trite assertion. Many Java programs are slow, but that is not necessarily an intrinsic characteristic of programs written in Java. Many Java programs can perform just as fast as comparable programs written in C or C++, but only when the designer and programmer pay careful attention to performance issues throughout the development process.
In this article, I will focus on tuning IO performance. Many applications spend much of their runtime on network or file IO, and poorly performing IO code can run many times slower than well-tuned IO code.
General performance-tuning concepts
Some general concepts appear over and over again when assessing the performance of Java programs. The examples in this article focus specifically on tuning an IO-bound application, but you can apply those principles to many other performance situations.
Perhaps the most important principle is this: Measure early, measure often . You can't effectively manage performance if you don't know the source of your problem. Programmers are notoriously bad at guessing where performance problems lie. Spending days tuning a subsystem that accounts for 1 percent of an application's total runtime simply cannot yield more than a 1 percent improvement in application performance. Instead of guessing, use performance measurement tools -- such as profiling or nonintrusive time-stamped logging -- to identify where your application spends its time and to focus your energy on those hot spots. After you tune, measure again. Not only will measuring help you concentrate your energy where it will count the most, but it will also provide evidence of the success -- or lack thereof -- of your efforts.
In tuning a program for performance, you may want to measure a number of quantities such as total runtime, average memory usage, peak memory usage, throughput, request latency, and object creations. Which factors you should focus on depend on your situation and performance requirements. Several good commercial profiling tools are available that can help you measure many of those quantities, but you don't need to buy an expensive package to collect useful performance data.
In the performance data gathered for this article, I focused entirely on runtime and, to gather my measurements, I used a class similar to the simple Timer class below. (It can easily be extended to support additional operations such as pause() and restart() .) Timer also allows you to gather timing information without cluttering your output with time-stamped logging statements, which can also distort runtime measurements since the logging statements may create objects and perform IO.
public class Timer {
// A simple "stopwatch" class with millisecond accuracy
private long startTime, endTime;
public void start() { startTime = System.currentTimeMillis(); }
public void stop() { endTime = System.currentTimeMillis(); }
public long getTime() { return endTime - startTime; }
}
One of the most common causes of Java performance problems is the excessive creation of temporary objects. While newer Java VMs more effectively reduce the performance impact of creating many small objects, the fact remains that object creation is still an expensive operation. Since strings are immutable, the String class is one of the biggest offenders; every time a String is modified, one or more new objects is created. That brings us to our second performance principle: Avoid excessive object instantiations .
IO performance tuning
Many applications process large volumes of data, and IO is one of those areas where small differences in implementation can have a large impact on performance. I derived this article's examples from tuning a text-processing application that consumes and analyzes large quantities of text. The amount of time spent reading and processing the input was significant, and the methods employed in tuning the application provide good examples of the performance principles stated above.
One of the principal culprits that affects Java IO performance is the use of character-by-character IO -- calling the InputStream.read() or the Reader.read() methods to read one character. Java inherited that idiom from C programming, in which it is quite common -- and efficient -- to read a file by repeatedly calling getc() . In C, character IO is fast because the getc() and putc() functions are implemented as macros and provide buffered file access, which means they only take a few machine instructions to execute. In Java, you encounter a quite different situation. Not only do you incur the cost of one or more method calls for each character but, more significantly, if you don't use any sort of buffering, you also suffer the cost of a system call to obtain the character. While a Java program that relies on read() may look and function just like its C counterpart, it will not perform in the same way. Fortunately, Java offers several easy approaches to achieving higher performance IO.
You can address buffering in one of two ways -- use the standard BufferedReader and BufferedInputStream classes or use the block-read methods to read larger blocks of data at a time. The former offers a quick and easy solution, and substantially improves performance with little additional coding and opportunity for error. The do-it-yourself technique offers a somewhat more complicated -- although still not difficult -- remedy, and it is even faster. It also features some additional advantages, which I will describe later.
To measure the effect of different IO buffering strategies, I wrote six small programs that read several hundred files and examine each character. Table 1 shows the runtimes of those six programs, using five commonly used Linux Java VMs -- the Sun 1.1.7, 1.2.2, and 1.3 Java VMs, and the 1.1.8 and 1.3 Java VMs from IBM.
The programs are:
RawBytes : Read data one byte at a time, using FileInputStream.read()
RawChars : Read data one char at a time, using FileReader.read()
BufferedIS : Wrap FileInputStream with BufferedInputStream and read data one byte at a time with read()
BufferedR : Wrap FileReader with BufferedReader and read data one char at a time with read()
SelfBufferedIS : Read data 1 K at a time, using FileInputStream.read(byte[]) , and access the data from the buffer
SelfBufferedR : Read data 1 K at a time, using FileReader.read(char[]) , and access the data from the buffer
Table 1. Runtimes of six programs using common Linux VMs
Runtime
Sun 1.1.7
IBM 1.1.8
Sun 1.2.2
Sun 1.3
IBM 1.3
RawBytes
20.6
18.0
26.1
20.70
62.70
RawChars
100.0
235.0
174.0
438.00
148.00
BufferedIS
9.2
1.8
8.6
2.28
2.65
BufferedR
16.7
2.4
10.0
2.84
3.10
SelfBufferedIS
2.1
0.4
2.0
0.61
0.53
SelfBufferedR
8.2
0.9
2.7
1.12
1.17
We can draw several obvious conclusions from the performance data in Table 1 (measured as total runtime to process several hundred text files after adjusting for Java VM and program startup):
InputStream runs faster than Reader . A char uses two bytes to represent a character, whereas byte requires only one, so using byte to store characters requires less memory and fewer machine instructions. More importantly, using byte eliminates the need to perform Unicode conversion. The choice of character representation will depend on your requirements -- if you need internationalization support, you will have to use char , but if you read from a source that is inherently ASCII (such as HTTP or MIME headers) or you know the input text will be English, you can use the faster byte .
Unbuffered character IO runs really slow. Character IO is bad, but unbuffered character IO is terrible. At the very least, you should wrap your streams with buffered streams, which can make your IO code run up to 10 times faster.
Block IO with self-buffering runs even faster than character IO with buffered streams. While buffered streams eliminate the system-call overhead of reading each character, which is significant, they still require one or more method calls in the middle of a tight loop. Self-buffered IO code runs 2 to 4 times faster than buffered streams and 4 to 40 times faster than unbuffered IO.
One item that is not obvious from the chart is that character IO can undermine the performance advantages of faster Java VMs. In most performance tests, the IBM 1.1.8 Linux Java VM runs approximately twice as fast as the Sun 1.1.7 Linux Java VM. But the RawBytes and RawChars tests show the two as equally slow. The extra time they spend executing system calls to read the data negates the speedup offered by the faster Java VM.
Using block IO results in another subtle benefit. When used, buffered character IO sometimes requires more coordination between components, creating more room for errors. Often, a component will complete the IO on behalf of an application; the application passes a Reader or InputStream to the component, which will then consume the contents of the stream. Some components may erroneously assume the stream will be a buffered stream without documenting that requirement. Or the application programmer may fail to notice when the component does document that condition. In such cases, the IO ends up unintentionally unbuffered, with serious performance consequences. When using block IO instead, that confusion cannot occur. (It is usually better to design components so that they cannot be misused, rather than rely on the documentation to ensure their proper use.)
Those simple tests show that the most obvious way to complete a simple task such as reading text can run 40-60 times slower than a more carefully chosen alternative. In those tests, the programs did
Read
Tutorial at: Click here to view the tutorial
Rate Tutorial: Tweak your IO performance
for faster runtime - JavaWorld November
2000
View Tutorial: Tweak your IO performance
for faster runtime - JavaWorld November
2000
Related
Tutorials:
Accelerate your RMI
programming
Accelerate your RMI
programming |
Reflection
vs. code generation
Reflection
vs. code generation |
XSLT blooms with
Java
XSLT blooms with
Java |
Write once,
persist anywhere
Write once,
persist anywhere |
Put Java in the fast lane
Put Java in the fast lane |
J2ME devices:
Real-world performance
J2ME devices:
Real-world performance |
Got
resources?
Got
resources? |
High-availability mobile applications
High-availability mobile applications |
Worth
reading
Worth
reading |
JView 2004 - J2EE Performance Management Software - Version 1.4
JView 2004 Enterprise Edition available now!
JView 2004 is the leading J2EE Production Performance Monitor and Java Performance Tuning solution, providing precise real-time performance metrics to assist you in ensuring your performance requirements a |
Asynchronous IO for Java
What is Asynchronous IO for Java?
Asynchronous IO for JavaTM (AIO4J) is a package that provides the capability to perform input and output (IO) on sockets and files asynchronously -- that is, where the Java application can request the operation but can |
g4j - GMail API for Java
GMailer API for Java (g4j) is set of API that allows Java programmer to communicate to GMail. With G4J programmers can made Java based application that based on huge storage of GMail. |
Using SSL with Non-Blocking IO
Using SSL with Non-Blocking IO
After the initial experiments with Java NIO, most developers start wondering about security; in particular, how to use SSL with Java NIO. With the traditional blocking sockets API, security is a simple issue: just set up an |
Access Windows Performance Monitor counters from Java, Part 1
Access Windows Performance Monitor counters from Java, Part 1
Use a simple Java API to gather valuable performance statistics
Summary
Windows NT, 2000, 2003, and XP contain a utility called the Performance Monitor that provides a rich array of perform |
Practically Groovy: Unit test your Java code faster with Groovy
Why unit test with Groovy?
What makes Groovy particularly appealing with respect to other scripting platforms is its seamless integration with the Java platform. Because it's based on the Java language (unlike other alternate languages for the JRE, which |
JTimepiece
JTimepiece is the advanced library for working with dates and times in Java. Many easy-to-use methods in this API make it easy for any developer, from beginner to expert, to use JTimepiece. |
Asking "Why" at Sun Laboratories: A Conversation with Director, Glenn Edens
Sun Laboratories Director, Glenn Edens, discusses new research developments in the Java language and the gratifications and trials of running a research lab. |
What is Web Hosting
What is Web Hosting
What is Web Hosting?
What is Web Hosting?
If you have a company and want web presence than you need a website. With the website any one from the world must be able to view your pages, images etc.
Website is actually a |
New Technical Articles: 64-bit Programming on Solaris 10 OS for x86 Platforms
Four technical articles describe the new Sun Studio 10 software's 64-bit programming features on the Solaris 10 OS for x86 and AMD64 platforms. Important issues regarding the AMD64 ABI (Application Binary Interface), debugging, migration to 64-bits, and p |
Sun Incorporates Latest AMD Opteron Processor Enhancements
New performance enhancements, features, and pricing promotions have been added to Sun's x64 systems based on on AMD Opteron processors. These performance updates optimize the innovative features offered in the Solaris 10 OS. |
|
|
|