Master Merlin's new I/O classes

By Michael T. Nygard
Squeeze maximum performance out of nonblocking I/O and memory-mapped buffers
With the recent public beta release of J2SE (Java 2 Platform, Standard Edition) 1.4 (code-named Merlin), Sun has once again unleashed scores of new classes, features, and interfaces on unsuspecting Java developers. Because J2SE 1.3 focused only on performance improvements, J2SE 1.4 incorporates two years' worth of feature enhancements. Also, as the first J2SE release defined by the Java Community Process (JCP), Merlin reflects a wider array of interests than previous JDK releases.
A few obvious additions in Merlin have received most of the press so far, including the XML parser, secure sockets extension, and 2D graphics enhancements. This article introduces an exciting new API many have overlooked. The new I/O (input/output) packages finally address Java's long-standing shortcomings in high-performance, scalable I/O. The new I/O packages -- java.nio.* -- allow Java applications to handle thousands of open connections while still delivering excellent performance. These packages introduce four key abstractions that work together to solve the problems of traditional Java I/O:
A Buffer contains data in a linear sequence for reading or writing. A special buffer provides for memory-mapped file I/O.
A Charset maps Unicode character strings to and from byte sequences. (Yes, this is Java's third shot at character conversion.)
Channels -- which can be sockets, files, or pipes -- represent a bidirectional communication pipe.
Selectors multiplex asynchronous I/O operations into one or more threads.
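As a rough illustration of how three of these abstractions cooperate, the sketch below encodes a string with a Charset, pushes the resulting ByteBuffer through a pair of Channels, and decodes what arrives. The class name AbstractionsSketch and the use of java.nio.channels.Pipe are assumptions made for the example, and it uses the java.nio API as finally released rather than the beta. It also assumes the text is short enough to fit in the operating system's pipe buffer and in the 1,024-byte receive buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.Charset;

final class AbstractionsSketch {
    // Encodes text, sends the bytes through an in-process pipe, and decodes them again.
    static String roundTrip(String text) {
        Charset utf8 = Charset.forName("UTF-8");
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink()) {
                ByteBuffer outgoing = utf8.encode(text);   // Charset: chars -> bytes in a Buffer
                while (outgoing.hasRemaining()) {
                    sink.write(outgoing);                  // Channel drains the Buffer
                }
            }                                              // closing the sink signals end-of-stream
            ByteBuffer incoming = ByteBuffer.allocate(1024);
            try (Pipe.SourceChannel source = pipe.source()) {
                while (incoming.hasRemaining() && source.read(incoming) != -1) {
                    // Channel fills the Buffer until end-of-stream
                }
            }
            incoming.flip();                               // switch the Buffer from filling to draining
            return utf8.decode(incoming).toString();       // Charset: bytes -> chars
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, Merlin"));
    }
}
```

Selectors are left out here because they only matter once several channels are open at the same time.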
A quick review
Before diving into the new API's gory details, let's review Java I/O's old style. Imagine a basic network daemon. It needs to listen to a ServerSocket, accept incoming connections, and service each connection. Assume for this example that servicing a connection involves reading a request and sending a response. That resembles the way a Web server works. Figure 1 depicts the server's lifecycle. At each heavy black line, the I/O operation blocks -- that is, the operation call won't return until the operation completes.
Figure 1. Blocking points in a typical Java server
Let's take a closer look at each step.
Creating a ServerSocket is easy:
ServerSocket server = new ServerSocket(8001);
Accepting new connections is just as easy, but with a hidden catch:
Socket newConnection = server.accept();
The call to server.accept() blocks until the ServerSocket accepts an incoming network connection. That leaves the calling thread sitting for an indeterminate length of time. If this application has only one thread, it does a great impression of a system hang.
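One way to observe this blocking without hanging a program is to put a timeout on the server socket. The sketch below is only an illustration: the class name AcceptBlocks is hypothetical, and port 0 (any free port) replaces the article's 8001 so the example does not collide with a running service. With no client connecting, accept() waits for the full timeout and then gives up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;

final class AcceptBlocks {
    // Returns true if accept() waited out the timeout because no client connected.
    static boolean acceptTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0)) {  // port 0: let the OS pick a free port
            server.setSoTimeout(timeoutMillis);            // without this, accept() waits indefinitely
            try {
                server.accept().close();
                return false;                              // a client happened to connect
            } catch (SocketTimeoutException expected) {
                return true;                               // the calling thread sat blocked until now
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        boolean timedOut = acceptTimesOut(500);
        System.out.println("timed out: " + timedOut
                + " after " + (System.currentTimeMillis() - start) + " ms");
    }
}
```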
Once the incoming connection has been accepted, the server can read a request from that socket, as shown in the code below. Don't worry about the Request object. It is a fiction invented to keep this example simple.
InputStream in = newConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(in);
LineNumberReader lnr = new LineNumberReader(reader);
Request request = new Request();
while (!request.isComplete()) {
    String line = lnr.readLine();
    request.addLine(line);
}
This harmless-looking chunk of code has several problems. Let's start with blocking. The call to lnr.readLine() eventually filters down to call SocketInputStream.read(). There, if data waits in the network buffer, the call immediately returns some data to the caller. If there isn't enough data buffered, then the call to read blocks until enough data is received or the other computer closes the socket. Because LineNumberReader asks for data in chunks (it extends BufferedReader), it might just sit around waiting to fill a buffer, even though the request is actually complete. The tail end of the request can sit in a buffer that LineNumberReader has not returned.
This code fragment also creates too much garbage, another big problem. LineNumberReader creates a buffer to hold the data it reads from the socket, but it also creates Strings to hold the same data. In fact, internally, it creates a StringBuffer. LineNumberReader reuses its own buffer, which helps a little. Nevertheless, all the Strings quickly become garbage.
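To experiment with the read loop discussed above, a self-contained variant might look like the following sketch. Everything about the Request class here is hypothetical, since the article's Request is a fiction: this stand-in treats the first empty line (or end of stream) as the end of the request. An in-memory stream replaces the socket so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BlockingReadSketch {
    // Hypothetical stand-in for the article's fictional Request class.
    static final class Request {
        private final List<String> lines = new ArrayList<String>();
        private boolean complete;

        void addLine(String line) {
            if (line == null || line.isEmpty()) {
                complete = true;       // blank line or end of stream ends the request
            } else {
                lines.add(line);       // every line is a freshly allocated String
            }
        }

        boolean isComplete() { return complete; }
        int lineCount() { return lines.size(); }
    }

    static Request readRequest(InputStream in) {
        try {
            LineNumberReader lnr =
                    new LineNumberReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
            Request request = new Request();
            while (!request.isComplete()) {
                request.addLine(lnr.readLine());   // blocks on a real socket until a line arrives
            }
            return request;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper: feeds raw request text through the blocking read loop.
    static int countRequestLines(String raw) {
        byte[] bytes = raw.getBytes(StandardCharsets.US_ASCII);
        return readRequest(new ByteArrayInputStream(bytes)).lineCount();
    }

    public static void main(String[] args) {
        System.out.println(countRequestLines("GET / HTTP/1.0\r\nHost: example\r\n\r\n"));
    }
}
```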
Now it's time to send the response. It might look something like this (imagine that the Response object creates its stream by locating and opening a file):
Response response = request.generateResponse();
OutputStream out = newConnection.getOutputStream();
InputStream in = response.getInputStream();
int ch;
while (-1 != (ch = in.read())) {
    out.write(ch);
}
newConnection.close();
This code suffers from only two problems. Again, the read and write calls block. Writing one character at a time to a socket slows the process, so the stream should be buffered. Of course, if the stream were buffered, then the buffers would create more garbage.
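A common way to avoid the one-byte-at-a-time cost is to copy through a reusable byte array. The sketch below is one possible shape, not the article's own code: the class name BufferedCopy and the 8,192-byte chunk size are assumptions, and in-memory streams stand in for the file and the socket. The calls still block, and the array is still an allocation, which is the trade-off the paragraph above describes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BufferedCopy {
    // Copies everything from in to out in chunks and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) {
        byte[] chunk = new byte[8192];           // one array reused for every read
        long total = 0;
        try {
            int n;
            while ((n = in.read(chunk)) != -1) { // still a blocking call
                out.write(chunk, 0, n);          // write only the bytes actually read
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Convenience wrapper that copies a byte array through the stream loop.
    static byte[] copyAll(byte[] source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(source), out);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(copyAll(new byte[20000]).length);
    }
}
```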
You can see that even this simple example features two problems that won't go away: blocking and garbage.
The old way to break through blocks
The usual approach to dealing with blocking I/O in Java involves threads -- lots and lots of threads. You can simply create a pool of threads waiting to process requests, as shown in Figure 2.
Figure 2. Worker threads to handle requests
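A worker-thread arrangement like the one in Figure 2 could be sketched as follows. This is an illustration under stated assumptions rather than the article's sample code: it uses java.util.concurrent, which arrived after Merlin, the class name PooledServerSketch and the pool size of four are arbitrary, and the "request" is a single byte that the worker echoes back. The same method plays the client so that the example runs on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PooledServerSketch {
    // Worker task: blocks reading one byte, then echoes it back to the client.
    static void handle(Socket connection) {
        try (Socket c = connection) {
            int b = c.getInputStream().read();   // the worker thread is parked here until data arrives
            c.getOutputStream().write(b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Starts a one-shot server on a free port, connects as a client, and returns the echoed byte.
    static int echoOnce(int value) {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try (ServerSocket server = new ServerSocket(0)) {
            // The acceptor blocks in accept() and hands each connection to a pooled worker.
            workers.execute(() -> {
                try {
                    Socket connection = server.accept();
                    workers.execute(() -> handle(connection));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.getOutputStream().write(value);
                return client.getInputStream().read();   // waits for the worker's reply
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce(42));
    }
}
```

Note that even in this tiny example two threads spend nearly all their time blocked, one in accept() and one in read().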
Threads allow a server to handle multiple connections, but they still cause trouble. First, threads are not cheap. Each has its own stack and receives some CPU allocation. As a practical matter, a JVM might create dozens or even a few hundred threads, but it should never create thousands of them.
In a deeper sense, you don't need all those threads. They do not efficiently use the CPU. In a request-response server, each thread spends most of its time blocked on some I/O operation. These lazy threads offer an expensive approach to keeping track of each request's state in a state machine. The best solution would multiplex connections and threads so a thread could order some I/O work and go on to something productive, instead of just waiting for the I/O work to complete.
New I/O, new abstractions
Now that we've reviewed the classic approach to Java I/O, let's look at how the new I/O abstractions work together to solve the problems we've seen with the traditional approach.
Along with each of the following sections, I refer to sample code (available in Resources) for an HTTP server that uses all these abstractions. Each section builds on the previous sections, so the final structure might not be obvious from just the buffer discussion.
Buffered to be easier on your stomach
Truly high-performance server applications must obsess about garbage collection. The unattainable ideal server application would handle a request and response without creating any garbage. The more garbage the server creates, the more often it must collect garbage. The more often it collects garbage, the lower its throughput.
Of course, it's impossible to avoid creating garbage altogether; you need to just manage it the best way you know how. That's where buffers come in. Traditional Java I/O wastes objects all over the place (mostly Strings). The new I/O avoids this waste by using Buffers to read and write data. A Buffer is a linear, sequential dataset and holds only one data type according to its class:
Class                        Description
---------------------------  -----------------------------------------------------------
java.nio.Buffer              Abstract base class.
java.nio.ByteBuffer          Holds bytes. Can be direct or nondirect. Can be read from a ReadableByteChannel. Can be written to a WritableByteChannel.
java.nio.MappedByteBuffer    Holds bytes. Always direct. Contents are a memory-mapped region of a file.
java.nio.CharBuffer          Holds chars. Cannot be written to a Channel.
java.nio.DoubleBuffer        Holds doubles. Cannot be written to a Channel.
java.nio.FloatBuffer         Holds floats. Can be direct or nondirect.
java.nio.IntBuffer           Holds ints. Can be direct or nondirect.
java.nio.LongBuffer          Holds longs. Can be direct or nondirect.
java.nio.ShortBuffer         Holds shorts. Can be direct or nondirect.
Table 1. Buffer classes
You allocate a buffer by calling either allocate(int capacity) or allocateDirect(int capacity) on a concrete subclass. As a special case, you can create a MappedByteBuffer by calling FileChannel.map(int mode, long position, int size).
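The special case of mapping a file can be sketched as below. Note the assumption: this uses the signature from the released java.nio API, FileChannel.map(FileChannel.MapMode mode, long position, long size), which differs from the beta signature quoted above. The class name MapFileSketch and the temporary file are invented for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

final class MapFileSketch {
    // Writes text to a temporary file, then reads it back through a memory-mapped buffer.
    static String readViaMapping(String text) {
        try {
            File file = File.createTempFile("nio-map", ".txt");
            file.deleteOnExit();
            try (FileOutputStream out = new FileOutputStream(file)) {
                out.write(text.getBytes(StandardCharsets.US_ASCII));
            }
            try (FileInputStream in = new FileInputStream(file);
                 FileChannel channel = in.getChannel()) {
                // Map the whole file read-only; the buffer's capacity equals the mapped size.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                byte[] copy = new byte[mapped.remaining()];
                mapped.get(copy);                 // bulk read straight from the mapped region
                return new String(copy, StandardCharsets.US_ASCII);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readViaMapping("mapped hello"));
    }
}
```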
A direct buffer allocates a contiguous memory block and uses native access methods to read and write its data. When you can arrange it, a direct buffer is the way to go. Nondirect buffers access their data through Java array accessors. Sometimes you must use a nondirect buffer: for example, any of the wrap methods (like ByteBuffer.wrap(byte[])) constructs a Buffer on top of a Java array, and such a buffer is always nondirect.
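The difference can be checked at run time with isDirect() and hasArray(). The sketch below is a small illustration with a hypothetical class name, DirectVsWrapped. It relies on the released API, in which allocateDirect is defined on ByteBuffer.

```java
import java.nio.ByteBuffer;

final class DirectVsWrapped {
    // Returns { direct.isDirect(), wrapped.isDirect(), wrapped.hasArray(), write reached the array }.
    static boolean[] describe() {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);   // memory outside the Java heap
        byte[] backing = new byte[16];
        ByteBuffer wrapped = ByteBuffer.wrap(backing);       // nondirect view over the array
        wrapped.put(0, (byte) 7);                            // absolute put writes through to backing[0]
        return new boolean[] {
            direct.isDirect(),
            wrapped.isDirect(),
            wrapped.hasArray(),
            backing[0] == 7
        };
    }

    public static void main(String[] args) {
        boolean[] flags = describe();
        System.out.println("direct.isDirect()  = " + flags[0]);
        System.out.println("wrapped.isDirect() = " + flags[1]);
        System.out.println("wrapped.hasArray() = " + flags[2]);
        System.out.println("array sees the put = " + flags[3]);
    }
}
```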
When you allocate the Buffer, you fix its capacity; you can't resize these containers. Capacity refers to the number of primitive elements the Buffer can contain. Although you can put multibyte data types (short, int, float, long, and so on) into a ByteBuffer, its capacity is still measured in bytes. The ByteBuffer converts larger data types into byte sequences when you put them into the buffer. (See the next section for a discussion about byte ordering.) Figure 3 shows a brand new ByteBuffer created by the code below. The buffer has a capacity of eight bytes.
ByteBuffer example = ByteBuffer.allocateDirect(8);
Figure 3. A fresh ByteBuffer
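To make the point about capacity concrete, the following sketch compares an eight-byte ByteBuffer with an IntBuffer view of the same memory. The class name CapacitySketch and the use of asIntBuffer() are choices made for this illustration; the article itself does not use a view here.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

final class CapacitySketch {
    // Returns { byte capacity, int-view capacity, int read back through the view }.
    static int[] capacities() {
        ByteBuffer bytes = ByteBuffer.allocateDirect(8);
        IntBuffer ints = bytes.asIntBuffer();   // view over the same eight bytes
        bytes.putInt(0, 0x01020304);            // absolute put: four bytes starting at index 0
        return new int[] {
            bytes.capacity(),                   // measured in bytes
            ints.capacity(),                    // measured in ints
            ints.get(0)                         // the view sees the value written as bytes
        };
    }

    public static void main(String[] args) {
        int[] c = capacities();
        System.out.println("byte capacity: " + c[0]);
        System.out.println("int capacity:  " + c[1]);
        System.out.println("first int:     0x" + Integer.toHexString(c[2]));
    }
}
```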
The Buffer's position is the index of the next element that will be written or read. As you can see in Figure 3, position starts at zero for a newly allocated Buffer. As you put data into the Buffer, position climbs toward the limit. Figure 4 shows the same buffer after the calls in the next code fragment add some data.
example.put( (byte)0xca );
example.putShort( (short)0xfeba );
example.put( (byte)0xbe );
Figure 4. ByteBuffer after a few puts
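The same three puts can be run in a small program to watch position advance. This is a sketch with a hypothetical class name, PositionSketch. It relies on the default byte order of a new ByteBuffer, which is big-endian, so the short lands as 0xfe followed by 0xba.

```java
import java.nio.ByteBuffer;

final class PositionSketch {
    // Builds the buffer from the article's fragment: one byte, one short, one byte.
    static ByteBuffer filled() {
        ByteBuffer example = ByteBuffer.allocateDirect(8);
        example.put((byte) 0xca);            // position 0 -> 1
        example.putShort((short) 0xfeba);    // position 1 -> 3 (two bytes, big-endian by default)
        example.put((byte) 0xbe);            // position 3 -> 4
        return example;
    }

    public static void main(String[] args) {
        ByteBuffer example = filled();
        System.out.println("position = " + example.position());
        System.out.println("capacity = " + example.capacity());
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < example.position(); i++) {
            // Absolute get(i) reads without moving position.
            hex.append(Integer.toHexString(example.get(i) & 0xff)).append(' ');
        }
        System.out.println("bytes    = " + hex.toString().trim());
    }
}
```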
Another of the buffer's important attributes is its limit. The limit is the first element that should not be read or written. Attempting to put() past the limi


 
