Programming Tutorials Browser Tutorials Articles Struts Tutorials Hibernate Tutorials

  Tutorial: JavaWorld - Distributed applet-based massively parallel processing (DAMPP) - January 1997

JavaWorld - Distributed applet-based massively parallel processing (DAMPP) - January 1997

Tutorial Details:

Create your own supercomputer with Java
Create your own supercomputer with Java
By: By Laurence Vanhelsuwé
By combining Java with the Web's collection of computers, you too can have the processing power you crave -- without buying a Cray
eymour Cray's Cray and Danny Hillis's Connection Machine -- these two types of supercomputers are what dreams are made of, at least to programmers who, like myself, hunger for infinite processing power. These Testarossas of the number-crunching world can simulate weather, car crashes and impacting anti-tank shells, miniature ecosystems and the global economy, the effects of imaginary pollution disasters, the internals of stars, and the detonation of an H-bomb.
Supercomputers are the ultimate kit for more than day-dreaming power-programmers. Superpower governments and giant multinationals rely heavily on them. Their astronomical costs have meant that few programmers have had the pleasure of even writing a single machine instruction for these ivory-tower instruments of knowledge.
But what does all this have to do with Java?
"Knock, knock. Can I borrow your CPU for a minute?"
Consider for a second your own PC: You're most likely staring at a Pentium-class machine, with 16 or more megabytes of RAM. Now look at your humble telephone socket: For many of us, this is your PC's connection to the Internet, linking your machine to millions more of these same Pentium-class machines. From this perspective, the Internet metamorphoses itself as the largest supercomputer in the world. And all that is necessary to harvest its awesome potential is to convince as many as possible of these Internet-connected machines to do some work for you, while you sit back, still staring at your diminutive PC.
But how do you get millions of Internet nodes to execute some code?
Enter the ubiquitous Web browser (which by now is likely to be Java-compatible). Most applets embedded in a Web page these days range in usefulness from the entertaining but excrutiatingly slow to the annoying crash-on-arrival type. Why not distribute some useful applets for a change, like ones tackling a massive computational problem whose solution could benefit the world, instead of the military or some profit-obsessed oil giant?
Raytracing as an example DAMPP application
Since I am hardly a nuclear physicist or a meteorologist or a world-renowned economist, I have chosen to tackle a subject most computer users and programmers will be more comfortable with: raytraced image generation. Raytracing is a technique that to date has produced the most life-like, photorealistic synthetic images to come out of machines. Some of the best, mind-blowing movie special effects in these past few years were done using raytracing, although most are generated using much cheaper (that is, faster) techniques that end up producing less realistic images or animation.
Although raytracing is not what this article is about, I shall give you a very brief summary of the technique: Computer generated images all have at their heart a mathematical description of the scene to be generated. Life-like images are produced from three-dimensional mathematical descriptions, where the scene or model is composed out of simple building blocks like flat polygons, spheres, cubes, cones, and lines. To obtain realistic images, the algorithm must model color and light and all its physical side-effects (shadows, reflections, transparancy, brightness, and darkness) as close to the real phenomenon as possible. This is where the cheaper image-rendering algorithms use shortcuts that trade image quality for speed of generation. Raytracing does the opposite. Its goal is maximum realism at the expense of quick rendering.
Raytracing models reality very closely indeed. It attempts to model rays of light that, after emanating from some light source(s), interact with the model and eventually end up hitting your eye's retina. Concretely, the raytracing algorithm is extremely simple, in principle: For each image pixel to be rendered, it traces an imaginary beam of light from your eye (the viewpoint), going through the pixel and intersecting at some point the computer model of the scene to be drawn. At such an intersection the algorithm determines the pixel's color (and intensity) by tracing more imaginary light beams, this time emanating from the light source(s) in the scene's model. Hence ray-tracing, a very elegant but incredibly compute-intensive solution. (It is essentially a brute-force approach.)
Raytracing has one characteristic that's quite relevant to parallel computation: Each pixel can be computed totally independently from all others. In parallel jargon, there are no data dependencies. This means raytracing can be easily adapted to a parallel distributed environment, which is exactly what I have done with a raytracer written in Java, by Frederico Inacio de Moraes (who kindly gave me permission to use his code for this project).
The DAMPP Design: JobMasters and WorkerApplets
Let us now forget about raytracing and design the system this article is all about: a distributed applet-based (massively) parallel processing (DAMPP) system.
We need applets to calculate some parts of a parallel computation and we need to somehow assemble all the produced results in one central location. That central location will be the WWW server that carries the Web page containing our applets. These millions of applets, working like ants toward a common goal, are in fact just clones. So from now on I will simply talk about the WorkerApplet . (Keep in mind, though, that hundreds or thousands of these cloned applets can be instantiated at any one time, working in parallel on a given problem.)
On the WWW server runs another server program (along with the HTTP server), which I call the JobMaster . It is the command and control center of a DAMPP system. See the DAMPP high-level protocol diagram for the big picture.
DAMPP high-level protocol
Here's the scenario of the two components (WorkerApplet and JobMaster) interacting with each other:
A user surfs the Web and hits the page containing the worker applet. The worker applet loads and starts running immediately. The user is not even aware of this fact since the applet is totally invisible; it does not boast flashy animations or a GUI. It is simply a harmless visitor process in the machine.
The first thing the applet does is call home to tell the JobMaster it arrived safe and sound, and is ready to do some work (Step 1 in the DAMPP high-level protocol diagram). The JobMaster responds by issuing the applet with a job "specification" (Step 2). This is a parcel of information that allows the applet to start working on a sub-part of the big problem being attacked (in our raytracing example application, each applet calculates individual image lines).
Since Web users don't surf the Web for days on end, the applet knows it's living on borrowed time. At any moment the user can decide that surfing real waves is more exciting than surfing cyberwaves, and so kill his browser along with our poor applet. Worker applets therefore are designed to be dispensible, thus conforming nicely to our current throw-away consumer society . DAMPP is designed to cope with applets being killed prematurely; all it does is eventually re-issue the same job to another applet on another surfer's computer.
Whenever an applet does manage to complete its calculations, it contacts the JobMaster again to hand over its results (Step 3) - and then, like the male bee for which there is no more use, commits suicide.
If we switch our perspective to that of the JobMaster, we see two types of incoming communications: job requests and returned results. To simplify implementation and enhance performance, these two very different streams will use separate channels and both rely on UDP (User Datagram Protocol) for the underlying communication protocol. UDP is one of the TCP/IP protocols, sitting between IP and TCP as far as complexity and capabilities are concerned. I chose the UDP protocol over the more-frequently used TCP protocol for three reasons:
UDP's connectionless communication mode is perfectly suited to our design
UDP consumes less Internet bandwidth than TCP for the same amount of information transferred
UDP is faster
UDP has one disadvantage: It does not guarantee delivery of information. (That is why TCP, which does guarantee delivery, is used more often.) But since the number of applet hosts on the Internet is, in all practicality, limitless, any lost results or any applet-server rendezvous gone wrong can be safely ignored. This is analoguous to your body not blinking an eyelid at the death of one of its own cells: Statistically and practically, it is a meaningless event. The design of the DAMPP computation engine can therefore rely, in a similar vein as our bodies, on massive built-in redundancy to achieve its goals with a very high degree of probability.
The overall picture is now clear: A swarm of applets passively and invisibly are being executed by client machines all over the world, contacting their home server to request jobs, executing those jobs and returning results to the server.
DAMPP implementation details
We are now ready to talk Java code!
The whole DAMPP system consists of just five classes: three classes for the applet and three for the server (one class is shared by both). These classes are, for the applet:
WorkerApplet
RayTrace (which extends WorkerApplet, instead of java.applet.Applet)
for the server:
JobMaster
ResultsReceiver
and shared by both:
UDPUtils
At this proof-of-concept stage, the applet side already has been decomposed into a generic platform for parallel processing (WorkerApplet) and the application-specific code (RayTrace). Since this article is all about the distributed parallel processing, and not about raytracing, I shall not go into detail of how RayTrace works. So let's look at WorkerApplet's implementation inste


 

Read Tutorial at: Click here to view the tutorial

Rate Tutorial:
JavaWorld - Distributed applet-based massively parallel processing (DAMPP) - January 1997

View Tutorial:
JavaWorld - Distributed applet-based massively parallel processing (DAMPP) - January 1997

Related Tutorials:

Results of first-ever JVM server benchmark revealed - JavaWorld - December 1997
Results of first-ever JVM server benchmark revealed - JavaWorld - December 1997
 
JavaWorld - Get started with server-side Java - March 1997
JavaWorld - Get started with server-side Java - March 1997
 
GNU offers a new kind of Java IDE - JavaWorld - July 1998
GNU offers a new kind of Java IDE - JavaWorld - July 1998
 
Which Java VM scales best? - JavaWorld - August 1998
Which Java VM scales best? - JavaWorld - August 1998
 
Java gets serial support with the new javax.comm package - JavaWorld - May 1998
Java gets serial support with the new javax.comm package - JavaWorld - May 1998
 
Opening up new ports to Java with javax.comm - JavaWorld - September 1998
Opening up new ports to Java with javax.comm - JavaWorld - September 1998
 
Image processing with Java 2D - JavaWorld - September 1998
Image processing with Java 2D - JavaWorld - September 1998
 
Java Tip 42: Write Java apps that work with proxy-based firewalls - JavaWorld - December 1997
Java Tip 42: Write Java apps that work with proxy-based firewalls - JavaWorld - December 1997
 
Java gets serial support with the new javax.comm package
Java gets serial support with the new javax.comm package
 
XML JavaBeans, Part 1 - JavaWorld February 1999
XML JavaBeans, Part 1 - JavaWorld February 1999
 
Solve real problems with aglets, a type of mobile agent - JavaWorld May 1997
Solve real problems with aglets, a type of mobile agent - JavaWorld May 1997
 
Use the JVM Profiler Interface for accurate timing - JavaWorld
Use the JVM Profiler Interface for accurate timing - JavaWorld
 
Finalists announced for JavaWorld Editors' Choice Awards
Finalists announced for JavaWorld Editors' Choice Awards
 
How to build an interpreter in Java, Part 1: The BASICs (JavaWorld / May 1997 / by Chuck McManis)
How to build an interpreter in Java, Part 1: The BASICs (JavaWorld / May 1997 / by Chuck McManis)
 
Add concurrent processing with message-driven beans
Add concurrent processing with message-driven beans
 
OpenSign
OpenSign OpenSign is a java applet for signing text in a webbrowser using a PKCS-12 key-files or keys located in MS Crypto API CSP's in a Win32 environment. The applet is based on code kindly provided by IT-Practice and everyone is encouraged to subm
 
JGraph - The Java Graph Diagram Component
JGraph - The Java Graph Diagram Component JGraphAddons is a drop-in functional module that adds powerful and configurable layout algorithms to your existing JGraphs. They include hierarchical, circular and tree layouts capable of giving your JGraph app
 
Tools and Libraries for the Advancement of Sciences (was: J.A.D.E.)
Tools and Libraries for the Advancement of Sciences (was: J.A.D.E.) Our Vision To provide the most comprehensive Java TM library for the scientific community. To create synergy between all sciences (e.g. math, physics, sociology, biology, astron
 
Memory Contention in J2EE Applications for Multiprocessor Platforms
Memory Contention in J2EE Applications for Multiprocessor Platforms. With the need for highly scalable J2EE applications in the enterprise environment, parallel processing of threads is required on multi-processor platforms. The memory requirements in th
 
Sun StorEdge 3511 SATA Array With 400 GB Configurations
An additional 12 configurations are now available with the Sun StorEdge 3511 SATA Array. With these new 400 GB configurations, users have access to massive density with configurations that support up to 2.4 TB per rack unit. A full rack of these arrays ca
 
Site navigation
 

 

Send your comments, Suggestions or Queries regarding this site at roseindia_net@yahoo.com.

Copyright © 2006. All rights reserved.