
  Tutorial: The Gnutella file-sharing network and Java - JavaWorld October 2000

By Ken McCrary
Use the JTella API to easily develop applications that access Gnutella
I'll begin with some definitions. First, Internet file sharing is an activity performed by a community of connected users. The file-sharing system, at a minimum, allows users to share files and to search for files within the community. Some file-sharing services offer additional capabilities, such as chatting. And some, like the notorious Napster, only allow users to share certain types of files, namely MP3s.
The Gnutella network supports sharing and searching of any file type, but does not offer any extra functionality, like chatting. Gnutella is a peer-to-peer system, with client software that also acts as a server -- software typically referred to as a servant.
Using Gnutella vs. the Internet
I will make a wild assumption that most of you are familiar with the Internet; specifically, with using an HTTP server to serve files to clients.
To publish files on a Website, you typically use FTP to transfer the documents to the Web server. Then, to make a document accessible to users, you might submit its URL to a search engine for crawling. Crawling means downloading the document, examining it for keywords and such, and creating a searchable index in a database. Now, when a user submits an appropriate query to the search engine, he or she will receive information about the published document and its location.
When publishing files to a file-sharing service, you typically interact only with the servant program, which accesses the service. The servant is connected to Gnutella and continuously responds to search queries from the network, eliminating the need for an intermediary search engine. The documents remain on the user's computer and are not uploaded to a central server. You share a document on Gnutella by copying the file to a shared directory and having the servant scan and index it. Since the file-sharing system publishes and indexes documents, the user has much less work to do. Also, the user has the option to share files for a limited amount of time; simply removing the Gnutella servant from the network ends the session.
Origin of Gnutella
It has been widely reported that Gnutella was created by a group of developers at Nullsoft, a subsidiary of America Online. Not surprisingly, AOL put an end to the project. Later, the Nullsoft client's protocol was reverse-engineered and a group of developers on the Internet collaborated to further develop the system. Eventually, those developers produced a number of clients, written in various programming languages and targeted at different operating systems.
Network structure
This is where I would discuss the network's structure and describe its topology -- but there is none! Each servant on the network is connected to at least one other servant on the network; servants can also have both inbound and outbound connections to each other, forming a cyclic connection. However, there is no fixed layout or pattern to the nodes on the network.
So how do messages navigate this unordered environment? Each Gnutella message contains a unique ID, which is used to route messages through the network. As a servant forwards messages to its connected servants, it caches each message's ID in memory, along with the connection the message arrived on. When a response arrives, the servant uses that ID to route the response back toward the original sender. Originally, message IDs were Windows Globally Unique Identifiers (GUIDs), which was an issue for other platforms. But since the GUID is just a 16-byte value, non-Windows code can generate its own unique value.
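The ID-based routing described above can be sketched in a few lines of Java. This is only an illustration: the class name `RouteTable`, its methods, and the use of a connection name string are hypothetical and are not taken from any existing servant. Random bytes stand in for a Windows GUID.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of ID-based routing; not taken from a real servant. */
final class RouteTable {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Maps a message ID (as a hex string) to the connection the message arrived on.
    private final Map<String, String> routes = new HashMap<>();

    /** Generates a 16-byte random ID, standing in for a Windows GUID. */
    static byte[] newMessageId() {
        byte[] id = new byte[16];
        RANDOM.nextBytes(id);
        return id;
    }

    /** Records where a message came from. Returns false if the ID was already cached. */
    boolean remember(byte[] messageId, String connectionName) {
        return routes.putIfAbsent(hex(messageId), connectionName) == null;
    }

    /** Returns the connection a response should be sent back on, or null if unknown. */
    String routeFor(byte[] messageId) {
        return routes.get(hex(messageId));
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

A servant built this way could also treat a `false` result from `remember` as a sign that the same message has looped back through a cyclic connection, although that use is an assumption rather than something the protocol description here requires.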
The Gnutella protocol
Like many Internet technologies, Gnutella benefits from a protocol specification that is available to the public. The protocol provides a compatibility point that allows Gnutella servants to communicate across operating systems and programming languages. As long as a Gnutella servant implements the protocol, it can participate in the file-sharing network.
First, you connect to a servant on the network by accessing a well-known servant like gnutellahosts.com:6346, which is almost always connected to the network. Several Websites, like gnutellahosts.com, indicate a host/port where you can locate a Gnutella servant. Some of those servants are host caches, specialized software that returns a collection of hosts currently on the network. You can use this information to maintain a desired number of connections.
When a connection is made, the connecting servant sends the text string "GNUTELLA CONNECT/0.4\n\n". If the connection is accepted, the replying servant sends the text string "GNUTELLA OK\n\n". Notice the two newline characters; I wonder why it's not the usual "\r\n". Now you have a working connection; the rest of the protocol exchanges consist mostly of binary data.
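As a rough sketch of that handshake in Java, the method below writes the connect string and checks the reply byte for byte. The class and method names are hypothetical. Working on plain streams is a design choice for illustration, so the same code can run against a socket or an in-memory buffer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper illustrating the text handshake; not part of JTella. */
final class Handshake {
    static final String CONNECT = "GNUTELLA CONNECT/0.4\n\n";
    static final String OK = "GNUTELLA OK\n\n";

    /** Sends the connect string and returns true if the peer answers with the OK string. */
    static boolean connect(InputStream in, OutputStream out) throws IOException {
        out.write(CONNECT.getBytes(StandardCharsets.US_ASCII));
        out.flush();

        byte[] expected = OK.getBytes(StandardCharsets.US_ASCII);
        byte[] reply = new byte[expected.length];
        int read = 0;
        while (read < reply.length) {
            int n = in.read(reply, read, reply.length - read);
            if (n < 0) {
                return false; // peer closed the stream before a full reply arrived
            }
            read += n;
        }
        return Arrays.equals(reply, expected);
    }
}
```

With a real connection you would pass `socket.getInputStream()` and `socket.getOutputStream()` from a `java.net.Socket` opened to the servant's host and port.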
Most servants attempt to maintain connections to multiple servants. To facilitate this, the protocol provides a mechanism to discover connected servants: the PING message. A servant that sends the PING message will receive response messages known as PONG messages. (No, you have not fallen into a twilight zone of 1970s video games.)
The PONG messages contain a payload that identifies the host and port number of an active Gnutella servant; this allows a servant to maintain a cache of available servants. PING messages represent a significant amount of network traffic. One possible improvement would be for a servant, before disconnecting from the network, to send a disconnect message to servants that had previously received a PONG message.
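To make the PONG payload more concrete, here is a small Java sketch that extracts the host and port. The byte layout is an assumption based on my reading of the version 0.4 protocol document: a 2-byte port in little-endian order followed by a 4-byte IPv4 address. Check it against the specification before relying on it. The class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

/** Hypothetical PONG payload reader; the field layout is an assumption (see text). */
final class PongPayload {
    /** Extracts the advertised servant address from the start of a PONG payload. */
    static InetSocketAddress hostOf(byte[] payload) throws UnknownHostException {
        if (payload.length < 6) {
            throw new IllegalArgumentException("payload too short for port and address");
        }
        // Assumed layout: bytes 0-1 hold the port, least significant byte first.
        int port = (payload[0] & 0xFF) | ((payload[1] & 0xFF) << 8);
        // Assumed layout: bytes 2-5 hold the IPv4 address in the usual dotted order.
        byte[] ip = { payload[2], payload[3], payload[4], payload[5] };
        // getByAddress does no DNS lookup; it only validates the array length.
        return new InetSocketAddress(InetAddress.getByAddress(ip), port);
    }
}
```

A servant could add each returned address to its cache of available servants and draw from that cache when a connection drops.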
Once the servant has established connections to the network, you will probably begin searching for files. The search message contains the search criteria and the preferred download speed, which is meant to prevent responses from servants on a slow connection. In practice, the user often sets the download speed incorrectly, effectively preventing the filtering of servants on slow connections.
When a servant responds to a search message, it includes all of the information needed to retrieve the file, including the IP address and the port on which the server is listening for connections.
The file is transferred with HTTP, so all you need to get a file is a GET request. The servants can even resume partial downloads if the GET request ends before the file transfer is complete.
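The sketch below builds such a GET request in Java and adds a standard HTTP `Range` header when part of the file has already been received. The host, the path, and the class name are hypothetical. The exact URL layout a servant expects is not covered here, and whether a particular servant honors `Range` is an assumption.

```java
/** Hypothetical request builder showing how a resumed download could be expressed. */
final class ResumableGet {
    /**
     * Builds the request text. When bytesAlreadyReceived is greater than zero,
     * a Range header asks the server to continue from that offset.
     */
    static String request(String host, String path, long bytesAlreadyReceived) {
        if (bytesAlreadyReceived < 0) {
            throw new IllegalArgumentException("offset must not be negative");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        if (bytesAlreadyReceived > 0) {
            // Standard HTTP byte-range syntax: everything from the offset to the end.
            sb.append("Range: bytes=").append(bytesAlreadyReceived).append("-\r\n");
        }
        sb.append("Connection: close\r\n\r\n");
        return sb.toString();
    }
}
```

A client would write this text to a socket connected to the address and port returned in the search response, then append the response body to the partially downloaded file.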
Finally, the protocol supports a specialized message for dealing with firewall issues. When a file found through a search request sits on a servant behind a firewall, the requesting servant can send a special PUSH message. The main idea is to communicate enough information so that the servant behind the firewall can establish the connection to the servant requesting the file, thus sidestepping the firewall.
Future enhancements to protocol
Future enhancements to the Gnutella protocol will be made in the areas of scalability and spam defense. As the network has grown, network traffic has increased, due to message broadcasts from individual servants. Some work has been done to create a set of guidelines for servants, in order to limit excessive message traffic without requiring a protocol modification.
Each Gnutella message contains two pieces of information that can help to address these issues. The first is Time to Live (TTL), a value that is set by the servant creating the message and is decremented each time the message is forwarded. When the TTL reaches a value less than one, the message should be dropped (not forwarded when received). The second value is the Hop count, which starts at zero and is incremented each time the message is forwarded.
The guidelines for servants center on dropping messages that have large TTL and Hop values. A message with a large TTL may be spam; a message with a large Hop value has already flowed over a large part of the network.
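A forwarding check along those lines might look like the following Java sketch. The limits of 7 for both TTL and Hop count are assumed values chosen for illustration, not figures from the guidelines, and the class name is hypothetical.

```java
/** Hypothetical forwarding policy; the limits are assumed, not taken from a spec. */
final class ForwardPolicy {
    static final int MAX_TTL = 7;   // assumed upper bound for a plausible TTL
    static final int MAX_HOPS = 7;  // assumed upper bound for distance traveled

    /** Decides whether a received message should be passed on to other servants. */
    static boolean shouldForward(int ttl, int hops) {
        if (ttl > MAX_TTL) {
            return false; // suspiciously large TTL, possibly spam
        }
        if (hops > MAX_HOPS) {
            return false; // already crossed a large part of the network
        }
        // Forwarding decrements the TTL; drop if that leaves a value below one.
        return ttl - 1 >= 1;
    }

    /** TTL to write into the forwarded copy. */
    static int nextTtl(int ttl) {
        return ttl - 1;
    }

    /** Hop count to write into the forwarded copy. */
    static int nextHops(int hops) {
        return hops + 1;
    }
}
```

For example, a message received with a TTL of 5 and a Hop count of 2 would be forwarded with a TTL of 4 and a Hop count of 3, while a message received with a TTL of 1 would be dropped.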
Defending against spam is much more difficult. Spam occurs on the network when a servant responds to all or most queries with a text message, instead of an actual file that the servant is serving. The text message usually advertises a Website, and since the spamming servant is responding to a search query, the message is also displayed on other clients that monitor searches.
XML protocol alternative
One could envision an alternative Gnutella-like protocol, based on XML, that could have a number of advantages over today's binary data protocol. For one, you could easily read messages, making it easier to develop software using the protocol. You could also validate messages with a validating XML parser, which would allow you to discard malformed messages. An XML protocol would also make byte-swapping unnecessary. The current protocol uses little-endian byte ordering for some of the numeric values in its messages; since Java uses network (big-endian) order, you must swap bytes for those values. One downside of an XML-based protocol is that messages would be larger than those on today's system.
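To illustrate the byte swapping mentioned above, this small Java sketch reads and writes a 4-byte little-endian integer with `java.nio.ByteBuffer`. The class and method names are hypothetical; only the `ByteBuffer` and `ByteOrder` calls come from the standard library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for little-endian fields in a binary message. */
final class LittleEndian {
    /** Reads a 4-byte little-endian integer starting at the given offset. */
    static int readInt(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    /** Encodes an integer as 4 bytes, least significant byte first. */
    static byte[] writeInt(int value) {
        return ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
    }
}
```

Without the `order(ByteOrder.LITTLE_ENDIAN)` call, `ByteBuffer` defaults to big-endian order, so the bytes `17 00 00 00` (hex) would be read as 0x17000000 instead of 23.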
JTella API
I haven't yet discussed anything Java-related, so you may be wondering what this article is doing in JavaWorld. Well, this article introduces JTella, an API designed to enable fast and easy development of Java applications and tools that access the Gnutella network. JTella is still in an early stage of development (version 0.1), but it can already do a few things. First, it can form and maintain connections to the network. Second, it offers a search-monitoring function that allows you to monitor searches received by a JTella servant. Third, it can send search queries over the network and process the results.
I'll now show you some code examples for using JTella. Two example applications are shown: one with code to monitor the search requests received, the other to send new search requests over the network. (See Resources for the source code.) Both examples accept two command-line parameters: the first provides the name of a host, the second provides the port used by the remot


 
