WebSPHINX (Website-Specific Processors for HTML INformation eXtraction) is a Java class library and interactive development environment for web crawlers. A web crawler (also called a robot or spider) is a program that browses and processes Web pages automatically.
WebSPHINX consists of two parts: the Crawler Workbench and the WebSPHINX class library.
The Crawler Workbench is a graphical user interface that lets you configure and control a customizable web crawler.
The WebSPHINX class library provides support for writing web crawlers in Java.
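To make the idea of a crawler concrete, here is a minimal sketch of the basic crawl loop in plain Java. It deliberately does not use any WebSPHINX classes; the class name CrawlSketch, the crawl method, and the in-memory FAKE_WEB map (which stands in for real page fetching and link extraction) are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class CrawlSketch {
    // Hypothetical stand-in for the Web: each URL maps to the links found on that page.
    static final Map<String, List<String>> FAKE_WEB = Map.of(
        "http://example.test/", List.of("http://example.test/a", "http://example.test/b"),
        "http://example.test/a", List.of("http://example.test/b", "http://example.test/"),
        "http://example.test/b", List.of("http://example.test/c"),
        "http://example.test/c", List.of());

    // Breadth-first crawl: visit every page reachable from root exactly once.
    static List<String> crawl(String root, Map<String, List<String>> web) {
        Set<String> seen = new LinkedHashSet<>();    // URLs already queued or visited
        Queue<String> frontier = new ArrayDeque<>(); // URLs waiting to be visited
        List<String> visited = new ArrayList<>();    // visit order, returned to the caller
        seen.add(root);
        frontier.add(root);
        while (!frontier.isEmpty()) {
            String url = frontier.remove();
            // "Process" the page. A real crawler would download and parse it here.
            visited.add(url);
            for (String link : web.getOrDefault(url, List.of())) {
                // Set.add returns false for a URL we have already seen, so cycles are skipped.
                if (seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        for (String url : crawl("http://example.test/", FAKE_WEB)) {
            System.out.println(url);
        }
    }
}
```

The sketch needs Java 9 or later for Map.of and List.of. A crawler built on a class library would typically replace the map lookup with real page retrieval and let the programmer customize which links to follow and what to do with each page.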