LingPipe is a suite of Java tools designed to perform linguistic analysis on natural language data. While fast and robust enough to be used in a customer-facing commercial system, LingPipe's flexibility and included source make it appropriate for resear
Tutorial Details:
Java API
All LingPipe functionality is available through a thoroughly documented Java API. Characters are represented internally as Unicode for portability.
Command Line
Configurable commands are provided for training, running, and evaluating named-entity models, as well as for running sentence-boundary detection and coreference resolution. Commands operate on plain text, HTML, or XML input, and produce well-formed XML output.
SAX Filters
SAX filters are provided for entity detection, sentence detection, and coreference resolution. These allow general specifications of the elements to process, rather than requiring fixed formats. For XML and HTML input, input and output are streamed at the level of the lowest element containing text content.
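The filter idea can be illustrated with the JDK's own SAX classes. The following is a minimal sketch, not LingPipe's filter classes: the class name TextAnnotatingFilterSketch, the wrapperTag parameter, and the segmenter function are all hypothetical names introduced for illustration. It assumes a caller-supplied set of element names and wraps each piece of text found inside those elements in a new element, flushing as soon as a tag boundary is reached so that text is handled at the lowest enclosing element.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Illustrative only: a generic SAX filter, not part of LingPipe. */
class TextAnnotatingFilterSketch extends XMLFilterImpl {
    private final Set<String> targetElements;                 // elements whose text is annotated
    private final Function<String, List<String>> segmenter;   // hypothetical annotator
    private final String wrapperTag;                          // tag wrapped around each piece
    private final StringBuilder buffer = new StringBuilder();
    private int depthInsideTarget = 0;

    TextAnnotatingFilterSketch(Set<String> targetElements,
                               Function<String, List<String>> segmenter,
                               String wrapperTag) {
        this.targetElements = targetElements;
        this.segmenter = segmenter;
        this.wrapperTag = wrapperTag;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        flush(); // emit any text seen before this tag
        if (depthInsideTarget > 0 || targetElements.contains(qName)) {
            depthInsideTarget++;
        }
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        flush(); // emit text belonging to the element being closed
        if (depthInsideTarget > 0) {
            depthInsideTarget--;
        }
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (depthInsideTarget > 0) {
            buffer.append(ch, start, length); // hold text until the next tag boundary
        } else {
            super.characters(ch, start, length); // pass other text through untouched
        }
    }

    /** Segments the buffered text and emits each piece inside the wrapper element. */
    private void flush() throws SAXException {
        if (buffer.length() == 0) {
            return;
        }
        String text = buffer.toString();
        buffer.setLength(0);
        for (String piece : segmenter.apply(text)) {
            super.startElement("", wrapperTag, wrapperTag, new AttributesImpl());
            char[] cs = piece.toCharArray();
            super.characters(cs, 0, cs.length);
            super.endElement("", wrapperTag, wrapperTag);
        }
    }
}
```

To use a filter like this, set its parent to an XMLReader, register a ContentHandler on it, and call parse. Because the target elements are passed in as a set, the same filter can serve documents with different element names.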
Named-Entity Extraction
Named entity extraction employs a generative statistical model based on word trigrams and tag bigrams. The detector can be trained from XML and plain text sources for new genres and new languages.
Sentence-Boundary Detection
Sentence-boundary detection is sensitive to context and has configurable models for abbreviations and sentence-ending tokens.
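To make the roles of the two configurable models concrete, here is a small sketch under simplifying assumptions. It is not LingPipe's detector: the class SentenceSplitterSketch is a hypothetical name, tokens are split on whitespace only, and "context" is reduced to a single check that the next token does not start with a lowercase letter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only: a toy sentence splitter, not LingPipe's model. */
class SentenceSplitterSketch {
    private final Set<String> abbreviations; // lowercase, without the trailing period, e.g. "dr"
    private final Set<String> enders;        // single-character sentence enders, e.g. "."

    SentenceSplitterSketch(Set<String> abbreviations, Set<String> enders) {
        this.abbreviations = abbreviations;
        this.enders = enders;
    }

    List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        String[] tokens = text.trim().split("\\s+");
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            String tok = tokens[i];
            if (tok.isEmpty()) {
                continue; // only happens for empty input
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(tok);
            String next = (i + 1 < tokens.length) ? tokens[i + 1] : null;
            if (isBoundary(tok, next)) {
                sentences.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            sentences.add(current.toString()); // trailing text without an ender
        }
        return sentences;
    }

    private boolean isBoundary(String tok, String next) {
        String last = tok.substring(tok.length() - 1);
        if (!enders.contains(last)) {
            return false;
        }
        String stem = tok.substring(0, tok.length() - 1).toLowerCase();
        if (last.equals(".") && abbreviations.contains(stem)) {
            return false; // a known abbreviation does not end a sentence
        }
        // Context check: a following lowercase word suggests the sentence continues.
        return next == null || !Character.isLowerCase(next.charAt(0));
    }
}
```

With abbreviations {"dr", "mr"} and enders {".", "?", "!"}, the input "Dr. Smith arrived. He sat down." is split into "Dr. Smith arrived." and "He sat down." Swapping in a different abbreviation set adapts the same logic to another genre.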
Within-Document Coreference
Within-document coreference works on text in which sentence boundaries and named entities have been annotated, producing a unique identifier for each cluster of mentions that corefer. Coreference can be tuned with domain- and tag-specific matching functions.
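The idea of tag-specific matching functions that yield cluster identifiers can be sketched as follows. This is a deliberately naive greedy linker, not LingPipe's coreference algorithm: the names CorefSketch and Mention are hypothetical, and the assumption is that a mention joins the cluster of the most recent earlier mention with the same tag that the tag's matching function accepts.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Illustrative only: greedy mention clustering, not LingPipe's algorithm. */
class CorefSketch {
    /** A detected mention: its surface text and its entity tag. */
    static final class Mention {
        final String text;
        final String tag;

        Mention(String text, String tag) {
            this.text = text;
            this.tag = tag;
        }
    }

    // Matching functions take (earlier mention text, later mention text).
    private final Map<String, BiPredicate<String, String>> matchersByTag;
    // Fallback for tags without a specific matcher: case-insensitive equality.
    private final BiPredicate<String, String> fallback = String::equalsIgnoreCase;

    CorefSketch(Map<String, BiPredicate<String, String>> matchersByTag) {
        this.matchersByTag = matchersByTag;
    }

    /** Returns one cluster identifier per mention, in document order. */
    int[] resolve(List<Mention> mentions) {
        int[] ids = new int[mentions.size()];
        int nextId = 0;
        for (int i = 0; i < mentions.size(); i++) {
            Mention m = mentions.get(i);
            BiPredicate<String, String> matcher = matchersByTag.getOrDefault(m.tag, fallback);
            ids[i] = -1;
            // Search earlier mentions, most recent first, for a compatible antecedent.
            for (int j = i - 1; j >= 0 && ids[i] < 0; j--) {
                Mention earlier = mentions.get(j);
                if (earlier.tag.equals(m.tag) && matcher.test(earlier.text, m.text)) {
                    ids[i] = ids[j];
                }
            }
            if (ids[i] < 0) {
                ids[i] = nextId++; // no antecedent found: start a new cluster
            }
        }
        return ids;
    }
}
```

For example, with a PERSON matcher that accepts a later mention contained in an earlier one, "John Smith" and "Smith" share an identifier, while "Acme Corp" and "Acme", tagged ORGANIZATION and compared with the exact-match fallback, stay in separate clusters.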