Java Read .doc file using POI library

In this section, you will learn how to read a Word document (.doc) file using the Apache POI library. The HWPFDocument class represents the entire contents of the Word file, and the WordExtractor class extracts the text from that document. The getParagraphText() method of WordExtractor returns the text as an array with one String per paragraph; the example below prints each paragraph to the console.

Here is the code of ReadDocFile.java:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class ReadDocFile {
    public static void main(String[] args) {
        File file = new File("c:\\New.doc");
        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream(file)) {
            // Load the whole Word document, then wrap it in a text extractor
            HWPFDocument document = new HWPFDocument(fis);
            WordExtractor extractor = new WordExtractor(document);
            // One String per paragraph
            String[] fileData = extractor.getParagraphText();
            for (int i = 0; i < fileData.length; i++) {
                if (fileData[i] != null) {
                    System.out.println(fileData[i]);
                }
            }
        } catch (IOException exep) {
            // Report the failure instead of silently ignoring it
            exep.printStackTrace();
        }
    }
}
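
The loop above prints each paragraph exactly as it comes back from getParagraphText(). The returned strings may carry trailing line terminators, and empty paragraphs can show up as blank entries, so a small clean-up step before printing can make the console output tidier. The following is only a minimal sketch of that idea: ParagraphCleaner and clean() are hypothetical names that are not part of POI, and the sample array in main() stands in for the result of extractor.getParagraphText().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: tidies an array shaped like the
// one WordExtractor.getParagraphText() returns before it is printed.
class ParagraphCleaner {

    // Returns the non-null, non-blank paragraphs with leading and trailing
    // whitespace (including line terminators) removed.
    static List<String> clean(String[] paragraphs) {
        List<String> result = new ArrayList<>();
        if (paragraphs == null) {
            return result;
        }
        for (String paragraph : paragraphs) {
            if (paragraph == null) {
                continue;
            }
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data (an assumption) standing in for extractor.getParagraphText()
        String[] fileData = {"First paragraph\r\n", null, "\r\n", "Second paragraph\r\n"};
        for (String paragraph : clean(fileData)) {
            System.out.println(paragraph);
        }
    }
}
```

In ReadDocFile, the same helper could be applied by passing fileData to ParagraphCleaner.clean() and looping over the returned list instead of the raw array.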

Posted on: October 22, 2009