Java Read .doc file using POI library


 

In this section, you will learn how to read a Word document (.doc) file using the Apache POI library. The HWPFDocument class holds all of the Word file data, and the WordExtractor class extracts the text from the Word document. The getParagraphText() method of the WordExtractor class returns the text of the Word file as an array with one String per paragraph, and the program prints each paragraph to the console.

Here is the code of ReadDocFile.java:

import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class ReadDocFile {
    public static void main(String[] args) {
        File file = new File("c:\\New.doc");
        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream(file)) {
            // HWPFDocument loads the binary .doc file
            HWPFDocument document = new HWPFDocument(fis);
            WordExtractor extractor = new WordExtractor(document);
            // One String per paragraph
            String[] fileData = extractor.getParagraphText();
            for (int i = 0; i < fileData.length; i++) {
                if (fileData[i] != null)
                    System.out.println(fileData[i]);
            }
        } catch (Exception exep) {
            // Report the failure instead of silently ignoring it
            exep.printStackTrace();
        }
    }
}
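
The paragraph strings returned by getParagraphText() may end with line terminators, so printing them with println can leave blank lines in the output. The following is a minimal sketch of one way to tidy the array before printing. ParagraphCleaner and its clean() method are hypothetical names introduced here for illustration and are not part of POI. The sample array in main() is stand-in data, used in place of a real call to extractor.getParagraphText().

```java
import java.util.ArrayList;
import java.util.List;

class ParagraphCleaner {

    // Hypothetical helper (not part of POI): skips null or blank entries
    // and strips surrounding whitespace, including trailing line terminators.
    static List<String> clean(String[] paragraphs) {
        List<String> result = new ArrayList<>();
        for (String paragraph : paragraphs) {
            if (paragraph == null) {
                continue;
            }
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data; in ReadDocFile this would be extractor.getParagraphText()
        String[] fileData = { "First paragraph\r\n", null, "   ", "Second paragraph\r\n" };
        for (String line : clean(fileData)) {
            System.out.println(line);
        }
    }
}
```

In ReadDocFile, the loop could iterate over clean(fileData) instead of the raw array, which also makes the explicit null check unnecessary.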
