In this section, you will learn how to read the word document file using POI library. The class HWPFDocument throw all of the Word file data and the class WordExtractor extract the text from the Word Document. The method getParagraphText() of WordExtractor class get the text from the word file as an array with one String per paragraph and displayed the data of word file on console.
Here is the code of ReadDoc.java:
|
import java.io.*; import org.apache.poi.hwpf.HWPFDocument; import org.apache.poi.hwpf.extractor.WordExtractor; public class ReadDocFile { public static void main(String[] args) { File file = null; WordExtractor extractor = null ; try { file = new File("c:\\New.doc"); FileInputStream fis=new FileInputStream(file.getAbsolutePath()); HWPFDocument document=new HWPFDocument(fis); extractor = new WordExtractor(document); String [] fileData = extractor.getParagraphText(); for(int i=0;i<fileData.length;i++){ if(fileData[i] != null) System.out.println(fileData[i]); } } catch(Exception exep){} } } |
If you are facing any programming issue, such as compilation errors or not able to find the code you are looking for.
Ask your questions, our development team will try to give answers to your questions.