Convert pdf to text file using Java
Posted on: October 23, 2009 at 12:00 AM
In this section, you will learn how to convert a PDF file to a text file using Java.

How to Convert pdf to text file in Java

This example uses the iText API. It opens resume.pdf with the PdfReader class, reads the content stream of the first page as bytes, and tokenizes those bytes with PRTokeniser. Every string token is appended to a StringBuffer, and the resulting string is written to the file pdf.txt. Note that this approach only reads page 1 and copies the raw string operands, so it works best on simple PDFs; text drawn with custom font encodings may come out garbled.

Here is the code:

import java.io.*;
import com.lowagie.text.pdf.*;

public class ConvertPDFToTEXT {
    public static void main(String[] args) throws IOException {
        // Open the source PDF and fetch the dictionary of its first page
        PdfReader reader = new PdfReader("C:\\resume.pdf");
        PdfDictionary dictionary = reader.getPageN(1);
        // Assumes /Contents is a single indirect reference, not an array of streams
        PRIndirectReference reference =
                (PRIndirectReference) dictionary.get(PdfName.CONTENTS);
        PRStream stream = (PRStream) PdfReader.getPdfObject(reference);
        // Decode the page's content stream into raw bytes
        byte[] bytes = PdfReader.getStreamBytes(stream);
        PRTokeniser tokenizer = new PRTokeniser(bytes);
        StringBuffer buffer = new StringBuffer();
        // Keep only the string tokens; operators and numbers are skipped
        while (tokenizer.nextToken()) {
            if (tokenizer.getTokenType() == PRTokeniser.TK_STRING) {
                buffer.append(tokenizer.getStringValue());
            }
        }
        reader.close();
        // Write the collected text to pdf.txt and always close the file
        Writer writer = new FileWriter("pdf.txt");
        try {
            writer.write(buffer.toString());
        } finally {
            writer.close();
        }
    }
}
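
To make the token-filtering step easier to follow, here is a small sketch in plain Java (no iText required) of what collecting the string tokens from a content stream amounts to. The class ContentStreamStrings and the method extractLiteralStrings are hypothetical names introduced for illustration; this is not how PRTokeniser is implemented. It assumes the stream has already been decoded to text, handles only literal strings written as ( ... ), and ignores hex strings and octal escapes.

```java
import java.util.ArrayList;
import java.util.List;

public class ContentStreamStrings {

    // Collects the literal strings "( ... )" found in a decoded content stream.
    // Simplified on purpose: hex strings <...> and octal escapes are not decoded.
    static List<String> extractLiteralStrings(String content) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < content.length()) {
            if (content.charAt(i) != '(') {
                i++; // not inside a string: skip operators, names and numbers
                continue;
            }
            StringBuilder current = new StringBuilder();
            int depth = 1; // balanced parentheses may be nested inside a string
            i++;           // skip the opening parenthesis
            while (i < content.length() && depth > 0) {
                char c = content.charAt(i++);
                if (c == '\\' && i < content.length()) {
                    // Backslash escape: keep the next character as-is
                    current.append(content.charAt(i++));
                } else if (c == '(') {
                    depth++;
                    current.append(c);
                } else if (c == ')') {
                    depth--;
                    if (depth > 0) {
                        current.append(c);
                    }
                } else {
                    current.append(c);
                }
            }
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // A made-up content stream that shows two strings with the Tj operator
        String content = "BT /F1 12 Tf 72 720 Td (Hello ) Tj (World \\(PDF\\)) Tj ET";
        StringBuilder text = new StringBuilder();
        for (String s : extractLiteralStrings(content)) {
            text.append(s);
        }
        System.out.println(text); // prints: Hello World (PDF)
    }
}
```

The sketch shows why the output of the program above has no line breaks or layout: positioning operators such as Td are discarded and only the string operands are concatenated.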
  
