Convert PDF to text file using Java
Posted on: October 23, 2009 at 12:00 AM
How to convert a PDF to a text file in Java

In this section, you will learn how to convert a PDF file to a text file in Java. We use the iText API for this purpose. To read the resume.pdf file, we use the PdfReader class. The content stream of the first page is read as bytes, a PRTokeniser picks out the string tokens, and a StringBuffer collects them into a single string, which is then written to the pdf.txt file. Note that the example reads only page 1 of the document.

Here is the code:

import java.io.*;
import com.lowagie.text.pdf.*;

public class ConvertPDFToTEXT {
    public static void main(String[] args) throws IOException {
        PdfReader reader = new PdfReader("C:\\resume.pdf");
        try {
            // Content stream of page 1. The cast assumes /Contents is a single
            // indirect reference; it fails for pages whose /Contents is an array.
            PdfDictionary dictionary = reader.getPageN(1);
            PRIndirectReference reference =
                    (PRIndirectReference) dictionary.get(PdfName.CONTENTS);
            PRStream stream = (PRStream) PdfReader.getPdfObject(reference);
            byte[] bytes = PdfReader.getStreamBytes(stream);

            // Collect every string token found in the content stream.
            PRTokeniser tokenizer = new PRTokeniser(bytes);
            StringBuffer buffer = new StringBuffer();
            while (tokenizer.nextToken()) {
                if (tokenizer.getTokenType() == PRTokeniser.TK_STRING) {
                    buffer.append(tokenizer.getStringValue());
                }
            }

            // Write the collected text to pdf.txt, one byte per character;
            // try-with-resources closes the file even if writing fails.
            try (FileOutputStream fos = new FileOutputStream("pdf.txt")) {
                fos.write(buffer.toString().getBytes("ISO-8859-1"));
            }
        } finally {
            reader.close();
        }
    }
}
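
To see what the string tokens collected in the loop above look like, the following is a minimal sketch in plain Java that pulls literal strings, written as ( ... ), out of a page content stream. It is not iText code and not how PRTokeniser is implemented: the class name LiteralStringScanner and the sample stream are hypothetical, and the scanner handles only balanced parentheses and backslash-escaped characters, ignoring hex strings, octal escapes and font encodings.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of string-token extraction (hypothetical class,
// not part of iText). Recognises only literal strings of the form ( ... ).
class LiteralStringScanner {

    // Returns the literal strings found in the content stream, in order.
    static List<String> extract(String content) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < content.length()) {
            if (content.charAt(i) != '(') {
                i++;                       // skip operators, numbers, names
                continue;
            }
            StringBuilder current = new StringBuilder();
            int depth = 1;                 // balanced nested parentheses are allowed
            i++;                           // skip the opening '('
            while (i < content.length() && depth > 0) {
                char c = content.charAt(i++);
                if (c == '\\' && i < content.length()) {
                    // Keep the escaped character as-is; escapes such as \n
                    // or octal codes are deliberately not decoded here.
                    current.append(content.charAt(i++));
                } else if (c == '(') {
                    depth++;
                    current.append(c);
                } else if (c == ')') {
                    depth--;
                    if (depth > 0) {
                        current.append(c);
                    }
                } else {
                    current.append(c);
                }
            }
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical content stream with two text-showing operators.
        String stream = "BT /F1 12 Tf (Hello) Tj (, \\(PDF\\) world) Tj ET";
        System.out.println(String.join("", extract(stream)));
    }
}
```

Concatenating the extracted pieces, as the StringBuffer does in the listing, gives the page text in the order it appears in the stream. This order is not always the visual reading order, and no spaces or line breaks are added between pieces, which is one reason the output of this approach can look run together.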
  