Convert PDF to text file using Java


 


In this section, you will learn how to convert a PDF file to a text file using Java.


How to Convert PDF to text file in Java

This example uses the iText API (the com.lowagie packages). The PdfReader class opens resume.pdf, and the content stream of the first page is read as bytes. A PRTokeniser then walks those bytes, a StringBuffer collects every string token it finds, and the resulting string is written to the pdf.txt file. Because only the raw string operands are collected, the quality of the output depends on how the PDF encodes its text.

Here is the code:

import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

import com.lowagie.text.pdf.PRStream;
import com.lowagie.text.pdf.PRTokeniser;
import com.lowagie.text.pdf.PdfDictionary;
import com.lowagie.text.pdf.PdfName;
import com.lowagie.text.pdf.PdfReader;

public class ConvertPDFToTEXT {
    public static void main(String[] args) throws IOException {
        // Open the source PDF.
        PdfReader reader = new PdfReader("C:\\resume.pdf");
        try {
            // Fetch the dictionary of page 1.
            PdfDictionary dictionary = reader.getPageN(1);

            // Resolve the page's /Contents entry to its stream.
            // This cast assumes a single content stream; a page whose
            // /Contents is an array of streams is not handled here.
            PRStream stream = (PRStream) PdfReader.getPdfObject(dictionary.get(PdfName.CONTENTS));
            byte[] bytes = PdfReader.getStreamBytes(stream);

            // Collect every string token from the content stream.
            PRTokeniser tokenizer = new PRTokeniser(bytes);
            StringBuffer buffer = new StringBuffer();
            while (tokenizer.nextToken()) {
                if (tokenizer.getTokenType() == PRTokeniser.TK_STRING) {
                    buffer.append(tokenizer.getStringValue());
                }
            }

            // Write the collected text to pdf.txt; the writer is closed automatically.
            try (Writer writer = new FileWriter("pdf.txt")) {
                writer.write(buffer.toString());
            }
        } finally {
            reader.close();
        }
    }
}
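
To make the token-collecting step more concrete, here is a minimal sketch in plain Java, independent of iText, of how literal strings can be picked out of a page's content stream. The class name ContentStreamStrings, the method literalStrings and the sample content are made up for illustration. This is not how PRTokeniser is implemented, and it deliberately ignores hex strings and does not decode escapes such as \n or octal codes.

```java
import java.util.ArrayList;
import java.util.List;

public class ContentStreamStrings {

    // Hypothetical helper: returns the literal strings, written as (...),
    // that appear in the text of a PDF content stream.
    // Simplifications: a backslash just makes the next character literal
    // (enough for \( \) and \\), and hex strings <...> are skipped.
    public static List<String> literalStrings(String content) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < content.length()) {
            if (content.charAt(i) != '(') {
                i++;
                continue;
            }
            StringBuilder sb = new StringBuilder();
            int depth = 1; // balanced parentheses may nest inside a string
            i++;
            while (i < content.length() && depth > 0) {
                char c = content.charAt(i);
                if (c == '\\' && i + 1 < content.length()) {
                    sb.append(content.charAt(i + 1));
                    i += 2;
                    continue;
                }
                if (c == '(') depth++;
                if (c == ')') depth--;
                if (depth > 0) sb.append(c);
                i++;
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up content stream: two text-showing operators (Tj) with string operands.
        String content = "BT /F1 12 Tf 72 720 Td (Hello) Tj ( World) Tj ET";
        System.out.println(String.join("", literalStrings(content))); // prints "Hello World"
    }
}
```

Operators such as BT, Tf and Tj are skipped and only the operands in parentheses are kept, which mirrors what the TK_STRING check does in the program above.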
  
