HTMLParserText

Thanks for the file that you sent me. My problem, though, is that I want to convert it from an HTML file to a TXT file, which means I have to remove all the tags. I need your help with this.

With my best regards

April 29, 2010 at 3:14 PM

Hi Friend,

Try the following code:

import java.io.*;

public class ReadHTML {
    public static void main(String[] args) {
        StringBuilder html = new StringBuilder();

        // Read the whole HTML file line by line.
        try (BufferedReader in = new BufferedReader(new FileReader("applet.html"))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append("\n");
            }
        } catch (IOException e) {
            e.printStackTrace();
            return;
        }

        // Remove every tag, i.e. everything from '<' up to the next '>'.
        String text = html.toString().replaceAll("<[^>]*>", "");

        // Write the remaining plain text to hello.txt.
        try (BufferedWriter out = new BufferedWriter(new FileWriter("hello.txt"))) {
            out.write(text);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
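
Simple string replacement like the code above can trip over real-world pages (for example, a '>' inside an attribute value). If that happens, you could try the HTML parser that ships with the JDK instead. Below is only a rough sketch: the class name HtmlTextExtractor and the method extractText are names made up for this example, and it assumes the HTML has already been read into a String.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of the original answer.
class HtmlTextExtractor {

    // Returns the text found between the tags, one text run per line.
    static String extractText(String html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The parser calls handleText for each run of text it finds;
        // tags themselves are reported through other callbacks, which we ignore.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data).append('\n');
            }
        };

        // 'true' tells the parser to ignore any charset declared in the document.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>";
        System.out.println(extractText(html));
    }
}
```

You would call extractText with the contents of applet.html and write the result to hello.txt in the same way as in the code above.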

Hope this helps.
Thanks








