
Counting Repeated Words Using a Regular Expression

This example shows how to count how many times each word occurs in a file using regular expressions.


The program reads the file through a memory-mapped buffer, decodes it into characters, splits each line into words and counts the occurrences of every word. The steps involved in reading the file are described below:-

String file = "/home/girish/Desktop/D.txt":- Declares the path of the file whose words are to be counted.

FileInputStream inputStream = new FileInputStream(file):- Creates a file input stream that reads bytes from the file.

FileChannel fileChannel = inputStream.getChannel():- Gets the FileChannel object that is associated with the file input stream.

MappedByteBuffer mbb = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileLength):- Maps the file into memory in read-only mode, starting at position 0 and covering fileLength bytes. A MappedByteBuffer is a byte buffer whose content is a memory-mapped region of the file.

Charset charset = Charset.forName("ISO-8859-1"):- Gets the ISO-8859-1 charset, which is used to create the decoder that converts the mapped bytes into characters.
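The steps above can be tried in isolation with a small sketch like the following. It is only an illustration under a few assumptions: the class name MappedReadSketch, the helper readMapped and the temporary sample file are hypothetical and are not part of the original program. It uses try-with-resources so the stream and channel are closed automatically.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class MappedReadSketch {

    // Hypothetical helper: maps the whole file read-only and decodes it
    // as ISO-8859-1 text, following the steps described above.
    static String readMapped(String file) throws IOException {
        try (FileInputStream inputStream = new FileInputStream(file);
             FileChannel fileChannel = inputStream.getChannel()) {
            long fileLength = fileChannel.size();
            MappedByteBuffer mbb =
                fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileLength);
            Charset charset = Charset.forName("ISO-8859-1");
            CharBuffer charBuffer = charset.newDecoder().decode(mbb);
            return charBuffer.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary sample file stands in for D.txt so the
        // sketch can run on any machine.
        Path sample = Files.createTempFile("words", ".txt");
        sample.toFile().deleteOnExit();
        Files.write(sample, "Angel Angel Ange".getBytes(Charset.forName("ISO-8859-1")));
        System.out.println(readMapped(sample.toString()));
    }
}
```

Keeping the file length as a long avoids the int cast, although a single mapping is still limited to Integer.MAX_VALUE bytes.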

 

Calculatingword.java

import java.io.FileInputStream;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Calculatingword {

  public static void main(String args[]) throws Exception {
    String file = "/home/girish/Desktop/D.txt";

    // Open the file and map its whole content into memory (read-only).
    FileInputStream inputStream = new FileInputStream(file);
    FileChannel fileChannel = inputStream.getChannel();
    System.out.println(fileChannel);
    int fileLength = (int) fileChannel.size();
    MappedByteBuffer mbb = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0,
        fileLength);

    // Decode the mapped bytes into characters.
    Charset charset = Charset.forName("ISO-8859-1");
    CharsetDecoder cd = charset.newDecoder();
    CharBuffer charBuffer = cd.decode(mbb);
    System.out.println("========================" +
        "File from where words are counted=============");
    System.out.println(charBuffer);
    System.out.println("========================" +
        "===================================");

    // With MULTILINE, ".*$" matches the text one line at a time.
    Pattern pattern = Pattern.compile(".*$", Pattern.MULTILINE);
    // Words are separated by a punctuation or whitespace character.
    Pattern patternword = Pattern.compile("[\\p{Punct}\\s]");

    Matcher Lmatcher = pattern.matcher(charBuffer);
    // A TreeMap keeps the words sorted when the result is printed.
    Map<String, Integer> map = new TreeMap<String, Integer>();

    while (Lmatcher.find()) {
      CharSequence sequence = Lmatcher.group();
      String word[] = patternword.split(sequence);

      for (int i = 0, n = word.length; i < n; i++) {
        // Skip the empty strings produced by adjacent separators.
        if (word[i].length() > 0) {
          Integer times = map.get(word[i]);
          if (times == null) {
            times = 1;
          } else {
            times = times + 1;
          }
          map.put(word[i], times);
        }
      }
    }
    inputStream.close();
    System.out.println("No of times words repeted are :" + "\n" + map);
  }
}
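
The splitting and counting part of the listing can also be tried on an in-memory string, without a file. The following is a minimal sketch that reuses the same two patterns. The class name WordCountSketch, the method countWords and the sample text are hypothetical, and Map.merge (Java 8 or later) is swapped in for the explicit null check.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCountSketch {

    // Same patterns as in the listing: one match per line, and a single
    // punctuation or whitespace character as the word separator.
    private static final Pattern LINE = Pattern.compile(".*$", Pattern.MULTILINE);
    private static final Pattern SEPARATOR = Pattern.compile("[\\p{Punct}\\s]");

    // Hypothetical helper: returns each word with its number of occurrences,
    // sorted by word because a TreeMap is used.
    static Map<String, Integer> countWords(CharSequence text) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher lineMatcher = LINE.matcher(text);
        while (lineMatcher.find()) {
            for (String word : SEPARATOR.split(lineMatcher.group())) {
                // Adjacent separators produce empty strings; skip them.
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample text chosen for illustration only.
        System.out.println(countWords("Angel Angel Ange\nAnge, dcjk Angel"));
        // prints {Ange=2, Angel=3, dcjk=1}
    }
}
```

Because the separator pattern matches one character at a time, a comma followed by a space yields an empty string between the two separators, which is why the empty check is needed.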

Output of the program:-

======File from where words are counted=======
Angeles Angeles Angeles Angeles Angeles  
Angeles Angele  Angele  Angele Angele
Angele Angele Angele Angele Angele Angel 
Angel Angel Angel Angel Angel Angel 
Angel Angel Angel Angel Ange Ange
Ange Ange Ange Ange Ange Ange Ange
Ange dcjk
4645423 24221 224121 5245241 55241
542541 441 5541 41441
===================================
No of times words repeted are :
{224121=1, 24221=1, 41441=1, 441=1,
4645423=1, 5245241=1, 542541=1,
55241=1, 5541=1, Ange=10,
Angel=11, Angele=9, Angeles=6, dcjk=1}


Posted on: August 26, 2008
