Using Unicode Variable Names

       

2001-11-23 The Java Specialists' Newsletter [Issue 036] - Using Unicode Variable Names

Author: Dr. Heinz M. Kabutz

If you are reading this, and have not subscribed, please consider doing it now by going to our subscribe page. You can subscribe either via email or RSS.


Welcome to the 36th edition of "The Java(tm) Specialists' Newsletter". This week, we will look at the strange things that happen when we try to use unicode characters in our code.

I am sitting outside in my garden, with beautiful sunshine and a pitbull terrier at my command ;-) Approximately a month ago, the biggest software vendor in South Africa went bankrupt, severely affecting the availability of software in this country. Fortunately for me, I have friends in convenient places: I purchased the software that I needed (Dragon NaturallySpeaking) from Amazon in Germany and had it shipped to infor AG, who I have spoken about in other newsletters - they very kindly shipped it down to the end of the earth.

As a result of using Dragon NaturallySpeaking, you will probably notice that my newsletters will have an even more conversational style than before. I am always looking at ways in which I can improve my newsletters and serve you better. Please remember to forward this newsletter to friends and colleagues who are interested in Java.

A special welcome to country No 56, Malta! My wife's previous boss at a hotel was the Maltese ambassador for Cape Town, which was really cool, as he had diplomatic immunity from parking fines and speeding fines. Mind you, traffic laws are rather lax in this country: I have only had one speeding fine in my life, and I drive an Alfa Romeo!

South Africa has just become the cheapest country in the world! We are the first country where a Big Mac costs less than US$ 1. It is cheaper here even than in the Philippines and China. I had a good response to my advert for my Java Course (thank you for your patience in this regard) and so I definitely want to develop the idea of running courses in South Africa, combined with a holiday :-)

How do you go from being an OO beginner to an OO guru? Simple answer: Experience! But what if you can't wait 10 years to get that experience? Simple answer: Design Patterns! How can you learn Design Patterns in a relaxed setting from someone who has used them in the real world? Simple answer: Ask about my new course "Design Patterns - The Timeless Way of Coding".

1707 members are currently subscribed from 56 countries

Using Unicode Variable Names

A few months ago, I was reading a book written by the authors of Java, when I stumbled across a piece of code that was using Unicode characters as variable names. Being the curious type, I immediately tried writing a piece of code that used funny characters. Easier said than done! I don't know of any Java IDE that supports Unicode. The common e-mail systems in this world would also choke like a dog on a chicken bone if I sent you a newsletter containing Unicode characters ;-)

Before I get into how we could use Unicode characters in our variables, let's just take a step back and think about it: Imagine being called in by a Japanese company that has a memory leak in its program and wants you to fix it (one of the most common tasks I have been asked to perform), and imagine that the company uses Japanese characters for its variable names. Yes, it would compile if you follow the ideas in this newsletter, but what would the result be for me? I would probably pack my bags and head back home! It's bad enough having to read code where the variable names are in German or in Afrikaans; I cannot imagine trying to understand code where I don't even know the characters used in variable names!

Since I could not find an IDE that supported Unicode, my first job was to write a Unicode editor. Also easier said than done. I had learned many years ago that Writers and Readers are used for Unicode characters, but I had never really used Unicode before. My first attempt at reading Unicode files looked something like this:

public void load() throws IOException {
  BufferedReader in = new BufferedReader(new FileReader(filename));
  String s;
  while((s = in.readLine()) != null) {
    // ...
  }
}

Did you know that FileReader extends InputStreamReader? In its constructor it constructs a FileInputStream that it passes to its parent. The InputStreamReader has a constructor that takes as argument the encoding used for reading files. FileReader unfortunately does not expose the constructor that takes the encoding as an argument; it simply uses an operating-system dependent default encoding. One cannot but wonder what the author of the FileReader had been smoking the day he/she wrote that code ...

(Actually, when I wrote the Sun Microsystems Java programmer examination a few years ago, the only non-GUI question that I got wrong was a question relating to reading ISO-8859-1 data. Perhaps there has always been a hole in my knowledge regarding this topic.)
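To make the default-encoding problem concrete, here is a minimal sketch that decodes the same two bytes under two different encodings. The class and method names (DefaultCharsetDemo, decode) are made up for illustration, and the sketch assumes a JDK that ships java.nio.charset.StandardCharsets (Java 7 or later), which did not exist when this newsletter was written.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original newsletter code.
final class DefaultCharsetDemo {
  // The Greek letter pi (U+03C0) encoded as UTF-16BE is the byte pair 0x03 0xC0.
  static final byte[] PI_UTF16BE = { 0x03, (byte) 0xC0 };

  // Decodes the bytes with an explicitly chosen charset.
  static String decode(byte[] bytes, Charset charset) {
    return new String(bytes, charset);
  }

  public static void main(String[] args) {
    // The platform default charset, which FileReader(String) relies on.
    System.out.println("Platform default: " + Charset.defaultCharset());

    String right = decode(PI_UTF16BE, StandardCharsets.UTF_16BE);
    String wrong = decode(PI_UTF16BE, StandardCharsets.ISO_8859_1);

    // One character when decoded correctly, two unrelated ones otherwise.
    System.out.println("As UTF-16BE:   " + right.length() + " char(s)");
    System.out.println("As ISO-8859-1: " + wrong.length() + " char(s)");
  }
}
```

The bytes never change; only the decoder's assumption does, which is why a reader that silently picks the platform default can turn a perfectly good file into garbage.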

Should you want to read a file in an encoding different from the platform default, you have to bypass FileReader and wrap a FileInputStream in an InputStreamReader yourself:

public void load() throws IOException {
  BufferedReader in = new BufferedReader(
    new InputStreamReader(
      new FileInputStream(filename), "UTF-16BE"));
  String s;
  while((s = in.readLine()) != null) {
    // ...
  }
}
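For readers on a newer JDK, the same idea can be sketched with java.nio.file.Files.newBufferedReader(Path, Charset) and try-with-resources (both Java 7 or later). Java 11 also added FileReader constructors that accept a Charset. The class name ExplicitCharsetReader and the temporary file are assumptions made for this example only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; a sketch, not the newsletter's own code.
final class ExplicitCharsetReader {
  // Reads all lines of the file, decoding with the charset the caller names.
  static List<String> readLines(Path file, Charset charset) throws IOException {
    List<String> lines = new ArrayList<>();
    // try-with-resources closes the reader even if readLine() throws.
    try (BufferedReader in = Files.newBufferedReader(file, charset)) {
      String s;
      while ((s = in.readLine()) != null) {
        lines.add(s);
      }
    }
    return lines;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("unicode-demo", ".txt");
    try {
      // Write one line containing the Greek letter pi, encoded as UTF-16BE.
      Files.write(tmp, "double \u03C0 = 3.14;\n".getBytes(StandardCharsets.UTF_16BE));
      for (String line : readLines(tmp, StandardCharsets.UTF_16BE)) {
        System.out.println(line.length() + " chars read");
      }
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

One difference worth knowing: a reader obtained from Files.newBufferedReader throws an IOException on malformed input, whereas InputStreamReader quietly substitutes a replacement character.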

Without further ado, here is the code for a Unicode text editor. It allows you to insert Unicode characters by entering their decimal values and pressing the appropriate button. For the design, I have followed an approach I saw a few years ago on jGuru, where all the GUI elements are created lazily. It makes the GUI code very nicely maintainable, as you never have to worry in what order elements are constructed.

import java.awt.*;
import java.awt.event.*;
import javax.swing.*;
import java.io.*;

public class UnicodeEditor extends JFrame {
  private JPanel buttonPanel;
  private JScrollPane editorPanel;
  private JTextArea editor;
  private final String filename;
  private final String encoding;

  public UnicodeEditor(String filename, String encoding)
      throws IOException {
    this.filename = filename;
    this.encoding = encoding;
    getContentPane().add(getButtonPanel(), BorderLayout.NORTH);
    getContentPane().add(getEditorPanel(), BorderLayout.CENTER);
    load();
  }

  protected JPanel getButtonPanel() {
    if (buttonPanel == null) {
      buttonPanel = new JPanel();
      JButton unicodeInsert = new JButton("Insert Unicode:");
      final JTextField unicodeField = new JTextField(8);
      JButton saveExit = new JButton("Save & Exit");
      unicodeInsert.addActionListener(new ActionListener() {
        public void actionPerformed(ActionEvent e) {
          getEditor().insert(
            "" + (char)Integer.parseInt(unicodeField.getText()),
            getEditor().getCaretPosition());
        }
      });
      saveExit.addActionListener(new ActionListener() {
        public void actionPerformed(ActionEvent e) {
          try {
            save();
            System.exit(0);
          } catch(IOException ex) { ex.printStackTrace(); }
        }
      });
      buttonPanel.add(unicodeInsert);
      buttonPanel.add(unicodeField);
      buttonPanel.add(saveExit);
    }
    return buttonPanel;
  }

  protected JTextArea getEditor() {
    if (editor == null) {
      editor = new JTextArea();
    }
    return editor;
  }

  protected JScrollPane getEditorPanel() {
    if (editorPanel == null) {
      editorPanel = new JScrollPane(getEditor());
    }
    return editorPanel;
  }

  // Reads the whole file into the text area, decoding with the chosen encoding.
  protected void load() throws IOException {
    try (BufferedReader in = new BufferedReader(new InputStreamReader(
        new FileInputStream(filename), encoding))) {
      StringBuilder buf = new StringBuilder();
      int i;
      while((i = in.read()) != -1) buf.append((char)i);
      getEditor().setText(buf.toString());
    } // the reader is closed here, even if reading fails
  }

  // Writes the text area contents back to the file in the same encoding.
  protected void save() throws IOException {
    try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream(filename), encoding))) {
      out.write(getEditor().getText());
    } // closing the writer also flushes the buffered text
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 1)
      throw new IllegalArgumentException(
        "usage: UnicodeEditor filename [encoding]");
    String encoding = (args.length == 2)?args[1]:"UTF-16BE";
    UnicodeEditor editor = new UnicodeEditor(args[0], encoding);
    editor.setSize(500,500);
    editor.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
    editor.setVisible(true); // show() is deprecated in later JDKs
  }
}
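One caveat about the "Insert Unicode:" button: the cast (char)Integer.parseInt(...) silently truncates any value above 65535 and throws a NumberFormatException on non-numeric input. A possible hardening is sketched below, assuming a JDK with code point support (Java 5 or later). The class CodePointInput and its method toText are hypothetical names, not part of the editor.

```java
// Hypothetical helper; a sketch of stricter input handling for the editor.
final class CodePointInput {
  // Converts a decimal code point such as "960" into the text to insert.
  // Returns null when the input is not a number or not a valid code point.
  static String toText(String decimal) {
    try {
      int codePoint = Integer.parseInt(decimal.trim());
      if (!Character.isValidCodePoint(codePoint)) {
        return null;
      }
      // toChars yields one char inside the Basic Multilingual Plane
      // and a surrogate pair (two chars) above it.
      return new String(Character.toChars(codePoint));
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(toText("960").length());    // 1
    System.out.println(toText("128512").length()); // 2
    System.out.println(toText("pi"));              // null
  }
}
```

In the UnicodeEditor above, such a helper could replace the cast inside the unicodeInsert listener, skipping the insert when it returns null.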

By default the editor uses the UTF-16BE format, which stands for 16-bit Unicode Transformation Format, big-endian byte order. You can specify any encoding when you start the editor, such as UTF-8, ISO-8859-1, etc. But, before we use this editor, we first need to have a file containing Unicode characters. I've written a code generator that generates two files, MathsSymbols.java and MathsSymbolsTest.java:

import java.io.*;
public class UnicodeVariableGenerator {
  public static void generateMathsSymbols() throws IOException {
    PrintWriter out = new PrintWriter(new OutputStreamWriter(
      new FileOutputStream("MathsSymbols.java"), "UTF-16BE"));
    out.println("public interface MathsSymbols {");
    out.print(  "  public static final double ");
    out.print((char)960);
    out.println(" = 3.14159265358979323846;");
    out.print(  "  public static final double ");
    out.print((char)949);
    out.println(" = 2.7182818284590452354;");
    out.println("}");
    out.close();
  }
  public static void generateMathsSymbolsTest() throws IOException {
    PrintWriter out = new PrintWriter(new OutputStreamWriter(
      new FileOutputStream("MathsSymbolsTest.java"), "UTF-16BE"));
    out.println("public class MathsSymbolsTest implements MathsSymbols {");
    out.println("  public static void main(String args[]) {");
    out.println("    System.out.println(\"The value of PI is: \" + \u03C0);");
    out.println("    System.out.println(\"The value of E is: \" + \u03B5);");
    out.println("  }");
    out.println("}");
    out.close();
  }
  public static void main(String[] args) throws IOException {
    generateMathsSymbols();
    generateMathsSymbolsTest();
  }
}
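The generator above produces the Greek letters in two ways: by casting a number to char, and by writing a \uXXXX escape. Because the compiler translates such escapes before it does anything else, an identifier can also be written as an escape in a plain ASCII source file, which needs no special editor and no -encoding flag. Here is a minimal sketch; the class EscapedIdentifiers and its helper isIdentifier are illustrative names only.

```java
// Hypothetical demo class; compiles from a plain ASCII source file.
final class EscapedIdentifiers {
  // The escape on the next line is translated by the compiler into the
  // single character U+03C0, so this constant is really named "pi" in Greek.
  static final double \u03C0 = Math.PI;

  // Reports whether the chars of name are legal in a Java identifier.
  // Reserved words and characters outside the BMP are not considered.
  static boolean isIdentifier(String name) {
    if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
      return false;
    }
    for (int i = 1; i < name.length(); i++) {
      if (!Character.isJavaIdentifierPart(name.charAt(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(\u03C0 == Math.PI);       // true
    System.out.println(isIdentifier("\u03C0"));  // true
    System.out.println(isIdentifier("2\u03C0")); // false
  }
}
```

The escaped form is much harder to read than the real character, so it only trades one inconvenience for another.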

I won't include the code for MathsSymbols.java and MathsSymbolsTest.java; please run the UnicodeVariableGenerator class to generate that code. I already bomb out enough mailing systems by sending my newsletters in HTML (*evil grin*), so there is no use in causing more trouble by using Unicode. Once you've run the UnicodeVariableGenerator, please load the MathsSymbols.java file with the UnicodeEditor, using UTF-16BE, and have a look at it: you should see the Greek symbol for PI.

The last "trick" you need to know about is how to compile the MathsSymbols.java and MathsSymbolsTest.java. If you open the files with notepad or vi, you will probably see a rather strangely formatted file, with two bytes being used per character. When you compile these files, you therefore have to specify the character encoding used:

javac -encoding UTF-16BE MathsSymbols*.java
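To see the two-bytes-per-character layout without opening the file in an editor, the following sketch dumps a short piece of text as hex in UTF-16BE and, for comparison, in UTF-8. The class EncodingDump and its method hex are hypothetical names, and StandardCharsets requires Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original newsletter code.
final class EncodingDump {
  // Renders the encoded form of text as space-separated hex bytes.
  static String hex(String text, Charset charset) {
    StringBuilder sb = new StringBuilder();
    for (byte b : text.getBytes(charset)) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String text = "d \u03C0"; // the letter d, a space, and Greek pi
    System.out.println(hex(text, StandardCharsets.UTF_16BE)); // 00 64 00 20 03 C0
    System.out.println(hex(text, StandardCharsets.UTF_8));    // 64 20 CF 80
  }
}
```

The zero byte in front of every ASCII character is what makes the generated files look strange in a tool that expects one byte per character.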

That's it! And it has kept me busy longer than just about all the other newsletters to try and get it right. Another interesting variation of this is where David Treves (whom I met through a really cool advanced Java chat list - JavaDesk on YahooGroups - where you get shouted at if you ask beginner questions) tried to write Hebrew text to a database and read it back. He doggedly tried to get it working until eventually he succeeded - after I had given up hope of ever figuring it out. Stay tuned for the next few weeks to see how he did it.

Until next week, when we celebrate our first anniversary as the most interesting Java newsletter on the Internet ;-)

Kind regards

Heinz


This material from The Java(tm) Specialists' Newsletter by Maximum Solutions (South Africa). Please contact Maximum Solutions for more information.

       


Copyright 2007. All rights reserved.