Appending Strings

       

2003-04-21 The Java Specialists' Newsletter [Issue 068] - Appending Strings

Author: Dr. Heinz M. Kabutz



Welcome to the 68th edition of The Java(tm) Specialists' Newsletter, sent to 6400 Java Specialists in 95 countries. The softcover book of all newsletters up to the 65th issue is now available for purchase within South Africa. Please have a look at our webpage. I keep a copy of the book next to my bed, for obvious reasons: when I cannot wake up in the mornings, I read a few of the newsletters, which wakes up my brain and puts me in a good mood.

Since our last newsletter, we have had two famous Java authors join the ranks of subscribers. It gives me great pleasure to welcome Mark Grand and Bill Venners to our list of subscribers. Mark is famous for his three volumes of Java Design Patterns books. You will notice that I quote Mark in the brochure of my Design Patterns course. Bill is famous for his book Inside The Java Virtual Machine. Bill also does a lot of work training with Bruce Eckel.

Our last newsletter on BASIC Java produced gasps of disbelief. Some readers told me that they now wanted to unsubscribe, which of course I supported 100%. Others enjoyed it with me. It was meant in humour, as the warnings at the beginning of the newsletter clearly indicated.

For those living in Cape Town, South Africa, we are doing another Design Patterns Course in May 2003; please see the advert at the bottom of this newsletter. If you come on that course, you will also receive a free copy of the Java Specialists' Newsletters Book.

Appending Strings

The first thing that I look for when I am asked to find out why some code is slow is concatenation of Strings. When we concatenate Strings with +=, a whole lot of objects are constructed.

Before we can look at an example, we need to define a Timer class that we will use for measuring performance:

/**
 * Class used to measure the time that a task takes to execute.
 * The method "time" prints out how long it took and returns
 * the time.
 */
public class Timer {
  /**
   * This method runs the Runnable and measures how long it takes
   * @param r is the Runnable for the task that we want to measure
   * @return the time it took to execute this task
   */
  public static long time(Runnable r) {
    long time = -System.currentTimeMillis();
    r.run();
    time += System.currentTimeMillis();
    System.out.println("Took " + time + "ms");
    return time;
  }
}
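
The same measuring idea can also be sketched with System.nanoTime(), which is available from Java 5 onwards and is meant for measuring elapsed time. The class name NanoTimer and the method name timeNanos are made up for this illustration; they are not part of the newsletter's code, and the rest of this newsletter keeps using the Timer class above.

```java
/**
 * Hypothetical variant of the Timer class that measures elapsed
 * time with System.nanoTime() instead of currentTimeMillis().
 */
class NanoTimer {
  /**
   * Runs the task, prints the elapsed time in milliseconds and
   * returns the elapsed time in nanoseconds.
   */
  static long timeNanos(Runnable r) {
    long start = System.nanoTime();
    r.run();
    long elapsed = System.nanoTime() - start;
    System.out.println("Took " + (elapsed / 1000000) + "ms");
    return elapsed;
  }

  public static void main(String[] args) {
    timeNanos(new Runnable() {
      public void run() {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 100000; i++) {
          sb.append(i);
        }
        // use the result so that the work cannot be skipped
        System.out.println("Length = " + sb.length());
      }
    });
  }
}
```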

In the test case, we have three tasks that we want to measure. The first is a simple += String append, which turns out to be extremely slow. The second creates a StringBuffer and calls the append method of StringBuffer. The third method creates the StringBuffer with the correct size and then appends to that. After I have presented the code, I will explain what happens and why.

public class StringAppendDiff {
  public static void main(String[] args) {
    System.out.println("String += 10000 additions");
    Timer.time(new Runnable() {
      public void run() {
        String s = "";
        for(int i = 0; i < 10000; i++) {
          s += i;
        }
        // we have to use "s" in some way, otherwise a clever
        // compiler would optimise it away.  Not that I have
        // any such compiler, but just in case ;-)
        System.out.println("Length = " + s.length());
      }
    });

    System.out.println(
        "StringBuffer 300 * 10000 additions initial size wrong");
    Timer.time(new Runnable() {
      public void run() {
        StringBuffer sb = new StringBuffer();
        for(int i = 0; i < (300 * 10000); i++) {
          sb.append(i);
        }
        String s = sb.toString();
        System.out.println("Length = " + s.length());
      }
    });

    System.out.println(
        "StringBuffer 300 * 10000 additions initial size right");
    Timer.time(new Runnable() {
      public void run() {
        StringBuffer sb = new StringBuffer(19888890);
        for(int i = 0; i < (300 * 10000); i++) {
          sb.append(i);
        }
        String s = sb.toString();
        System.out.println("Length = " + s.length());
      }
    });
  }
}

This program does use quite a bit of memory, so you should set the maximum heap size to be quite large, for example 256MB. You can do that with the -Xmx256m flag. When we run this program, we get the following output:


String += 10000 additions
Length = 38890
Took 2203ms
StringBuffer 300 * 10000 additions initial size wrong
Length = 19888890
Took 2254ms
StringBuffer 300 * 10000 additions initial size right
Length = 19888890
Took 1562ms

The StringBuffer tests perform 300 times as many appends as the += test in roughly the same time (2254ms versus 2203ms), so using StringBuffer directly is about 300 times faster than using +=. Another observation that we can make is that if we set the initial size to be correct, it only takes 1562ms instead of 2254ms. This is because of the way that java.lang.StringBuffer works. When you create a new StringBuffer, it creates a char[] of size 16. When you append and there is no space left in the char[], a new char[] of roughly double the size is allocated and the existing characters are copied across. This means that if you size it first, you will reduce the number of char[]s that are constructed.
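
The effect of the initial size can be made visible through the public capacity() method. The following is only a sketch: the class name CapacityGrowth and the method countGrowths are invented for this example, and the exact growth steps depend on the JDK version, so the code counts capacity changes rather than assuming particular values.

```java
/**
 * Sketch: counts how often a StringBuffer has to grow its
 * internal array while characters are appended one at a time.
 */
class CapacityGrowth {
  /**
   * Appends "chars" characters to a StringBuffer created with the
   * given initial capacity and returns the number of times that
   * capacity() changed along the way.
   */
  static int countGrowths(int initialCapacity, int chars) {
    StringBuffer sb = new StringBuffer(initialCapacity);
    int growths = 0;
    int capacity = sb.capacity();
    for (int i = 0; i < chars; i++) {
      sb.append('x');
      if (sb.capacity() != capacity) {
        // the buffer was full, so a bigger array was allocated
        capacity = sb.capacity();
        growths++;
      }
    }
    return growths;
  }

  public static void main(String[] args) {
    System.out.println("Default capacity = "
        + new StringBuffer().capacity());
    System.out.println("Growths starting from 16: "
        + countGrowths(16, 100000));
    System.out.println("Growths when presized: "
        + countGrowths(100000, 100000));
  }
}
```

A presized buffer never has to grow, whereas the default-sized one is reallocated and copied several times on the way to 100000 characters.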

The time that the += String append takes is dependent on the compiler that you use to compile the code. I discovered this accidentally during my Java course last week, and much to my embarrassment, I did not know why this was. If you compile it from within Eclipse, you get the result above, and if you compile it with Sun's javac, you get the output below. I think that Eclipse uses jikes to compile the code, but I am not sure. Perhaps it even has an internal compiler?


String += 10000 additions
Length = 38890
Took 7912ms
StringBuffer 300 * 10000 additions initial size wrong
Length = 19888890
Took 2634ms
StringBuffer 300 * 10000 additions initial size right
Length = 19888890
Took 1822ms

Why the difference between compilers?

This took some head-scratching, resulting in my fingers being full of wood splinters. I started by writing a class that did the basic String append with +=.

public class BasicStringAppend {
  public BasicStringAppend() {
    String s = "";
    for(int i = 0; i < 100; i++) {
      s += i;
    }
  }
}

When in doubt about what the compiler does, disassemble the classes. Even when I disassembled them, it took a while before I figured out what the difference was and why it was important. The part where they differ is at offsets 16 to 20. You can disassemble a class with the javap tool that is in the bin directory of your Java installation. Use the -c parameter:


javap -c BasicStringAppend

Compiled with Eclipse:
Compiled from BasicStringAppend.java
public class BasicStringAppend extends java.lang.Object {
    public BasicStringAppend();
}

Method BasicStringAppend()
   0 aload_0
   1 invokespecial #9 <Method java.lang.Object()>
   4 ldc #11 <String "">
   6 astore_1
   7 iconst_0
   8 istore_2
   9 goto 34
  12 new #13 <Class java.lang.StringBuffer>
  15 dup
  16 aload_1
  17 invokestatic #19 <Method java.lang.String valueOf(java.lang.Object)>
  20 invokespecial #22 <Method java.lang.StringBuffer(java.lang.String)>
  23 iload_2
  24 invokevirtual #26 <Method java.lang.StringBuffer append(int)>
  27 invokevirtual #30 <Method java.lang.String toString()>
  30 astore_1
  31 iinc 2 1
  34 iload_2
  35 bipush 100
  37 if_icmplt 12
  40 return

Compiled with Sun's javac:
Compiled from BasicStringAppend.java
public class BasicStringAppend extends java.lang.Object {
    public BasicStringAppend();
}

Method BasicStringAppend()
   0 aload_0
   1 invokespecial #1 <Method java.lang.Object()>
   4 ldc #2 <String "">
   6 astore_1
   7 iconst_0
   8 istore_2
   9 goto 34
  12 new #3 <Class java.lang.StringBuffer>
  15 dup
  16 invokespecial #4 <Method java.lang.StringBuffer()>
  19 aload_1
  20 invokevirtual #5 <Method java.lang.StringBuffer append(java.lang.String)>
  23 iload_2
  24 invokevirtual #6 <Method java.lang.StringBuffer append(int)>
  27 invokevirtual #7 <Method java.lang.String toString()>
  30 astore_1
  31 iinc 2 1
  34 iload_2
  35 bipush 100
  37 if_icmplt 12
  40 return

Instead of explaining what every line does (which I hope should not be necessary on a Java Specialists' Newsletter), I present the equivalent Java code for both IBM's Eclipse and Sun. The difference, which corresponds to the difference in the disassembled code, is in the assignment to s inside the loop:

public class IbmBasicStringAppend {
  public IbmBasicStringAppend() {
    String s = "";
    for(int i = 0; i < 100; i++) {
      s = new StringBuffer(String.valueOf(s)).append(i).toString();
    }
  }
}

public class SunBasicStringAppend {
  public SunBasicStringAppend() {
    String s = "";
    for(int i = 0; i < 100; i++) {
      s = new StringBuffer().append(s).append(i).toString();
    }
  }
}

It does not actually matter which compiler is better; either is terrible, since both construct a new StringBuffer and a new String on every pass through the loop. The answer is to avoid += with Strings wherever possible.
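
As a rough sketch of what avoiding += looks like in practice, the loop can keep a single StringBuffer and call toString() once at the end. The class name AppendWithBuffer and both method names are invented for this illustration.

```java
/**
 * Sketch: the same concatenation written with += and with a
 * single StringBuffer that is converted to a String at the end.
 */
class AppendWithBuffer {
  /** Slow version: every += builds a brand-new String. */
  static String concatWithPlus(int n) {
    String s = "";
    for (int i = 0; i < n; i++) {
      s += i;
    }
    return s;
  }

  /** One StringBuffer for the whole loop, one String at the end. */
  static String concatWithBuffer(int n) {
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < n; i++) {
      sb.append(i);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String a = concatWithPlus(10000);
    String b = concatWithBuffer(10000);
    System.out.println("Same result: " + a.equals(b));
    System.out.println("Length = " + b.length());
  }
}
```

Both methods return the same String; only the number of temporary objects created along the way differs.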

Throw the used StringBuffers away!

You should never reuse a StringBuffer object. Construct it, fill it, convert it to a String, and then throw it away.

Why is this? StringBuffer contains a char[] which holds the characters to be used for the String. When you call toString() on the StringBuffer, does it make a copy of the char[]? No, it assumes that you will throw the StringBuffer away and constructs a String with a pointer to the same char[] that is contained inside StringBuffer! If you do change the StringBuffer after creating a String, it makes a copy of the char[] and uses that internally. Do yourself a favour and read the source code of StringBuffer - it is enlightening.
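
Whatever StringBuffer does internally, the String returned by toString() must never change afterwards. The sketch below shows only that observable behaviour; the class and method names are made up, and whether the char[] is shared or copied depends on the JDK version and cannot be seen from this code.

```java
/**
 * Sketch: a String taken from a StringBuffer stays the same
 * even when the StringBuffer is modified afterwards.
 */
class ToStringSnapshot {
  /**
   * Takes a String from the buffer, then keeps changing the
   * buffer and returns the String that was taken first.
   */
  static String snapshotThenModify() {
    StringBuffer sb = new StringBuffer();
    sb.append("abc");
    String s = sb.toString();
    sb.append("def");      // must not affect s
    sb.setCharAt(0, 'X');  // must not affect s either
    return s;
  }

  public static void main(String[] args) {
    System.out.println(snapshotThenModify());
  }
}
```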

But it gets worse than this. In JDK 1.4.1, Sun changed the way that setLength() works. Before 1.4.1, it was safe to do the following:


  ... // StringBuffer sb defined somewhere else
  sb.append(...);
  sb.append(...);
  sb.append(...);
  String s = sb.toString();
  sb.setLength(0);

The code of setLength pre-1.4.1 used to contain the following snippet of code:

if (count < newLength) {
  // *snip*
} else {
  count = newLength;
  if (shared) {
    if (newLength > 0) {
      copy();
    } else {
      // If newLength is zero, assume the StringBuffer is being
      // stripped for reuse; Make new buffer of default size
      value = new char[16];
      shared = false;
    }
  }
}

It was replaced in the 1.4.1 version with:

if (count < newLength) {
  // *snip*
} else {
  count = newLength;
  if (shared) copy();
}

Therefore, if you reuse a StringBuffer in JDK 1.4.1, and any one of the Strings created with that StringBuffer is big, all future Strings will have the same size char[]. This is not very kind of Sun, since it causes bugs in many libraries. However, my argument is that you should not have reused StringBuffers anyway, since you will have less overhead simply creating a new one than setting the size to zero again.
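
A rough way to see the cost of reuse, assuming JDK 1.4.1 or later, is to compare capacity() of a reset buffer with that of a fresh one. The class name ReuseVersusFresh and its methods are invented for this sketch, and it shows only the buffer's own capacity, not the internal arrays of the Strings created from it.

```java
/**
 * Sketch: a StringBuffer that once held a big String keeps its
 * big capacity after setLength(0); a fresh one starts small.
 * Assumes JDK 1.4.1 or later.
 */
class ReuseVersusFresh {
  /** Capacity of a buffer that held "chars" characters and was reset. */
  static int capacityAfterReset(int chars) {
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < chars; i++) {
      sb.append('x');
    }
    String big = sb.toString();
    System.out.println("Big length = " + big.length());
    sb.setLength(0);      // "reuse" the buffer
    sb.append("small");
    return sb.capacity();
  }

  /** Capacity of a fresh buffer holding the same small content. */
  static int capacityOfFresh() {
    StringBuffer sb = new StringBuffer();
    sb.append("small");
    return sb.capacity();
  }

  public static void main(String[] args) {
    System.out.println("Reused capacity = "
        + capacityAfterReset(100000));
    System.out.println("Fresh capacity  = " + capacityOfFresh());
  }
}
```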

This memory leak was pointed out to me by Andrew Shearman during one of my courses; thank you very much! For more information, you can visit Sun's website.

When you read those posts, it becomes apparent that JDOM reuses StringBuffers extensively. It was probably a bit mean to change StringBuffer's setLength() method, although I think that it is not a bug. It is simply highlighting bugs in many libraries.

For those of you that use JDOM, I hope that JDOM will be fixed soon to cater for this change in the JDK. For the rest of us, let us remember to throw away used StringBuffers.

I hope to see some of you on the course next month where we look at some Java Design Patterns.

So long...

Heinz

This material from The Java(tm) Specialists' Newsletter by Maximum Solutions (South Africa). Please contact Maximum Solutions for more information.

       
