Skip to content

Code Optimization

August 5, 2009
tags:

Definition
Code produced by standard algorithms can often be made to run faster or take less space or both. These things can be achieved through transformation called Code Optimization.
Compilers that apply these transformations are called Optimizing Compilers. It is more special to optimize the frequently executed part of a program.

Criteria for Optimization:
A transformation must preserve the meaning of a program. It cannot change the output procedure for any input parameters and it cannot introduce an error. Transformation should be on average, speed up programs and should be effective.

Why we need Code Optimization:
As a rule 90/10, 90 percentage of the program￯﾿ᄁ￯ᄒタ￯ᄒルs execution time is spent on executing the 10 percent of the code. Optimizing the other 90 percent of the of the program where the 10 percent of the execution time is spent has no noticeable effect on performance. If we are able to make the 90 percent of the code execute twice as fast, the program would be five percent faster than before. So the first task in optimizing the code is to identify the 10 percentage of the program that consumes most of the execution time.

General Optimization Techniques:
There are several common optimization techniques that apply independent of the being used. Some of the techniques, such as global register allocation, are difficult strategies to allocate machine resources such as CPU registers. Here are some techniques

Strength Reduction:
Strength reduction occurs when an operator is replaced by an equivalent operation that executes faster. The most common examples of strength reduction is using the shift operator to multiply and divide integers by power of 2.
Example:

+ x>>2 can be used instead x/4

+ x<<1 can be used instead x*2

Common sub expression elimination:
Common sub expression elimination removes redundant calculation.

Example:

double x = d* (len/ brt) * a;

double y = d* (len/ brt) * b;

Common sub expressions is calculated once and used for both calculations.

double calc = d* (len/ brt) * a;

double x = calc * a;

double y = calc * b;

Code Motion:
Code moves an operation that performs an operation or calculates an expression whose result does not change or it is invariant. The code is moved only executes when the result may change, rather than executing each time the result is required. This is most common with the loops, but it can also involve code repeated on each invocation of method.

Example:

for(int i; i< x.length ; i++) .{

x[i] * = Math.PI* Math. Cos(y) ;

}

Instead of using this we can use

double calc = Math . PI *Math. cos(y) ; for(int i; i< x.length ; i++)

x[i]* = calc;

Unrolling Loops:
Unrolling loops reduces the over head of loop control code by performing more than one operation each time through the loop, and consequently executing the fewer iterations. If we know if the length of x[] is always a multiple of two, we rewrite the loop as

Example:

double calc = Math . PI *Math. cos(y) ; for(int i; i< x.length ; i++){

x[i]* = calc;

`x[i+1];

}

Understanding the Benchmarks:

The purpose of benchmarking operations is to know which operations to avoid or to replace with alternate implementations. This varies between environments — and especially between interpreters and JIT compilers that convert Java's bytecodes into native code at run time. Interpreted code will want to take advantage of the bytecode instructions, while a JIT compiler often will have strengths and weaknesses separate from the bytecode. Here we can discuss about the ￯﾿ᄁ￯ᄒタ￯ᄒワThe instantiation benchmark￯﾿ᄁ￯ᄒタ?in detail.

The instantiation benchmark:

Instantiating an object is fairly expensive in terms of CPU cycles. Regarding this, discarded objects will need to be garbage collected at some point. The efficiency of garbage collection is difficult to measure, and the Benchmark applet makes no attempt to measure it. However, if you run the instantiation benchmark, most of the extra time between the "test" time and the total running time is spent garbage collecting, so this gives some idea of the impact garbage collection can have on the performance of code that instantiates lots of objects. Also, the longer the hierarchy chain for the object needs more constructors to be called. This adds to the instantiation overhead. If you add extra layers to your class hierarchy for increased abstraction, you'll also get increased instantiation time.

One other minor issue with instantiating objects: When an object is instantiated, all of its instance variables automatically are initialized to their default values, which not coincidentally is the value with all bits set to zero for all data types. If the instance variables are explicitly initialized to their default values, this will duplicate the default initialization, generate additional bytecodes, and make the instantiation take longer.

Example:

for(int i; i< x.length ; i++) .{
String str = new String( );
.
.
.
.
}
In the above example the string str is instantiated with in the loop so as a result it will be instantiated over the loop count. To avoid this we can rewrite it as

String str = new String( );

for(int i; i= 0)
{
sub = s.substring(i,j);
i = j + 1;
j = s.indexOf(delimiter, i); // Rest of substrings
}

sub = s.substring(i); // Last substring
}

Next you can use this method as
byte[] b =getAsciiBytes(s);

5. For converting bytes to String, do not use the String
String s= new String (bytebuf);

Instead declare it as
String s= new String (bytebuf, 0);

This method works only for ASCII data, but the performance is really good. In HTTP protocol most of the times we deal with with only ASCII data.

6. StringTokenizer is one of the most popularly used function in Java. It really gives the flexibility of parsing Strings with parsing capability of multiple string length delimiter. For example it can parse Strings separated by “bqr” as delimiter as follows,

StringTokenizer st =new StringTokenizer(myStr, “bqr”);

But then there’s always overhead during this operation. Since this has to compare each character, if it belongs in the set of delimiters.

Instead you can you this method, if Strings need to be delimited only by a single character,

public static void mySimpleTokenizer(String s, String delimiter)
{
String sub = null;
int i =0;
int j =s.indexOf(delimiter); // First substring

while( j >= 0) {
sub = s.substring(i,j);
i = j + 1;
j = s.indexOf(delimiter, i); // Rest of substrings
}
sub = s.substring(i); // Last substring
}

The above method works almost 4 times faster than StringTokenizer, because of less overhead. But use this method with caution, since you might be loosing maintainability of your program.

7. Don’t repeat the same function in conditional statements. For Example instead of writing

for (int i=0 ; i < s.length; i++) {
charc c =s.charAt(i);
}

Write it as ,

int j =str.length();
for (int i =0 ; i < j; i++) {
charc c =s.charAt(i);
}

Doing this you have removed a Extra overhead on Conditional Statement.

8. In the String Class prefer charAt() method instead of startsWith() method. String class provides this method to decide if the String starts starts with given Sub string. From performance perspective, startWith() makes quite a few comparisons preparing itself to compare it's prefix with another string.

So instead of writing,

if (s.startsWith("a") ) {
// Do something here
}

write it as,

if ("a" == s.charAt(0) ) {
// Do something here
}

9. Instead of writing,

String s ="a";
s += "b";
s += "c";

declare it as,

String s="a" + "b" + "c";

Since compiler is intelligent enough to replace above string with with
String s = "abc"

instead of creating new String object and appending them together in different steps.

10. Don't create objects unless required. For example do not declare

Date myDate =new Date();
if (requiredCondition) {
// usemyDate
}

Instead declare the Date object inside the if condition, so that it is not initialized if the required Condition is false. So declare the code as,

if (requiredCondition){
Date myDate =new Date();
// use myDate
}

11. Do not declare the objects twice. For example in the following code

public class x{
privateVector v = new Vector();
public x() {
v = new Vector();
}
}

The compiler generates the following code for the Constructor,

public x() {
v = new Vector();
v = new Vector();
}

By default anything initialized as public variables, the code for initialization is moved to constructor. So if you have initialized the public variable outside the default constructor , then do not initialize it inside the Constructor. So declare the code as,

public classx {
privateVector v;
public x() {
v = new Vector();
}
}

12. In the conditional statements, the conditions are evaluated in the order they are placed. For example if we declare as,

if (cond1 | | cond2){
// Do some operation
}

then if the "cond1" is true, then "cond2" is never evaluated. It is compared only if "cond1" is false.
Always try to take advantage of this one. For instructions which require more processing time, place them at the end. For Example declare it as,

boolean editmode = curStats1&& curStatus2;
if (editMode ||getDataStatusFromDataBase())
{
// do some operation
}

In the above statement getDataStatusFromDataBase() is not called if edit Mode is true.

13. For Large Scale enterprise applications, it is common to add new Features and remove old features. if possible, if the old feature codes are never going to be used, then remove the old code. Don't try to place old code in,

if (lfOldFeatureRequired){
// Old Code.
}

this makes the code larger and takes more maintenance time. Also It makes its lower to load the class at runtime.

14. Vector provides the following methods to insert elements.

addElementAt( e, index)
addElement (e)
add(e)
add(index, e)

Out of these try to avoid using methods, addElementAt( e, index) and add(index, e). The way these methods work is , all the elements are moved down between the insertion point and the end of the vector, making space for the new Element. The same works for deleting element at a Particular index. If possible, if these features are required, then try to use a different Data Structure if possible.

15. If the approximate size of the Vector is know initially then use it. Instead of declaring Vector as,
Vector v = newVector();
declare it as,

Vector v = new Vector(40);
or Vector v = new Vector(40,25);

This method indicates initial capacity of Vector is 40 and increment by 25 elements per expansion.

The way the Vector is expanded is, a new Vector of double the size of currentVector is created, all the Elements in the old Vector is copied to the new Vector and then the old Vector is discarded. (During GC). This has major effect on performance.

So if the initial size of the Vector is known, use it. Same goes for the size of Hashtable and related Data structures.

16. Avoid using the Enumeration class. The way Enumeration class is used is,

for(Enumeration enum = v.elements(); enum.hasMoreElements() ) {
s = (String)enum.nextElement();
// Do SomeOperation
}

In this code, enum.hasMoreElements() takes lot of processing time. Also for enum.nextElement() , enumeration class has to do lot of processing internally like incrementing the internal Counter etc.

Instead for retrieving elements from Vector, here is the preferred way,

int size = v.size();
for (int i = 0; i< size; i++) {
s = (String)v.elementAt(i);
// Do SomeOperation
}

17. If the code is executing in a SingleThread, then use ArrayList instead of Vector. All the methods in Vector are Synchronized. For example for the following methods,

public Object elementAt( int index) {
return elementData [ index ];
}

In the Vector class, the elementAt() method explicitly verifies that the internal array index does not fall out of bounds. Methods in ArrayList leaves the task to user and are not Synchronized. This makes the execution faster.

18. Couple of times it becomes necessary in code to retrieve element at a particular position in the Vector and then remove that Element. Code for this one would be,
e = v.elementAt(index);
v.removeElementAt (index);

In such a scenario, use the following code
e =v.remove(index);

This returns element at index position and then removes that element from Vector. This saves call to elementAt() method.

19. In Hashtable couple of times we try to get Element with particular key and if it does not exist, then we try to create new Element and insert that Element into Hashtable. Sometimes code is written as,

if( htable.containsKey(keyCode) ) {
return htable.get(keyCode);
}
else {
MyData md = createNewData(keyCode);
htable.put ( keyCode, md);
return md;
}

Here, we made performance overhead by calling these two methods on the same Hashtable,containsKey() and get(). Both these methods do the same thing, except containsKey() returns true/false and get()returns the actual Element. If the Hashtable is of considerable size then the overhead is high. Instead write the code as follows,

if ( (md = (MyData)htable.get (keyCode) != null ){
return md;
}
else {
// insert New Element into Hashtable here.
}

20. Cache Data as much as possible both on the Server and Client side as much as possible. For example do no try to get Data from Database every time which is hardly going to change for a long time. Either store it in local text or properties files or cache it for a pre defined interval once it is read from Database.

21. InetAddress.getHostAddress() has a lot of new operations. It creates a lot of intermediate strings to return the host address. Avoid it, if possible.

22. java.util.Date has some performance problems, particularly with internationalization. If you frequently print out the current time as something other than the (long ms-since-epoch) that it is usually represented as, you may be able to cache your representation of the current time and then create a separate thread to update that representation every N seconds (N depends on how accurately you need to represent the current time). You could also delay converting the time until a client needs it, and the current representation is known to be stale.

23. Avoid java.lang.String.hashCode() . If the String's length exceeds 16 characters, hashCode() samples only a portion of the String. So if the places that a set of Strings differ in don't get sampled you can see lots of similar hash values. This can turn your hash tables into linked lists!

24. Good Design is very important for any project. In most applications, good performance comes from getting the architecture right. Using the right data structures for the problem you're solving is a lot more important than tweaking String operations. Thread architecture is also important. (Try to avoid wait/notify operations–they can cause a lot of lock contention in some VMs.) And of course you should use caching for your most expensive operations.

25. This is one of the most common mistakes. A lot of people do something like the following:

debug("Debug Comment: " + var1 + var1 + "Current Value");
public static void debug(String s) {
System.err.println(s);
}

Then they think that they've turned off debugging overhead. Nope! If there are enough debugging statements, you can see a lot of time spent in creating new strings to evaluate "Debug Comment: " + var1 + var1 + "Current Value", which is then tossed after calling debug .

Disadvantages of Optimization:
If your code already works, optimizing is a sure way to introduce new, and there is possibility for bugs

Optimization tends to make code harder to understand and maintain

Some of the techniques presented here increase speed by reducing the extensibility of the code

Optimizing code for one platform may actually make it worse on another platform

A lot of time can be spent optimizing, with little gain in performance, and can result in obfuscated code

If you're overly obsessed with optimizing code, people will call you a nerd behind your back

Advertisements
No comments yet

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: