Tuesday, May 22, 2007

Minimizing Java bytecode size

I explored this topic while I was considering entering the Java 4K Programming Contest. I ended up not participating because I couldn't dream up any cool ideas for a game, but I did come up with a list of things that help reduce code size. Note that the 4K limit applies to the compiled class file, not the source file, so every action below is aimed at shrinking the compiled class file.

Some of these are obvious. Others might be less obvious to the masses.

Action: Don't bother with generics.
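Justification: Generic type information is erased at compile time. The JVM does not need it to execute the code, so whatever generic metadata ends up in the class file only takes up space.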
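
As a small illustration of why the JVM can do without generic type information, the sketch below checks that two differently parameterized lists share one runtime class. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    /** Returns true if List<String> and List<Integer> have the same runtime class. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // The type arguments exist only at compile time; at run time both are plain ArrayList.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```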

Action: Compile without debug information or strip it afterwards.
Justification: This includes source code line number information, the source file name and local variable names. The attributes to be removed from the class file are SourceFile, LocalVariableTable and LineNumberTable. They are not required, as they exist only for debugging purposes.
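
With javac, debug information can be switched off using the -g:none option. The sketch below is one way to measure the effect, assuming a full JDK is available so that the javax.tools compiler API works. The sample source and the names DebugInfoSize and classSize are invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class DebugInfoSize {
    // Hypothetical sample source, used only for the size comparison.
    static final String SOURCE =
        "class A { static int twice(int count) { int total = count * 2; return total; } }";

    /** Compiles SOURCE with the given javac debug flag and returns the size of A.class in bytes. */
    static long classSize(String debugFlag) {
        try {
            Path dir = Files.createTempDirectory("debuginfo");
            Path src = dir.resolve("A.java");
            Files.write(src, SOURCE.getBytes(StandardCharsets.UTF_8));
            // Returns null when only a JRE (no compiler) is installed.
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("JDK compiler not available");
            }
            int rc = javac.run(null, null, null, debugFlag, "-d", dir.toString(), src.toString());
            if (rc != 0) {
                throw new IllegalStateException("compilation failed");
            }
            return Files.size(dir.resolve("A.class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("-g:      " + classSize("-g") + " bytes");
        System.out.println("-g:none: " + classSize("-g:none") + " bytes");
    }
}
```

The absolute numbers depend on the compiler version, so only the difference between the two results is meaningful.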

Action: Rename all classes, methods and fields so that their names are one character long.
Justification: Each name is stored as a string in the constant pool, so one character takes up less space than two or three or four.

Action: Do not define a package.
Justification: The package definition provides no functionality (in this context) and takes up space.

Action: Minimize the number of methods.
Justification: Method headers take up a lot of space.
2 (access flags) +
2 (name index) +
2 (descriptor index) +
20 (attributes count plus an empty Code attribute)

That's 26 bytes, and that is assuming you manage to reuse the name string in the constant pool, reuse the method descriptor (which defines the return type and parameters) and don't declare any thrown exceptions.

Action: Minimize the number of fields.
Justification: Field headers take up a lot of space.
2 (access flags) +
2 (name index) +
2 (descriptor index) +
2 (attributes count, with 0 attributes)

That's 8 bytes, and that is assuming you manage to reuse the name string in the constant pool and manage to reuse the descriptor.
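
The method and field tallies can be double-checked with a little arithmetic. The sketch below adds up the fixed-size parts of the method_info, field_info and Code structures from the class file format. The class and method names are made up for this example.

```java
final class MemberOverhead {
    // Shared by field_info and method_info (u2 each):
    // access_flags, name_index, descriptor_index, attributes_count.
    static final int MEMBER_FIXED = 2 + 2 + 2 + 2;

    // Fixed part of a Code attribute:
    // attribute_name_index (u2), attribute_length (u4), max_stack (u2), max_locals (u2),
    // code_length (u4), exception_table_length (u2), attributes_count (u2).
    static final int CODE_FIXED = 2 + 4 + 2 + 2 + 4 + 2 + 2;

    /** Bytes for a field with no attributes. */
    static int fieldBytes() {
        return MEMBER_FIXED;
    }

    /** Bytes for a method whose Code attribute holds codeLength bytes of instructions. */
    static int methodBytes(int codeLength) {
        return MEMBER_FIXED + CODE_FIXED + codeLength;
    }

    public static void main(String[] args) {
        System.out.println("field:  " + fieldBytes());
        System.out.println("method: " + methodBytes(0));
    }
}
```

A real method needs at least one instruction (a lone return, for example), so in practice the smallest method costs one byte more than the 26-byte header.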

Action: Strip any method throws information.
Justification: The VM doesn't use this information. You can't tell the compiler not to create it, but it can be stripped afterwards.
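
One way to convince yourself that the VM does not enforce throws clauses is the well-known "sneaky throw" idiom, sketched below. It lets a checked exception escape from a method that declares none, and the VM runs it without complaint. All names here are invented for this example.

```java
import java.io.IOException;

final class UncheckedThrowDemo {
    @SuppressWarnings("unchecked")
    private static <E extends Throwable> void throwAs(Throwable t) throws E {
        throw (E) t; // unchecked cast; erased at compile time, so no runtime check happens
    }

    /** Declares no exceptions, yet lets a checked IOException escape. */
    static void noThrowsClause() {
        UncheckedThrowDemo.<RuntimeException>throwAs(new IOException("checked, but undeclared"));
    }

    /** Returns the simple class name of whatever noThrowsClause() throws. */
    static String caughtType() {
        try {
            noThrowsClause();
            return "nothing thrown";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(caughtType());
    }
}
```

Only the compiler checks declared exceptions, which is why the information can be stripped from the class file without affecting execution.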

Action: Rearrange local variables, putting the four that are the most used first. Heavily reuse these variables.
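Justification: Load and store instructions that refer to the local variables 0-3 have one-byte forms (iload_0, for example), while the same instructions for the rest of the local variables take up two bytes (the opcode plus an index). Keep in mind that method parameters, and "this" in instance methods, occupy the lowest-numbered variables.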
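
To get a feel for how much the ordering matters, the sketch below models the size of load and store instructions per local variable slot and totals it for a given usage pattern. The names LocalSlotCost, accessBytes and totalBytes are made up for this example, and the usage counts are invented numbers.

```java
final class LocalSlotCost {
    /** Bytes used by one load or store instruction that addresses the given slot. */
    static int accessBytes(int slot) {
        if (slot <= 3) {
            return 1; // e.g. iload_0 .. iload_3: opcode only
        }
        if (slot <= 255) {
            return 2; // e.g. iload <index>: opcode + one-byte index
        }
        return 4;     // wide iload <index>: wide prefix + opcode + two-byte index
    }

    /** Total load/store bytes when uses[i] accesses go to the variable in slot i. */
    static int totalBytes(int[] uses) {
        int total = 0;
        for (int slot = 0; slot < uses.length; slot++) {
            total += uses[slot] * accessBytes(slot);
        }
        return total;
    }

    public static void main(String[] args) {
        // Same six variables, busiest ones last versus busiest ones first.
        System.out.println(totalBytes(new int[] {1, 1, 1, 1, 20, 10}));
        System.out.println(totalBytes(new int[] {20, 10, 1, 1, 1, 1}));
    }
}
```

With these counts, moving the two busiest variables to the front brings the total from 64 bytes down to 36. The model only covers loads and stores. An instruction such as iinc always carries an explicit index and gains nothing from low slot numbers.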

Action: Set the scope of the local variables so that only 4 variables are in scope at any given time.
Justification: Two variables, even if they're of different types, can be stored in the same local variable "slot" if their scopes don't overlap. The previous item explains why staying within the first four local variables is worthwhile.
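
In source code, limiting scope can be as simple as wrapping short-lived variables in their own blocks, as in the sketch below. The slot numbers in the comments describe what javac typically produces and are an assumption, not something the language guarantees. The class and method names are made up.

```java
final class SlotReuse {
    static int demo(int n) {         // slot 0: n (static method, so no "this")
        int result = 0;              // slot 1: result
        {
            int doubled = n * 2;     // slot 2 (typically)
            result += doubled;
        }
        {
            // The scope of "doubled" has ended, so the compiler is free to
            // put this float into slot 2 as well.
            float half = n / 2f;
            result += (int) half;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo(10));
    }
}
```

Running javap -c on the compiled class shows which slots were actually assigned.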

Action: Reuse string constants. As well as any string literals in the code, this includes class, field and method names. If you have one class which has one field and the main method (to have a Java application entry point), name the class and the field "main" as well.
Justification: The compiler will only need to put one string entry into the class file constant pool.
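
The saving can be estimated with a rough sketch. Each distinct string in the constant pool is stored as a CONSTANT_Utf8 entry: a 1-byte tag, a 2-byte length and the encoded characters. The code below counts only those entries for a set of names, assuming plain ASCII names. Utf8PoolCost and poolBytes are hypothetical names for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class Utf8PoolCost {
    /** Bytes of CONSTANT_Utf8 entries needed for the given names; duplicates are stored once. */
    static int poolBytes(String... names) {
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(names));
        int total = 0;
        for (String name : unique) {
            // 1-byte tag + 2-byte length + the characters (one byte each for ASCII).
            total += 3 + name.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Class, field and method each with their own name...
        System.out.println(poolBytes("Game", "score", "main"));
        // ...versus class, field and method all called "main".
        System.out.println(poolBytes("main", "main", "main"));
    }
}
```

For these names the totals are 22 bytes versus 7 bytes. Descriptors and the entries that point at these strings are not counted here.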