Recently, a friend of mine asked me how he could improve performance in an open-source Java application that he was working on. I was reminded of the past (circa 2003), when Java was being bashed for poor performance and a number of books were written on improving the performance of almost every part of Java.
While it is true that an application running within a container will naturally be less performant than a native application, the question one should ask is whether the performance they are getting is good enough for them, as opposed to whether it is the best possible. When it comes to performance, in most cases, 'works for the situation' is more useful than being the fastest horse on the racetrack.
Coming back to the issue in question, I think there are some basic steps one can take to drastically improve the performance of a Java application with relatively minimal effort. In this case, the application was a data extraction tool that was writing thousands of small (1 - 5KB) XML files. Based on this, the following recommendations can be provided.
JVM Garbage Collection parameters
If the size of the XML is small, it is reasonable to assume that the application may be using DOM instead of SAX. As DOM is an in-memory XML model, it would create a lot of objects that live for a short period of time.
A JVM has two heap-sizing parameters, -Xms and -Xmx, which set the initial (start-up) heap size and the maximum heap size it can use (beyond which an OutOfMemoryError will be thrown). By default, the Xms value is around 40MB to 64MB on Windows systems and the Xmx value is around 128MB - 256MB, though the exact defaults vary with the JVM version and the machine. The JVM starts with the Xms amount and, if it hits that limit, keeps increasing the heap in steps until it reaches the Xmx value. The problem is that this incremental growth comes at a performance cost, especially if you start really low compared to the needs of the application. If you have a reasonable idea of your application's minimum memory requirements, or if your system has enough memory to spare, it's a good idea to boost the starting value to a much bigger number.
In this case, the numbers were boosted to 512MB for Xms and 1024MB for Xmx. Additionally, it is a good idea to increase the -XX:MaxPermSize value from the default 32MB to a more respectable 128MB or even 256MB (which was the value used in this case). This flag applies only to JVMs that still have a permanent generation, i.e., those before Java 8.
Once these changes were made, the application's running time went down from 1 hour 15 minutes to 28 minutes - and all without changing a single line of code!
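One way to confirm that heap flags like these actually took effect is to ask the running JVM what it sees. The following is a minimal sketch, not part of the application discussed here; the class name HeapReport and its methods are made up for illustration, and the numbers printed will differ from machine to machine.

```java
// Hypothetical helper, not part of the original application.
// Run it with the same heap flags as the real program, for example:
//   java -Xms512m -Xmx1024m HeapReport
public class HeapReport {

    // Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    public static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Builds a one-line summary of the heap as the running JVM reports it:
    // committed = heap currently reserved, max = upper limit (driven by -Xmx),
    // free = unused part of the committed heap.
    public static String describe(Runtime rt) {
        return "committed=" + toMegabytes(rt.totalMemory()) + "MB"
             + " max=" + toMegabytes(rt.maxMemory()) + "MB"
             + " free=" + toMegabytes(rt.freeMemory()) + "MB";
    }

    public static void main(String[] args) {
        System.out.println(describe(Runtime.getRuntime()));
    }
}
```

If the committed figure starts far below what the application ends up needing, that is a hint that the starting heap is set too low.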
Choose your OS wisely
The second improvement came not from within Java, but from outside it. This particular application was creating around 300,000 files under a single folder. Windows typically does not handle huge numbers of files in a single folder well. The optimal value seems to be somewhere around 2,000 files per folder. UNIX-based systems, on the other hand, tend not to have this problem.
The application, thanks to Java's portability, was moved to a UNIX-based system. The running time went down from 28 minutes to 13 minutes - again without changing a single line of code!
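If switching platforms is not an option, the output could instead be spread across subfolders so that no single folder holds more than about 2,000 files. This was not done in the case described here, and it does require a code change. The sketch below only illustrates the idea; the class name, the record-N.xml naming scheme, and the assumption that each file has a sequential index are all hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical sketch: maps a sequential file index to a path inside a
// numbered subfolder, so each subfolder holds at most FILES_PER_FOLDER files.
public class ShardedPaths {

    // Roughly the per-folder count mentioned above; an assumption, tune as needed.
    static final int FILES_PER_FOLDER = 2000;

    // Returns root/<bucket>/record-<index>.xml, where bucket is the
    // zero-padded result of index / FILES_PER_FOLDER.
    public static Path pathFor(Path root, int index) {
        int bucket = index / FILES_PER_FOLDER;
        String folder = String.format(Locale.ROOT, "%04d", bucket);
        return root.resolve(folder).resolve("record-" + index + ".xml");
    }

    public static void main(String[] args) {
        Path root = Paths.get("output");
        // First and last of 300,000 files: buckets 0000 and 0149.
        System.out.println(pathFor(root, 0));
        System.out.println(pathFor(root, 299999));
    }
}
```

With 300,000 files this produces 150 subfolders. The caller would still need to create each subfolder (for example with Files.createDirectories) before writing into it.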
There are a number of other tweaks that can be made, most of them without changing the application itself - hopefully material for another blog post down the road...
So, the bottom line is: don't blame the language or library without spending some time fine-tuning performance. More importantly, sometimes it takes just a few minutes of effort to make a big difference.