I Will Get That Job At Google.

Getting prepared for the Google interview

Java memory leaks

The Java garbage collector is not a silver bullet: memory consumption can still grow without bound when objects that are no longer needed stay reachable. The IBM article listed in the references shows how this happens in request dispatchers, where references to dispatched and pending requests are held in collections (or other containers).

Typical memory leak situation:
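
A minimal sketch of such a dispatcher, to make the pattern concrete. The class and method names (RequestDispatcher, dispatch, complete) are my own illustration and are not taken from the IBM article. The leak appears when dispatch() stores a payload and nothing ever removes it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical request dispatcher; all names here are illustrative.
class RequestDispatcher {
    // Every dispatched request is remembered here by id.
    private final Map<Integer, byte[]> pending = new HashMap<>();
    private int nextId = 0;

    // Stores the payload. If complete() is never called for this id,
    // the payload stays reachable for as long as the dispatcher lives.
    int dispatch(byte[] payload) {
        int id = nextId++;
        pending.put(id, payload);
        return id;
    }

    // Drops the reference as soon as the request is finished.
    void complete(int id) {
        pending.remove(id);
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        RequestDispatcher dispatcher = new RequestDispatcher();
        for (int i = 0; i < 1000; i++) {
            int id = dispatcher.dispatch(new byte[1024]);
            // Without this call the map would keep all 1000 payloads reachable.
            dispatcher.complete(id);
        }
        System.out.println("pending: " + dispatcher.pendingCount());
    }
}
```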

Here are some other cases which occur frequently.

ObjectInputStream and ObjectOutputStream

ObjectInputStream and ObjectOutputStream keep references to all objects they have seen in order to send subsequent occurrences of the same object as references rather than copies (and thereby deal with circular references). This causes a memory leak when you keep such a stream open indefinitely (e.g. when using it to communicate over the network).

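The fix for this problem is to call reset() on the ObjectOutputStream periodically. The reset marker travels through the stream, so the reading ObjectInputStream discards its references as well.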
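
Here is a small sketch of what reset() changes. It uses in-memory byte streams instead of a network connection, and the class name StreamResetDemo is made up for the example. The same array is written twice and modified in between: without reset() the second write is only a back-reference to the first object, with reset() the stream forgets the object and sends a fresh copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class StreamResetDemo {
    // Writes the same array twice, changing it in between, and returns
    // the value the reader sees for the second write.
    static int secondValueSeen(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                int[] message = {1};
                out.writeObject(message);
                message[0] = 2;
                if (resetBetweenWrites) {
                    out.reset(); // forget every object written so far
                }
                out.writeObject(message);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject(); // first copy
                int[] second = (int[]) in.readObject();
                return second[0];
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(secondValueSeen(false)); // prints 1: only a back-reference was sent
        System.out.println(secondValueSeen(true));  // prints 2: reset() forced a fresh copy
    }
}
```

How often to call reset() is a trade-off: after every message you lose the sharing of repeated objects, so calling it every N messages may be a reasonable compromise.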

Substrings

This is one of the most common ways memory leaks occur in Java. As I wrote before, substrings in Java are not constructed as independent objects: they reuse the char array of the original string with a different offset and length. (This describes JDK versions before 7u6; since then substring() copies the characters it needs.)

An example from Stack Overflow: you read in a 3000-character record and take a substring of 12 characters, returning that to the caller (within the same JVM). Even though you don’t directly hold a reference to the original string, that 12-character string still keeps 3000 characters in memory. For systems that receive and then parse lots of messages, this can be a real problem.

The solution is to copy the substring with the String constructor, new String(s.substring(...)), or to call the intern() method on the result of the substring call.
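
A minimal sketch of the constructor approach. The helper name extractKey and the record layout are assumptions made for the example. On JVMs where substring() already copies, the extra constructor call is harmless but redundant.

```java
class SubstringCopy {
    // Returns a key that owns its own character data instead of
    // sharing the backing array of the full record.
    static String extractKey(String record, int start, int end) {
        return new String(record.substring(start, end));
    }

    public static void main(String[] args) {
        // Build a 3000-character record whose first 12 characters are the key.
        StringBuilder record = new StringBuilder("ORDER-000042");
        while (record.length() < 3000) {
            record.append('x');
        }
        String key = extractKey(record.toString(), 0, 12);
        System.out.println(key); // prints ORDER-000042
    }
}
```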

String split

The same applies to the String.split() method: the strings it returns keep the initial char sequence in memory.

Thread stacks

Each thread allocates memory for its stack (the default is platform-dependent, commonly 512k or 1 MB, and tunable via -Xss). A naive attempt to heavily multi-thread an application will result in sizable, non-obvious memory consumption.

Another interesting case is a thread whose start() method is never called. In this case the memory held by the thread may never be reclaimed.
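
One common way to keep the per-thread stack cost bounded is to run tasks on a fixed-size pool instead of creating a thread per task. The sketch below is one possible shape, not the only one. The class name BoundedThreads and the pool size are assumptions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedThreads {
    // Runs taskCount small tasks on at most poolSize threads and returns
    // {number of completed tasks, number of distinct threads used}.
    static int[] runTasks(int taskCount, int poolSize) {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                threadNames.add(Thread.currentThread().getName());
                completed.incrementAndGet();
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {completed.get(), threadNames.size()};
    }

    public static void main(String[] args) {
        int[] result = runTasks(100, 4);
        System.out.println(result[0] + " tasks ran on " + result[1] + " threads");
    }
}
```

With this shape 100 tasks need at most four stacks, where a thread-per-task design would need 100.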

Non-static inner classes

Every instance of a non-static inner class holds a reference to its enclosing instance. So that innocent-looking inner class can hold onto a huge object graph. Put the instance in a static or application-wide collection somewhere and you’re using a lot of memory you’re not supposed to be using.
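
A short sketch of the difference, with made-up names (Report, InnerSummary, Summary). The inner class reads a field of the enclosing instance, so each of its instances keeps the whole Report reachable. The static nested class holds only the data copied into it.

```java
class Report {
    // Stands in for a large object graph owned by the outer instance.
    private final byte[] rawData = new byte[1_000_000];
    private final String title;

    Report(String title) {
        this.title = title;
    }

    // Inner class: it reads the enclosing instance's field, so every
    // InnerSummary keeps the whole Report (including rawData) reachable.
    class InnerSummary {
        String text() {
            return title;
        }
    }

    // Static nested class: holds only the data copied into it.
    static class Summary {
        private final String text;

        Summary(String text) {
            this.text = text;
        }

        String text() {
            return text;
        }
    }

    InnerSummary leakySummarize() {
        return new InnerSummary();
    }

    Summary summarize() {
        return new Summary(title);
    }

    public static void main(String[] args) {
        // Safe to store long-term: the Report itself can be collected.
        Summary summary = new Report("Q1 results").summarize();
        System.out.println(summary.text()); // prints Q1 results
    }
}
```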

Observer-based applications

The same situation occurs in applications that use the Observer pattern. It’s easy to see why: the subject being observed holds references to all subscribed listeners, even when they are no longer needed. To avoid the leak, remove listeners from the subject once they are no longer needed.
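
A minimal sketch of explicit unsubscription. EventSource and its methods are hypothetical names, not a real library API. The point is only that every addListener() should have a matching removeListener().

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical subject; names are illustrative only.
class EventSource {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    // Without this call the listener (and everything it references)
    // stays reachable for as long as the EventSource lives.
    void removeListener(Consumer<String> listener) {
        listeners.remove(listener);
    }

    void publish(String event) {
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
        }
    }

    int listenerCount() {
        return listeners.size();
    }

    public static void main(String[] args) {
        EventSource source = new EventSource();
        Consumer<String> printer = event -> System.out.println("got " + event);
        source.addListener(printer);
        source.publish("started"); // prints got started
        source.removeListener(printer);
        source.publish("ignored"); // nothing is printed
    }
}
```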

Singleton

Once the Singleton class is instantiated, it remains in memory for the application’s lifetime. Every object the singleton references, directly or indirectly, also stays reachable and, as a result, will never be garbage collected.

ThreadLocal variables

A thread-local variable is referenced by its thread, so its lifecycle is bound to that thread. In most application servers threads are reused via thread pools and thus are never garbage collected. If the application code does not carefully clear the thread-local variable, you get a nasty memory leak.
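
The usual way to clear the variable is a try/finally around the unit of work. This is a minimal sketch. RequestContext and handle() are names invented for the example.

```java
class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Sets the thread-local for the duration of one request and always
    // removes it afterwards, so a pooled thread does not keep the value.
    static String handle(String user) {
        CURRENT_USER.set(user);
        try {
            return "hello " + CURRENT_USER.get();
        } finally {
            CURRENT_USER.remove();
        }
    }

    // True when the calling thread holds no value for the thread-local.
    static boolean isCleared() {
        return CURRENT_USER.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(handle("ann")); // prints hello ann
        System.out.println(isCleared());   // prints true
    }
}
```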

Mutable static objects

The most common reason for a memory leak is the wrong usage of statics. A static variable is held by its class and subsequently by its classloader. While a class can be garbage collected, this will seldom happen during an application’s lifetime. Very often statics are used to hold cache information or share state across threads. If this is not done diligently, it is very easy to get a memory leak. Static mutable collections in particular should be avoided at all costs for just that reason. A good architectural rule is not to use mutable static objects at all; most of the time there is a better alternative.
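
One possible alternative for the cache case is a size-bounded cache owned by the component that needs it, rather than an ever-growing static map. The sketch below uses LinkedHashMap from the standard library. The class name BoundedCache and the limit are my own assumptions, and the class is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A small least-recently-used cache; it never holds more than maxEntries.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" is now the most recently used entry
        cache.put("c", 3); // evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```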

Classloaders

Classes are referenced by their classloader and normally will not be garbage collected until the classloader itself is collected. That, however, only happens when the application is unloaded by the application server or OSGi container.

Some tips to avoid memory leak problems in the future:

  1. Use profilers. A profiler shows which objects accumulate on the heap and which of them were never collected.
  2. You can add logging to the finalize() method to see when an object is collected by the GC. Note that finalize() is deprecated since Java 9, so treat this as a debugging aid only.
  3. Carefully check whether long-lived static objects are really needed.
  4. Remove elements from collections once they are no longer used.

References:

  1. http://stackoverflow.com/questions/1281549/memory-leak-traps-in-the-java-standard-api
  2. http://www.javaworld.com/javaworld/jw-03-2006/jw-0313-leak.html?page=1
  3. http://blog.dynatrace.com/2011/04/20/the-top-java-memory-problems-part-1/
  4. http://www.ibm.com/developerworks/java/library/j-jtp11225/index.html
  5. http://www.ibm.com/developerworks/java/library/j-jtp01246/index.html

Update: This article was written specially for the Habrahabr sandbox and was previously published there. See the Russian version in the Java blog: http://habrahabr.ru/blogs/java/132500/

Update2: Thanks to vladoos from habrahabr.ru for the links to IBM’s “Java theory and practice: Plugging memory leaks with weak references”.
