In this article we will look at different ways to identify and fix unnecessary memory consumption by the JVM. This is useful for JVM applications that store huge amounts of data in memory, and can help avoid OutOfMemoryError when application data cannot be partitioned and scaled horizontally.

In part 2, we will take a look at a couple of techniques used to reduce JVM heap size. The techniques are listed in order from least invasive to most, so that the optimization does just enough for the application to work and keeps it as maintainable as possible. The example application can be found in part 1 or in the example source code.

Memory optimisation - data structures

As mentioned before, we could fit around 204500 orders/54600 users into a 64M heap. Let’s take a look at how we can improve this.

The least intrusive change is usually swapping the data structures that hold objects. These usually conform to some of the Java collection interfaces, so changing them does not require much change in the code. Let’s see how much memory we can save by doing so.

The first obvious candidate is ArrayList, as its no-arg constructor creates a list with a capacity of 10 after the first insertion. However, we have around 200K orders across 50K users, which brings the average number of orders per list to 4. We can be aggressive and set that initial capacity to 1, as sketched below. That gives us 209000 orders/54900 users. Easy win, but only 2%.
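A minimal sketch of the change, assuming an index that maps a user id to that user’s orders (the variable and method names here are illustrative):

// Size each per-user list for the ~4-order average instead of
// letting the default capacity of 10 over-allocate the backing array.
ordersByUser
        .computeIfAbsent(order.getUser(), key -> new ArrayList<>(1))
        .add(order);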

The second optimization has to do with Java’s HashMap, which uses a simple chaining implementation. That is memory efficient only if the map key collision rate is low. In our case the collision rate is about 18% for both order and user ids (see this test), so there might be potential here. Maps that use open addressing remain memory efficient even with high collision rates. Any open addressing implementation should do, as the collision resolution strategy (probing, Robin Hood, cuckoo, etc.) should not significantly impact the memory footprint.

I’ll use Agrona’s Object2ObjectHashMap to test this. Agrona’s implementation also does not create extra Node objects, as it stores keys and values at even and odd indexes of the same array. Changing the map implementation (example 8) gives us 231500 orders/56000 users, a significant 11% increase.
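The swap itself is mechanical, since Object2ObjectHashMap (from org.agrona.collections) implements java.util.Map; a sketch with an illustrative variable name:

// Drop-in replacement: only the constructor call changes,
// call sites keep using the java.util.Map interface.
Map<String, List<Order>> ordersByUser = new Object2ObjectHashMap<>();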

Memory optimisation - using minimal types to store data

Let’s take a look at Order objects themselves.

The first step is to look at the structure of the stored values. We want them to be as flat as possible, minimizing extra object header overhead. In our case we would just get rid of a boxed type. This brings us to 245900 orders/56600 users, which is another 6%. Quite a significant increase for this little work.
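As a hedged illustration (the field name is assumed; the actual change is in the example code), unboxing a wrapper field removes one object header and one reference per instance:

// Before (assumed): the Integer costs an object header plus a reference
// per Order, on top of the 4-byte int value itself.
final class OrderBoxed {
    private final Integer price;
    OrderBoxed(Integer price) { this.price = price; }
}

// After: the primitive is stored inline in the instance layout.
final class OrderFlat {
    private final int price;
    OrderFlat(int price) { this.price = price; }
}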

Now that the low-hanging fruit has been plucked, we should take a look at our value type and evaluate whether we represent the data with appropriate types.

In our case, we have another field which is secretly boxed: Order#id. At the moment this field is just a number, but wrapped in a String. Most of the time there is no need for an id field to be a string, as we can represent enough unique identifiers with a simple number (or a tuple of numbers) and use Long keys in a map:

public final class Order {

    private final long id;
    private final String user;
    private final int articleNr;
    private final int count;
    private final int price;
    private final String address;

    // rest of class omitted for clarity
}
Map<Long, Order> ordersById = new Object2ObjectHashMap<>();

By doing so we can fit 276100 orders/57900 users, another 12% up. However, we are now forced to waste some memory on autoboxing the long keys, which we can fix by using a primitive collection. There are ready-made libraries for this, for example Trove collections. Here I’ll use Agrona, since I’ve already used it in the examples before:

Long2ObjectHashMap<Order> ordersById = new Long2ObjectHashMap<>();

That’s 288300 orders/58300 users, an extra 4%, and the final code can be seen in example 9.

Memory optimisation - shared object pool

At this point most of the memory is occupied by the address Strings. The problem here is that addresses contain lots of repeating information: cities, post codes, street numbers, names and surnames, etc. What we can try to do is break the address apart into these components and share the repeating instances between Order values. The JVM even has a built-in String pool for us to use!

I’ve decided to break up the address like this:

public final class Order {

    private final long id;
    private final String user;
    private final int articleNr;
    private final int count;
    private final int pricePence;
    private final String addressNumber;
    private final String addressStreet;
    private final String addressCity;
    private final String addressRegion;
    private final String addressPostCode;

    // rest of class omitted for clarity
}

Without pooling we can only fit 144200 orders/50000 users into memory, which again shows how significant object header overhead is. It might seem like a step back, but let’s turn String pooling on in the Order constructor (example 10):

@JsonCreator
public Order(
        // some fields omitted
        @JsonProperty("addressNumber") String addressNumber,
        @JsonProperty("addressStreet") String addressStreet,
        @JsonProperty("addressCity") String addressCity,
        @JsonProperty("addressRegion") String addressRegion,
        @JsonProperty("addressPostCode") String addressPostCode) {
    // some fields omitted
    this.addressNumber = addressNumber.intern();
    this.addressStreet = addressStreet.intern();
    this.addressCity = addressCity.intern();
    this.addressRegion = addressRegion.intern();
    this.addressPostCode = addressPostCode.intern();
}

which lets us fit 386000 orders/61000 users into memory (+34%)! Since version 7 the JVM does not allocate interned String objects in native memory, so this gain is a fair comparison (this can be double-checked by running the JVM with the flags -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics and inspecting memory usage).

As far as I know, the String pool is the only JVM pool usable by an application, because the Integer pool caches only values from -128 to 127. However, it shouldn’t be hard to pool any user-defined object, as sketched below. The usefulness of this method depends on how much of the data is repeating, how it can be split, and the object overhead of the pool implementation itself (e.g. be careful with tree-like collections which allocate an extra object for every node).
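A minimal sketch of such a pool, assuming the pooled type is immutable and implements equals/hashCode correctly (the class name is illustrative):

import java.util.HashMap;
import java.util.Map;

public final class InternPool<T> {

    // Maps each canonical instance to itself; equal values share one instance.
    private final Map<T, T> pool = new HashMap<>();

    public T intern(T value) {
        T existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }
}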

Memory optimisation - data encoding

The compression approach can save some memory at the expense of processing power and code maintainability, as switching from JDK types to custom ones limits the usage of library methods.

Since we only use capital letters and digits for ids, we can encode the strings using base 36. Base 36 values take only 6 bits each, instead of the usual 8 bits (16 bits before Java 9) per character, so it is expected to save about 25% of memory (60% before Java 9):

public Order(
            @JsonProperty("id") long id,
            @JsonProperty("user") byte[] user,
            // rest omitted

and we only need to store a reference to the string as a map key:

ordersByUser
        .computeIfAbsent(
                new Base36String(order.getUser()), 
                key -> new ArrayList<>(1)
        )
        .add(order);

The more compact encoding (example 11) and not storing an extra reference on Order objects allow us to stuff the JVM with 481300 orders/62800 users (+24%).
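For illustration, the 6-bit packing behind a Base36String-style wrapper could look like this (a hypothetical helper, not the article’s actual implementation):

// Characters 0-9 and A-Z map to values 0..35, so each fits in 6 bits
// and four characters pack into three bytes.
static byte[] packBase36(String s) {
    byte[] out = new byte[(s.length() * 6 + 7) / 8];
    for (int i = 0; i < s.length(); i++) {
        int value = Character.digit(s.charAt(i), 36); // 0..35
        int bitPos = i * 6;
        int bytePos = bitPos / 8;
        int shift = bitPos % 8;
        out[bytePos] |= (byte) (value << shift);
        if (shift > 2) { // the 6 bits spill into the next byte
            out[bytePos + 1] |= (byte) (value >>> (8 - shift));
        }
    }
    return out;
}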

Memory optimisation - ditching references

Now we can try to get rid of the extra String object references and store only byte[]. If we ever need a String object later in the code, we can allocate a short-lived object for that operation or use a view wrapper. This is somewhat risky, because arrays in Java are not immutable, but let’s see how much memory we can save that way.
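Such a view wrapper could look roughly like this (a hypothetical sketch assuming ASCII-only content, not part of the article’s example code):

import java.nio.charset.StandardCharsets;

// Read-only CharSequence over the pooled bytes; a String is only
// allocated when one is actually needed.
final class AsciiView implements CharSequence {

    private final byte[] bytes;

    AsciiView(byte[] bytes) { this.bytes = bytes; }

    @Override public int length() { return bytes.length; }

    @Override public char charAt(int index) { return (char) (bytes[index] & 0x7F); }

    @Override public CharSequence subSequence(int start, int end) {
        return toString().subSequence(start, end);
    }

    @Override public String toString() {
        return new String(bytes, StandardCharsets.US_ASCII);
    }
}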

A simple type replacement, however, would not work here, as our strings are pooled objects. Therefore we need to roll our own pool for byte[]. This is a bit tricky in Java, because arrays inherit identity-based hashCode and equals, but we can leverage array ordering and a TreeSet:

import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

public class TreeBasedByteArrayPool implements ByteArrayPool {

    // A content-based comparator sidesteps byte[]'s identity equals/hashCode.
    private final NavigableSet<byte[]> values = new TreeSet<>(Arrays::compare);

    @Override
    public byte[] intern(byte[] in) {
        boolean newValue = values.add(in);
        if (newValue) {
            return in;
        }
        else {
            // floor() returns the stored array that compares equal to `in`.
            return values.floor(in);
        }
    }
}

and inject the pool during deserialization like so:

@JsonCreator
public Order(
        @JsonProperty("id") long id,
        // more fields here ...
        @JacksonInject ByteArrayPool byteArrayPool
)
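For completeness, the pool can be registered with Jackson roughly like this (a hedged sketch; variable names are illustrative):

import com.fasterxml.jackson.databind.InjectableValues;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper mapper = new ObjectMapper();
// Satisfies the @JacksonInject parameter for every deserialized Order.
mapper.setInjectableValues(new InjectableValues.Std()
        .addValue(ByteArrayPool.class, new TreeBasedByteArrayPool()));
Order order = mapper.readValue(json, Order.class);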

This approach (example 12) doesn’t help us much: by getting rid of some references in the Order object, we’ve created many more references inside the TreeSet, so for this use case we can actually fit fewer objects in memory. Even a linear-time array-based implementation does not reduce the memory footprint much, which means that for our data structure this optimization is not useful.

Memory optimisation - object compression

Now what if we could compress the whole Order object? In our case we must only compress the fields which are not shared between Order instances (the shared address fields are marked transient, so they stay out of the serialized blob), hence we end up with something like this (example 13):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.zip.GZIPOutputStream;

public class OrderCompressed {

    private final byte[] compressedPart;
    // shared fields omitted
    // constructor omitted

    public static OrderCompressed compress(Order order) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (
                GZIPOutputStream gzipOut = new GZIPOutputStream(out);
                ObjectOutputStream objectOut = new ObjectOutputStream(gzipOut)
        ) {
            // Serializes only the non-transient (non-shared) fields.
            objectOut.writeObject(order);
        }
        catch (IOException e) {
            throw new RuntimeException("Unable to serialize " + order, e);
        }
        byte[] bytes = out.toByteArray();
        return new OrderCompressed(
                bytes,
                order.getAddressNumber(),
                order.getAddressStreet(),
                order.getAddressCity(),
                order.getAddressRegion(),
                order.getAddressPostCode()
        );
    }
}
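The matching read path (an assumed counterpart, not shown in the article) decompresses the blob back on demand, which is where the extra processing cost comes in:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.zip.GZIPInputStream;

// Would live inside OrderCompressed, next to compress() above.
public Order decompress() {
    try (ObjectInputStream objectIn = new ObjectInputStream(
            new GZIPInputStream(new ByteArrayInputStream(compressedPart)))) {
        return (Order) objectIn.readObject();
    }
    catch (IOException | ClassNotFoundException e) {
        throw new RuntimeException("Unable to deserialize order", e);
    }
}

Note that the transient address fields would come back empty after deserialization, which is why OrderCompressed keeps its own shared copies of them.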

In our case this also does not give any memory reduction, as we are only compressing a couple of primitive fields. This technique is much more useful where little data can be shared between the retained objects.

Conclusion

Using appropriate data structures, variable types, compression and pooling we have increased the amount of data that fits into the same heap by 135% (481300 orders instead of 204500). Further gains can be achieved by bulk-compressing larger chunks of data together or by moving off-heap. These, however, impact code and architecture quite a bit and must be considered alongside data sharding approaches.