Beginner's Guide to the Java Garbage Collector

Context

Developers often treat the Garbage Collector (GC) as a synonym for automated memory management and underestimate the challenges that come with it. Every automated solution involves trade-offs. In C/C++, you manage memory yourself, so the margin for error is narrow: a mistake can surface as a memory issue at runtime. On the upside, manual management gives you the flexibility to build a custom memory strategy tailored to your application. So what is the challenge with an automated solution? The Garbage Collector automates memory management, but it does not automatically adjust to your application. It is a generic solution that addresses most memory management concerns.

We should be crystal clear about the application's behavior and requirements, and then pick a Garbage Collector that can serve them over the long term. We may also need to tune it by specifying appropriate JVM memory arguments. Tuning is not only about JVM arguments, though; it is even more about how the code is written. For example, a method refactored down to a few lines keeps its local objects short-lived, which makes them cheap to collect.

Significance

I mentioned earlier that GC is a generic solution, meaning it is good enough for most cases. The JVM will also pick a GC implementation automatically, based on the host, if you don't specify one. So, where is the catch? Let's take a close look at the basic operations a GC performs: mark, sweep, and compact. In the mark phase, the GC identifies live objects; anything left unmarked is dead, that is, no longer used by the application. In the sweep phase, the memory occupied by these dead objects is released. Finally, the GC compacts the surviving objects to eliminate fragmentation. In their basic form, garbage collection events are "stop-the-world" events: application threads are paused until the collection completes. Frequent GC events therefore mean frequent pauses, while rare GC events can mean long pauses, because many objects have accumulated by the time a collection runs.

Modern Garbage Collectors have very short pause times and can do much of their work concurrently with the application. But that doesn't make any one of them the default option for your application. For applications that are sensitive to responsiveness, we may use a collector like Shenandoah. If you have sufficient heap memory available, you may go with ZGC. Recent JDK versions ship G1GC as the default garbage collector, which also has low pause times. Note, however, that these collectors produce different results for different applications. It is important to understand the impact of a Garbage Collector choice, and of its tuning, on your specific application. Ideally, pause times should be very low and GC events should not be too frequent. We will discuss these choices in more depth later on.
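To see which collector the JVM actually picked, and how often it has run so far, the standard java.lang.management API can be queried at runtime. The following is a minimal sketch; the class name GcReport and its method are illustrative assumptions, and the collector names printed depend on the JVM and the flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name GcReport is illustrative only.
class GcReport {

    // One line per collector bean: name, number of collections, accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalTimeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

A collection count that climbs quickly points to frequent GC events, while a large accumulated time relative to the count points to long individual collections.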

Internals

The Garbage Collector is hosted by the JVM as part of its execution engine. Let's assume that the JVM has automatically chosen a GC implementation to use for garbage collection.

(source: oracle.com)

All Java objects are stored in heap memory; local primitive variables are not. Heap memory is divided into different address spaces for performance reasons, which we will discuss in a moment. The JVM also manages another memory area called the stack. Each thread's stack holds local (method-level) primitive variables and references to objects, while the objects themselves live on the heap. A method's stack frame is discarded as soon as the method returns. Garbage collection is performed only for objects in heap memory.

Take a close look at how heap memory is structured for garbage collection:

(source: oracle.com)

New objects are stored in the young generation heap space, which is further divided into Eden, S0, and S1. Brand-new objects are allocated in Eden. When Eden fills up, the Garbage Collector performs mark, sweep, and compact operations on the objects in Eden. Surviving objects (objects that are still alive, that is, still referenced by the application) are moved to the first survivor space (S0). In the next iteration, garbage collection is performed on both Eden and S0, and the survivors from both go to S1, the second survivor space. Eden and S0 are now empty and available again. A garbage collection event in the young generation is called a minor garbage collection. On the next minor garbage collection, the survivor spaces switch roles and surviving objects are moved to S0, leaving Eden and S1 empty and available again.

(source: oracle.com)

Eventually, long-lived objects are moved to the old generation space (also called the Tenured generation). Once the old generation space fills up, a major garbage collection event is triggered. A major garbage collection event has a longer pause time than a minor one because it involves all live objects. This is a discussion point for applications that are sensitive to responsiveness. In such cases, we need to use a Garbage Collector that is tailored to minimize the pause time of major garbage collection events.

The Permanent generation (PermGen) contains all the metadata about your application, such as class declarations and methods. PermGen space is included in a full garbage collection event. Note that PermGen was replaced by Metaspace in Java 8 to get rid of its limitations. The best-known symptom of those limitations was OutOfMemoryError: because PermGen had a fixed size, a class metadata leak, or simply loading more classes than it could hold, ended in an OutOfMemoryError. Metaspace grows automatically by default and offers options to tune its size. By default its maximum is bounded only by the available system memory, but it can be capped with the JVM argument -XX:MaxMetaspaceSize.

OutOfMemoryError can still happen with Metaspace if the memory required for class metadata exceeds MaxMetaspaceSize or the memory available on the system.
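The spaces described in this section can be inspected from inside a running application. Below is a minimal sketch using the standard java.lang.management API; the class name MemoryPools is an illustrative assumption, and the pool names that are printed (Eden, Survivor, Old Gen, Metaspace, and so on) vary with the collector and JVM version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name MemoryPools is illustrative only.
class MemoryPools {

    // One line per memory pool: type (heap or non-heap), name, used and max bytes.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means no maximum is defined
            lines.add(pool.getType() + " | " + pool.getName()
                    + " | usedBytes=" + usage.getUsed()
                    + " | maxBytes=" + (max < 0 ? "undefined" : String.valueOf(max)));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On a JVM started without -XX:MaxMetaspaceSize, the Metaspace pool typically reports an undefined maximum, which matches the grow-by-default behavior described above.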

How it works

We have seen how the heap address space is structured to optimize the garbage collection process. We have also discussed how objects are garbage collected from the heap. But how does the Garbage Collector identify the objects that are eligible for garbage collection? For that, we need to talk about the special kind of live object used by the Garbage Collector: the GC Root. A GC Root is an object that is accessible from outside the heap, so the Garbage Collector always treats it as live. GC Roots are the starting points for traversing objects during the marking process. It is just like traversing a tree from the root to the leaves, checking which objects are reachable. An object is garbage collected if it cannot be reached from any GC Root during the marking process.

There are different types of GC Root objects available:

  • Class: Classes loaded by a system class loader; contains references to static variables as well.

  • Stack Local: Local variables and parameters to methods stored on the local stack.

  • Active Java Threads: All active Java threads

  • JNI References: Java objects that native code has created for JNI calls, including local variables, parameters to JNI methods, and global JNI references.

  • Monitors: Objects that are used as monitors for synchronization.

  • JVM-held objects: Specific objects that the JVM implementation keeps out of garbage collection for its own purposes. These might include important exception classes, system class loaders, or custom class loaders.
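The marking idea can be sketched as a plain graph traversal. This is a toy model under stated assumptions, not how any real collector is implemented: objects are represented by names, references by a map of edges, and the class MarkPhaseSketch and all object names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: object names and edges are hypothetical.
class MarkPhaseSketch {

    // Each key is an "object"; the value lists the objects it references.
    private final Map<String, List<String>> references = new HashMap<>();

    void addObject(String name, String... referencedObjects) {
        references.put(name, List.of(referencedObjects));
    }

    // Mark: walk the graph from the roots and collect everything reachable.
    Set<String> mark(Set<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) {
                pending.addAll(references.getOrDefault(current, List.of()));
            }
        }
        return marked;
    }

    // Sweep candidates: every known object that was not marked.
    Set<String> unreachable(Set<String> roots) {
        Set<String> garbage = new HashSet<>(references.keySet());
        garbage.removeAll(mark(roots));
        return garbage;
    }

    public static void main(String[] args) {
        MarkPhaseSketch heap = new MarkPhaseSketch();
        heap.addObject("staticField", "cache");
        heap.addObject("cache", "entry");
        heap.addObject("entry");
        heap.addObject("orphanA", "orphanB"); // a cycle with no path from a root
        heap.addObject("orphanB", "orphanA");

        System.out.println("garbage: " + heap.unreachable(Set.of("staticField")));
    }
}
```

Note that the two orphan objects reference each other and are still reported as garbage. Reachability from a root is what matters, not whether something still points at the object.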

Exercise

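Take a look at the code below to get a feel for the object lifecycle.

import java.util.ArrayList;
import java.util.HashMap;

public class GCDemo {
    // Static field: reachable for as long as the class GCDemo stays loaded.
    public static ArrayList<Object> list = new ArrayList<>();

    public void doSomething() {
        HashMap<String, Object> m = new HashMap<>();
        Object o1 = new Object(); // first object
        Object o2 = new Object(); // second object
        m.put("o1", o1);          // the map now references the first object
        o1 = o2;                  // o1 now points at the second object
        o1 = null;                // o1 points at nothing; o2 still holds the second object
        list.add(m);              // the static list now references the map
        m = null;                 // the map stays reachable through list
        System.gc();              // only a request to the JVM, not a guarantee
    }
}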

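If we draw the object dependency graph as it stands at the end of the method, it looks like this:
list -> HashMap (created as m) -> first Object (created as o1)

The second object (created as o2) is not part of this graph. It is referenced only by the local variable o2, so it becomes eligible for garbage collection once doSomething() returns.

The first object will not be eligible for garbage collection unless the class GCDemo is unloaded, the map is removed from list, or a new value is written to the list variable, because it is still reachable through the static variable list. As we know, static variables are class-level variables, and classes loaded by the system class loader are normally never unloaded. When the JVM exits, the operating system reclaims the memory that was not garbage collected.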

Note that even though we call System.gc(), there is no guarantee that the GC will run. The call merely signals the GC to consider a collection at that moment. Also note that it can trigger a major garbage collection, which means stop-the-world pauses and a performance hit. Calling System.gc() will not save you from OutOfMemoryError, and it is better to trust the garbage collector instead. Treat it as a hack for experimenting with GC events in your application.
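One way to observe reachability without relying on a collection actually happening is a WeakReference, which does not keep its target alive. The sketch below is an illustration only; the class ReachabilityDemo and its members are hypothetical names. The first check always holds because the object is held by a static list, while the second merely reports whether this particular System.gc() request happened to clear the unreferenced object.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class ReachabilityDemo {

    // Static field: anything stored here stays reachable while the class is loaded.
    static final List<Object> retained = new ArrayList<>();

    // Creates two objects, keeps only the first one reachable,
    // and returns weak references to both (index 0 = kept, index 1 = dropped).
    static List<WeakReference<Object>> allocate() {
        Object kept = new Object();
        Object dropped = new Object();
        retained.add(kept);
        return List.of(new WeakReference<Object>(kept), new WeakReference<Object>(dropped));
    }

    public static void main(String[] args) {
        List<WeakReference<Object>> refs = allocate();
        System.gc(); // only a request; the JVM may ignore it

        // Always true: the object is strongly reachable through the static list.
        System.out.println("kept still reachable: " + (refs.get(0).get() != null));
        // Often true after a collection, but never guaranteed.
        System.out.println("dropped was cleared:  " + (refs.get(1).get() == null));
    }
}
```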

Modern Garbage Collectors

In this section, let's focus on the modern garbage collectors introduced in more recent versions of Java. We will skip older collectors such as Serial GC, Parallel GC, and CMS (Concurrent Mark Sweep) GC in favor of recent ones. CMS was deprecated in JDK 9 and removed in JDK 14. The main difference is that the collectors available in recent Java versions support concurrent compact operations.

G1GC

The Garbage First Garbage Collector (G1GC) has been available since Java 7 and was introduced as a long-term replacement for the CMS garbage collector. G1GC incrementally compacts the memory after sweeping. It divides the heap into regions and keeps track of where reclaimable space has accumulated; the regions containing the most garbage are collected first, hence the name. Both mark and sweep are parallel and concurrent. However, the compact operation is not concurrent. G1GC has been the default garbage collector for the JVM since JDK 9. In recent versions of Java, G1GC has gained the ability to abort stop-the-world collections: it predicts the number of regions to collect and takes on the additional, optional part of that work only in a form that can be aborted if necessary.

To enable the G1 Collector, use the below JVM argument:
-XX:+UseG1GC

ZGC

ZGC was initially introduced as an experimental feature in JDK 11 and was declared production ready in JDK 15 (on JDK 11 through 14 it also requires -XX:+UnlockExperimentalVMOptions). Unlike G1GC, ZGC can also perform compact operations concurrently. ZGC is a low-pause garbage collector meant for low-latency applications. What makes ZGC unique is its stable low latency, with pause times that do not exceed 10ms. Pause times are also independent of the heap size. However, we need to ensure that sufficient heap memory is available for ZGC. ZGC uses three views of heap memory under the hood, so its memory footprint is larger than that of G1GC. Setting a maximum heap size is very important for ZGC. ZGC also provides concurrent class unloading. The application can keep running while ZGC does garbage collection in the background, except during thread stack scanning. We get better throughput and latency, but at the cost of a larger memory footprint.

To enable ZGC, use the below JVM argument:
-XX:+UseZGC

Shenandoah

The Shenandoah garbage collector was introduced as an experimental feature in JDK 12. This garbage collector can also perform concurrent compact operations. To achieve concurrent relocation of objects, it uses something called a Brooks pointer: a forwarding pointer stored with each object that initially points to the object itself. When an object is relocated, the old pointer is not removed. Instead, it is updated to point to the object's new location. The compact operation is thereby decoupled from the sweep operation. The pause time is also independent of the heap size. So it doesn't matter how large the heap is: whether it is 2MB or 200GB, low pause times are always guaranteed. Unlike ZGC, it doesn't require a large heap.
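The forwarding idea can be illustrated with a toy model. This is only a sketch of the concept under simplifying assumptions, not Shenandoah's actual implementation: the classes BrooksPointerSketch and Cell, and the methods resolve and relocate, are hypothetical, and a real collector performs these steps inside the JVM with barriers and atomic updates.

```java
// Toy model only: these classes are hypothetical and do not mirror JVM internals.
class BrooksPointerSketch {

    static final class Cell {
        int payload;
        Cell forwardee = this; // the forwarding pointer initially points to the object itself

        Cell(int payload) {
            this.payload = payload;
        }
    }

    // Every access goes through the forwarding pointer.
    static Cell resolve(Cell cell) {
        return cell.forwardee;
    }

    // "Relocation": copy the object, then redirect the old copy to the new one.
    static Cell relocate(Cell old) {
        Cell copy = new Cell(old.payload);
        old.forwardee = copy;
        return copy;
    }

    public static void main(String[] args) {
        Cell original = new Cell(42);
        Cell staleReference = original;       // e.g. still held by an application thread

        relocate(original);                   // the object "moves" while references exist
        resolve(staleReference).payload = 43; // the write lands in the new copy

        System.out.println(resolve(original).payload); // prints 43
    }
}
```

Because the stale reference is resolved through the forwarding pointer, the application keeps working with the relocated copy without having to be paused while every reference is rewritten.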

To enable Shenandoah, use the below JVM argument:
-XX:+UseShenandoahGC
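Whichever of these flags you choose, it helps to confirm what the running JVM actually received, since start scripts and containers sometimes override arguments. The sketch below uses the standard RuntimeMXBean and Runtime APIs; the class name JvmSettings and the filter on the -XX: and -Xm prefixes are illustrative assumptions.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the name JvmSettings is illustrative only.
class JvmSettings {

    // JVM arguments that look like tuning flags (-XX:... options and -Xms/-Xmx/-Xmn).
    static List<String> tuningArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-XX:") || arg.startsWith("-Xm"))
                .collect(Collectors.toList());
    }

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("tuning arguments: " + tuningArguments());
        System.out.println("max heap (MiB):   " + maxHeapMiB());
    }
}
```

Started with, for example, -XX:+UseG1GC and an -Xmx value, the first line should list both arguments and the second should roughly reflect the configured maximum heap.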

Use cases

Both Shenandoah and ZGC are capable of concurrent operations. Is one of them the default choice over G1? Not really. No single GC is better than all the others. It depends on what you need.

The G1 collector is designed for applications that:

  • Need a collector that can operate concurrently with application threads.

  • Need free space compacted without lengthy GC-induced pause times.

  • Need more predictable GC pause durations.

  • Do not want to sacrifice throughput.

  • Do not require a much larger Java heap.

So, the G1 collector is more of a stable, generic solution unless you have specific requirements for low pause times. G1 has been the default garbage collector since Java 9.

ZGC is designed for applications that:

  • Require low pause times and/or use a very large heap (multi-terabytes).

  • Require stable responsiveness on a large volume of GC operations.

  • Are hosted on a Linux x86 64-bit system (currently supported; ARM support is underway).

  • Are using JDK 11+.

Shenandoah is designed for applications that:

  • Require low pause times irrespective of the heap size (low or high).

  • Require stable responsiveness on a large volume of GC operations.

  • Are hosted on Linux, on either x86 or ARM, in 32-bit or 64-bit variants.

  • Are still using JDK 8 (Downstream backport to OpenJDK 8u is available).

Now let's go through some benchmarks to see these garbage collectors in action. All the benchmarks mentioned below are from Hazelcast Jet. The tests were run on an AWS EC2 c5.4xlarge instance with 16 vCPUs and 32 GB of RAM. Here is a benchmark at 2M items/second:

(source: Hazelcast benchmark)

The behavior of all three GCs is pretty much the same up to around the 92nd percentile (meaning that 92% of the traffic experiences this latency or lower). We can see that G1 performs worst for the remaining 8% of traffic, where its latency comfortably exceeds 10ms. We can also observe that both Shenandoah and ZGC stay almost stable up to the 99th percentile.
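To make the percentile reading concrete, here is a minimal sketch that computes a latency percentile with the nearest-rank method. The sample values are made up for illustration and are not taken from the Hazelcast benchmark, and the class name LatencyPercentile is hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class; sample data below is invented for illustration.
class LatencyPercentile {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] latenciesMs, double p) {
        if (latenciesMs.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        double[] samplesMs = {1, 1, 1, 2, 2, 2, 3, 3, 12, 40};
        System.out.println("p50 = " + percentile(samplesMs, 50) + " ms");
        System.out.println("p90 = " + percentile(samplesMs, 90) + " ms");
        System.out.println("max = " + percentile(samplesMs, 100) + " ms");
    }
}
```

In this made-up data set most samples are fast, yet the tail is dominated by two slow outliers. That is the same shape the benchmark shows: collectors that look identical at the median can differ sharply in the last few percent.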

Now if the throughput increased to 3M items per second:

(source: Hazelcast benchmark)

The above benchmark was run after applying the pacer improvement to Shenandoah GC. The pacer is a technique for graceful degradation during failure mode. It is off-topic for this article, so I'm skipping it for the moment. We can observe that G1 and ZGC perform about the same when the throughput is increased to 3M items/second, while the pacer improvement had a big impact on Shenandoah's pause times.

Now let's take a look at another benchmark of varying throughputs:

ZGC stayed almost stable, with low pause times, until throughput reached 8M items/second. Shenandoah performed the worst, but that is because the pacer improvements are applied. Otherwise, its latency would sit somewhere between that of G1 and ZGC. However, we can observe that neither ZGC nor Shenandoah is stable beyond 8M items/second.

ZGC is the clear winner for throughput in the range of 2M-8M items/second. G1 had high pause times (> 10ms) throughout, but it was very stable, with manageable pause times at every throughput level. I hope you now have an intuition for why G1 is kept as the default garbage collector. As I mentioned earlier, it is pretty much all you need in most cases.