G1 Garbage Collector Explained

In G1 GC, the heap is divided into equal-sized regions. Each region can belong to one of the following categories:

Eden: Where new objects are allocated.
Survivor: Where objects that survive an initial garbage collection are moved.
Old: Where objects that have survived multiple garbage collections are promoted.
Humongous: Where very large objects (larger than half of a region) are allocated.

Key Concepts

1. Region-Based Memory Management:

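Each region is typically between 1 MB and 32 MB, and its size is always a power of two.
By default the JVM derives the region size from the heap size. You can override it with the -XX:G1HeapRegionSize parameter; the number of regions then follows from the heap size divided by the region size.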
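
As a rough illustration of how region size and region count relate, here is a minimal sketch. The class and method names are hypothetical, and the target of about 2048 regions is an assumption based on the commonly documented G1 ergonomics; the real JVM heuristic has more inputs than this.

```java
final class RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION_BYTES = MB;        // lower bound mentioned in the text
    static final long MAX_REGION_BYTES = 32 * MB;   // upper bound mentioned in the text
    static final long TARGET_REGION_COUNT = 2048;   // assumption: approximate ergonomic target

    /** Approximates a region size for the given heap size (simplified, not the JVM's exact logic). */
    static long approxRegionSizeBytes(long heapBytes) {
        long raw = Math.max(1L, heapBytes / TARGET_REGION_COUNT);
        long powerOfTwo = Long.highestOneBit(raw);  // round down to a power of two
        return Math.max(MIN_REGION_BYTES, Math.min(MAX_REGION_BYTES, powerOfTwo));
    }

    /** Number of whole regions that fit into the heap. */
    static long regionCount(long heapBytes, long regionBytes) {
        return heapBytes / regionBytes;
    }
}
```

Under these assumptions a 2 GB heap ends up with 1 MB regions, while a very large heap is capped at 32 MB regions and simply gets more of them.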

2. Young and Old Generation:

Unlike traditional generational collectors that have contiguous spaces for young and old generations, G1 has regions that can dynamically change their role.
Young Generation consists of Eden and Survivor regions.
Old Generation consists of Old regions.

3. Humongous Objects:

Objects larger than half of a region are considered humongous. Each one is allocated in its own set of contiguous regions, and those regions are counted as part of the Old Generation.
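
The humongous rule above can be sketched in a few lines. The class and method names are hypothetical, and the comparison follows the "larger than half of a region" wording used here rather than any particular JVM source.

```java
final class HumongousCheck {
    /** True if an object of this size counts as humongous for the given region size. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of contiguous regions needed to hold the object (ceiling division). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }
}
```

For example, with 1 MB regions a 2.5 MB array needs three contiguous regions, and the unused tail of the last region cannot hold other objects. This is one reason frequent humongous allocations are worth watching for.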

G1 GC Phases

1. Initial Marking:

Marks all objects that are directly reachable from the root set.
It's a quick stop-the-world pause that piggybacks on a young collection.

2. Concurrent Marking:

Runs concurrently with the application.
Traverses the object graph and marks reachable objects.

3. Final Remark:

Completes the marking process with another stop-the-world pause.
Accounts for reference changes the application made during the concurrent phase, so that no live object is left unmarked.

4. Cleanup:

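Uses the marking results to identify the regions with the most garbage.
Regions that contain no live objects at all can be reclaimed immediately; the remaining candidates are evacuated later, during mixed collections.
Involves a short stop-the-world pause.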
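
To make the marking idea concrete, here is a toy sketch of reachability marking over a small object graph. It is not G1's algorithm: it runs in one go with nothing mutating the graph, whereas G1 marks concurrently with the application. Object ids, the reference map, and the class name are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ToyMarking {
    /** Returns the ids of all objects reachable from the roots by following references. */
    static Set<Integer> mark(List<Integer> roots, Map<Integer, List<Integer>> references) {
        Set<Integer> marked = new HashSet<>();
        // Start from the root set, as the initial marking step does.
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int object = pending.pop();
            if (marked.add(object)) {
                // First visit: queue everything this object points to.
                pending.addAll(references.getOrDefault(object, List.of()));
            }
        }
        return marked;
    }
}
```

Anything left unmarked at the end is garbage. In this toy model, objects that only reference each other but are not reachable from a root stay unmarked.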

Example Workflow

Let's go through an example to illustrate these concepts:

1. Object Allocation

New objects are allocated in Eden regions. Suppose we have a heap with 10 regions for simplicity, and 4 of them are initially designated as Eden regions.

2. Minor GC (Young Collection)

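When the Eden regions fill up, a minor GC (young collection) is triggered. This is a stop-the-world pause in which live objects in Eden are copied to Survivor regions. Objects that survive enough minor GCs (the limit is controlled by -XX:MaxTenuringThreshold) are promoted to Old regions.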
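
The aging and promotion step can be sketched as a toy simulation. The class name, the age bookkeeping, and the exact comparison against the threshold are simplifying assumptions for illustration; the JVM's real tenuring logic is more involved.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyYoungCollection {
    /** Outcome of one simulated young collection. */
    static final class Result {
        final List<Integer> survivorAges = new ArrayList<>(); // ages after the collection
        final List<Integer> promotedAges = new ArrayList<>(); // ages at the time of promotion
    }

    /**
     * Each input value is the number of collections a live object has already survived.
     * Objects at or above the threshold are promoted; the others age by one and stay young.
     */
    static Result collect(List<Integer> liveObjectAges, int tenuringThreshold) {
        Result result = new Result();
        for (int age : liveObjectAges) {
            if (age >= tenuringThreshold) {
                result.promotedAges.add(age);       // moves to an Old region
            } else {
                result.survivorAges.add(age + 1);   // copied to a Survivor region
            }
        }
        return result;
    }
}
```

With a threshold of 2, objects of age 0 and 1 stay in Survivor regions one collection older, and an object of age 2 is promoted.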

3. Concurrent Marking

The concurrent marking phase starts when heap occupancy reaches a threshold, set by -XX:InitiatingHeapOccupancyPercent (45 percent by default). It marks all reachable objects throughout the heap.
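
A minimal sketch of the threshold check follows. It assumes the threshold is a plain percentage of heap capacity, with 45 as the default value of -XX:InitiatingHeapOccupancyPercent; the class and method names are hypothetical, and recent JDKs may measure occupancy differently and adapt the threshold at run time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class OccupancyCheck {
    static final int DEFAULT_IHOP_PERCENT = 45; // default of -XX:InitiatingHeapOccupancyPercent

    static double occupancyPercent(long usedBytes, long capacityBytes) {
        return 100.0 * usedBytes / capacityBytes;
    }

    /** Simplified rule: marking would start once occupancy reaches the threshold. */
    static boolean wouldStartMarking(long usedBytes, long capacityBytes, int thresholdPercent) {
        return occupancyPercent(usedBytes, capacityBytes) >= thresholdPercent;
    }

    public static void main(String[] args) {
        // Reads the current heap usage of this JVM through the standard management API.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap occupancy: %.1f%% of committed memory%n",
                occupancyPercent(heap.getUsed(), heap.getCommitted()));
    }
}
```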

4. Mixed GC (Mixed Collection)

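After marking completes, G1 runs mixed collections, which collect all Young regions together with a subset of the Old regions. Old regions with the most garbage are collected first, because they free the most space for the least copying work.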
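
The "most garbage first" choice can be sketched as a simple sort. This is an illustration only: the OldRegion type, its fields, and the fixed region limit are hypothetical, and the real collector also weighs the pause time goal when deciding how many regions to take.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ToyCollectionSet {
    /** A candidate Old region with its liveness as determined by marking. */
    static final class OldRegion {
        final String name;
        final long liveBytes;
        final long regionBytes;

        OldRegion(String name, long liveBytes, long regionBytes) {
            this.name = name;
            this.liveBytes = liveBytes;
            this.regionBytes = regionBytes;
        }

        long garbageBytes() {
            return regionBytes - liveBytes;
        }
    }

    /** Picks up to maxRegions candidates, preferring those with the most garbage. */
    static List<String> chooseOldRegions(List<OldRegion> candidates, int maxRegions) {
        List<OldRegion> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong(OldRegion::garbageBytes).reversed());
        List<String> chosen = new ArrayList<>();
        for (OldRegion region : sorted) {
            if (chosen.size() == maxRegions) {
                break;
            }
            chosen.add(region.name);
        }
        return chosen;
    }
}
```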

Visualization

Consider a heap with 10 regions for simplicity. Here’s a step-by-step illustration:

1. Initial State:

[E] [E] [E] [E] [S] [S] [O] [O] [O] [H]

E: Eden region
S: Survivor region
O: Old region
H: Humongous region

2. After Object Allocation in Eden:

[E*] [E*] [E*] [E*] [S] [S] [O] [O] [O] [H]

Objects are allocated in Eden regions (marked with *).

3. After Minor GC:

[E ] [E ] [E ] [E ] [S*] [S*] [O] [O] [O] [H]

Live objects in Eden are moved to Survivor regions (marked with *).

4. After Promoting Long-Lived Objects to Old Regions:

[E ] [E ] [E ] [E ] [S ] [S ] [O*] [O*] [O] [H]

Objects that survive multiple GCs are promoted to Old regions (marked with *).

5. Concurrent Marking Phase:

[E ] [E ] [E ] [E ] [S ] [S ] [O*] [O*] [O] [H*]

Concurrent marking marks all reachable objects in the heap.

6. Mixed GC Phase (Collecting Eden, Survivor, and some Old regions):

[E ] [E ] [E ] [E ] [S ] [S ] [O ] [O ] [O] [H]

Regions with the most garbage are collected first, reclaiming space efficiently.

Configuring G1 GC

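You can enable and tune G1 GC with JVM options. The example below selects G1, sets a pause time goal of 200 ms (a target, not a guarantee), and fixes the heap size at 2 GB: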

java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xms2g -Xmx2g -jar my-application.jar
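
To confirm from inside the application which collector is actually running, one option is the standard java.lang.management API, sketched below. The class name is hypothetical, and the check relies on the assumption that HotSpot reports G1's collectors under names starting with "G1" (for example "G1 Young Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** Assumption: G1 collectors are reported with names that start with "G1". */
    static boolean looksLikeG1(List<String> names) {
        for (String name : names) {
            if (name.startsWith("G1")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println(names + " -> G1 in use: " + looksLikeG1(names));
    }
}
```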

Improving Code Cache Size

The code cache, where the JVM stores JIT-compiled code, is sized separately from the heap. To adjust its size, you can use the following JVM options:

-XX:ReservedCodeCacheSize=<size>
-XX:InitialCodeCacheSize=<size>

For example:

java -XX:ReservedCodeCacheSize=256m -XX:InitialCodeCacheSize=64m \
     -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xms2g -Xmx2g \
     -jar my-application.jar

This configuration reserves 256 MB for the code cache and sets the initial size to 64 MB. Adjust these values based on your application's needs.
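
Before changing these values it helps to see how much of the code cache is in use. The sketch below reads the JVM's memory pools through the standard management API. The class name is hypothetical, and the pool names it matches ("CodeHeap ...", "CodeCache", "Code Cache") are an assumption about how HotSpot labels them, which may vary between JDK versions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class CodeCacheReport {
    /** Assumption: HotSpot names code cache pools "CodeHeap ...", "CodeCache" or "Code Cache". */
    static boolean isCodeCachePool(String poolName) {
        return poolName.startsWith("CodeHeap")
                || poolName.equals("CodeCache")
                || poolName.equals("Code Cache");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null && isCodeCachePool(pool.getName())) {
                // getMax() returns -1 when the pool has no defined maximum.
                System.out.printf("%s: used=%d KB, max=%d bytes%n",
                        pool.getName(), usage.getUsed() / 1024, usage.getMax());
            }
        }
    }
}
```

If the used size stays close to the maximum over time, that is a hint the reserved size may be too small for the application.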

Conclusion

G1 GC's region-based approach allows for more flexible and efficient memory management compared to traditional garbage collectors. By dividing the heap into regions and focusing on regions with the most garbage, G1 GC can meet pause time goals while efficiently reclaiming space. Understanding these concepts helps in tuning and optimizing the performance of applications running with G1 GC.

Harshana's blog
