30 December
Performance Comparison of Modern Garbage Collectors
Every garbage collector has specific characteristics that make it distinct from the others. Understanding how these characteristics affect the different performance metrics of a particular program is a complex challenge. Performance depends not only on the topology and number of objects on the heap, but also on the access patterns of the application. Additionally, the metrics are not independent of one another. Measurements with benchmark suites that comprise a small number of real use-case programs can hide problems that do not show up in such a small sample of scenarios. Another problem with using a small benchmark suite is that a garbage collector may be optimised for specific environments and may therefore introduce bias into the experimental results.
The goal is to understand how different garbage collectors affect different metrics, in particular latency, throughput, and memory usage. Using this knowledge, given an application with particular performance requirements (with respect to application throughput, latency, and memory usage), we give hints as to which garbage collector implementation is most suitable to fulfil them.
To obtain the results on which to compare the trade-offs between the selected garbage collectors (i.e., ParallelOld, CMS, G1, Shenandoah, and ZGC), we use a combination of real-world applications and data running on top of an industrial-grade JVM, coupled with real-world-based benchmarks to analyse the performance.
This research aims to advance the field of modern garbage collectors for Big Data environments through the following contributions:
• Provide better knowledge on how to match applications to a specific garbage collector, taking into consideration the application needs and the garbage collector tradeoffs.
• Give hints w.r.t. which garbage collector to pick when we want to optimize a particular performance metric.
• Provide an overview of the state-of-the-art garbage collectors developed for the Java Virtual Machine and how they improve upon older implementations.
• Design and develop a fine-grained benchmark meant to stress-test specific GC components, e.g., write barriers and read barriers.
The Shenandoah garbage collector, like ZGC, aims to reduce pause times on large heaps. It is also a region-based, non-generational collector that uses principles similar to ZGC’s but follows a different implementation strategy. Instead of coloured pointers, Shenandoah uses Brooks pointers to allow concurrent compaction of memory.
The Z Garbage Collector, also known as ZGC, is an experimental, scalable, low-latency collector built to handle heaps ranging from relatively small to potentially multi-terabyte sizes. ZGC’s pause times are not supposed to exceed 10 ms, even as the heap or live-set size grows. Like the Garbage-First (G1) collector, ZGC is region-based; unlike G1, it is not generational. It improves upon G1 by achieving concurrent compaction through two core techniques: read barriers and coloured pointers.
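The two mechanisms named above can be illustrated with a toy model. The sketch below is only a conceptual illustration, not how either collector is implemented: real collectors do this inside the JVM on raw memory, and the class name, the `Obj` type, and the bit layout used for the coloured pointer are all assumptions made for the example.

```java
// Conceptual model only: real collectors implement these mechanisms inside the JVM.
class ConcurrentCompactionSketch {

    /** A heap object carrying one extra header word: the Brooks forwarding pointer. */
    static final class Obj {
        Obj forwardee = this; // points to the object itself until it is moved
        int payload;

        Obj(int payload) {
            this.payload = payload;
        }
    }

    /** Barrier: every access first follows the forwarding pointer. */
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    /** Evacuation: copy the object, then redirect the old copy to the new one. */
    static Obj evacuate(Obj old) {
        Obj copy = new Obj(old.payload);
        old.forwardee = copy; // a real collector would install this atomically
        return copy;
    }

    // Coloured pointers: metadata bits kept in otherwise unused pointer bits.
    // This layout is hypothetical and chosen only for illustration.
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long REMAPPED_BIT = 1L << ADDRESS_BITS;

    /** Strips the metadata bits and returns the plain address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** Gives the pointer the "good" colour after it has been fixed up. */
    static long markRemapped(long pointer) {
        return pointer | REMAPPED_BIT;
    }

    /** Read-barrier check: a pointer without the good colour takes a slow path. */
    static boolean needsSlowPath(long pointer) {
        return (pointer & REMAPPED_BIT) == 0;
    }

    public static void main(String[] args) {
        Obj original = new Obj(42);
        Obj staleRef = original;        // a reference that still points at the old copy
        evacuate(original);             // the collector moves the object
        resolve(staleRef).payload = 43; // the write lands on the new copy
        System.out.println(resolve(staleRef).payload); // prints 43

        long pointer = 0x1000L;
        System.out.println(needsSlowPath(pointer));               // prints true
        System.out.println(needsSlowPath(markRemapped(pointer))); // prints false
    }
}
```

In both cases the application can keep running while objects move: with a forwarding pointer the indirection lives in the object, whereas with a coloured pointer the state lives in the reference and is checked when the reference is loaded.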
Depending on an application’s needs, some performance metrics matter more than others. An example is one of the applications developed by Futuregen Skill. It validates transactions using near real-time machine learning to analyse data stored in JVM-powered databases. Failing a Service Level Agreement (SLA) for a particular transaction because of a long garbage collection pause during a database read or write operation would have a significant negative impact on the company. Garbage collectors can usually be divided into three groups regarding performance:
i) those that offer a guarantee of low pause times, typically under ten milliseconds, such as ZGC and Shenandoah;
ii) those that seek to achieve the best throughput possible, such as ParallelOld;
iii) and those that attempt to strike a compromise, keeping pause times low without sacrificing too much of the application’s throughput, such as the CMS and G1 collectors.
To better understand the trade-offs between these groups, we study specific performance metrics using the benchmarks described in Chapter 4. In particular, we focus our analysis on application throughput, memory utilisation, and latency.
The experiments consist of all possible combinations of the following:
1. We use several garbage collectors, more specifically ParallelOld, CMS, G1, ZGC, and Shenandoah.
2. We increase the size of the Java Virtual Machine (OpenJDK 11 HotSpot) heap so that we can observe how each garbage collector behaves with differently sized heaps and interpret why it performs that way.
3. We use the benchmarks, which determine the type and number of accesses used in each experiment.
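As a rough sketch of how such an experiment matrix could be enumerated, the snippet below builds one `java` command line per combination of collector, heap size, and benchmark. The collector flags are standard HotSpot options for JDK 11 (Shenandoah is only present in builds that include it), but the heap sizes, benchmark jar names, log file naming, and the `ExperimentMatrix` class itself are assumptions for illustration, not the setup used in the study.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExperimentMatrix {

    // Collector name -> HotSpot flags that select it on JDK 11.
    static final Map<String, List<String>> GC_FLAGS = new LinkedHashMap<>();
    static {
        GC_FLAGS.put("ParallelOld", List.of("-XX:+UseParallelOldGC"));
        GC_FLAGS.put("CMS", List.of("-XX:+UseConcMarkSweepGC"));
        GC_FLAGS.put("G1", List.of("-XX:+UseG1GC"));
        // ZGC and Shenandoah are experimental on JDK 11 and must be unlocked first.
        GC_FLAGS.put("ZGC", List.of("-XX:+UnlockExperimentalVMOptions", "-XX:+UseZGC"));
        GC_FLAGS.put("Shenandoah",
                List.of("-XX:+UnlockExperimentalVMOptions", "-XX:+UseShenandoahGC"));
    }

    /** Builds one command line for every collector x heap size x benchmark combination. */
    static List<List<String>> commands(List<String> heapSizes, List<String> benchmarkJars) {
        List<List<String>> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> gc : GC_FLAGS.entrySet()) {
            for (String heap : heapSizes) {
                for (String jar : benchmarkJars) {
                    List<String> cmd = new ArrayList<>();
                    cmd.add("java");
                    cmd.addAll(gc.getValue());
                    cmd.add("-Xms" + heap); // fix the heap size so runs are comparable
                    cmd.add("-Xmx" + heap);
                    // Unified JVM logging: write GC events to a per-run file.
                    cmd.add("-Xlog:gc*:file=gc-" + gc.getKey() + "-" + heap + "-" + jar + ".log");
                    cmd.add("-jar");
                    cmd.add(jar);
                    result.add(cmd);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical heap sizes and benchmark jars.
        List<String> heaps = List.of("4g", "8g", "16g");
        List<String> jars = List.of("benchmark-a.jar", "benchmark-b.jar");
        for (List<String> cmd : commands(heaps, jars)) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

With five collectors, three heap sizes, and two benchmarks, this prints 30 command lines; each could then be launched with `ProcessBuilder` or copied into a shell script.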
The results were extracted either directly from the log file produced by the JVM (we did not change the JVM’s logging infrastructure) or from the benchmark log file. From the JVM log, we extracted the memory available before and after each garbage collection cycle, for memory usage, and the time application threads had to halt execution for a garbage collection cycle to be performed, for latency. The throughput metric was extracted from the benchmark log file.
The experiment results were analysed in different ways according to the performance metric in question. For application throughput, we extracted the number of operations per second performed by an application and report it with a 95% confidence interval. Latency was measured across multiple percentiles of all pauses. Lastly, memory utilisation was determined as the percentage of the defined total heap space used by an application, with a 95% confidence interval; we also extracted the maximum memory usage the application reached during the benchmark execution.
We use 95% confidence intervals rather than higher confidence levels so that the resulting intervals are tight enough to differentiate between the garbage collectors without losing much confidence. At higher confidence levels the intervals become significantly wider, which causes considerable overlap when comparing results between different garbage collectors.
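The three metric calculations described above can be sketched as follows. The text does not say how the confidence intervals or percentiles were computed, so this example assumes a normal approximation (z = 1.96) for the 95% interval of the mean and the nearest-rank method for percentiles; the `GcMetrics` class and the sample values are hypothetical.

```java
import java.util.Arrays;

class GcMetrics {

    /** Nearest-rank percentile (p in 0..100) of a set of pause times. */
    static double percentile(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /**
     * Half-width of a 95% confidence interval for the mean, using the sample
     * standard deviation and a normal approximation. For small samples a
     * Student t quantile would be more appropriate than 1.96.
     */
    static double ci95HalfWidth(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        double sd = Math.sqrt(squares / (xs.length - 1));
        return 1.96 * sd / Math.sqrt(xs.length);
    }

    /** Heap utilisation as a percentage of the configured total heap. */
    static double utilisationPercent(double usedMb, double totalHeapMb) {
        return 100.0 * usedMb / totalHeapMb;
    }

    public static void main(String[] args) {
        // Hypothetical samples, for illustration only.
        double[] opsPerSecond = {10_200, 9_950, 10_480, 10_100, 10_310};
        double[] pausesMs = {1.2, 0.8, 3.5, 0.9, 1.1, 7.4, 1.0, 0.7, 2.2, 1.3};

        System.out.printf("throughput: %.1f +/- %.1f ops/s%n",
                mean(opsPerSecond), ci95HalfWidth(opsPerSecond));
        for (double p : new double[] {50, 90, 99}) {
            System.out.printf("p%.0f pause: %.1f ms%n", p, percentile(pausesMs, p));
        }
        System.out.printf("utilisation: %.1f%%%n", utilisationPercent(512, 2048));
    }
}
```

Two collectors can then be called distinguishable on a metric when their intervals (mean plus or minus the half-width) do not overlap, which is the comparison the choice of a 95% level is meant to keep practical.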
We’re looking forward to testing ZGC ourselves to get an idea of how its performance varies by workload. ZGC is an exciting new garbage collector that is designed to offer very low pause times on large heaps. It does this through the use of coloured pointers and load barriers, which are GC techniques new to HotSpot and which open up some other interesting future possibilities.