// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
= Memory and JVM Tuning

This article provides memory tuning best practices that apply to deployments both with and without native persistence or external storage.
Even though Ignite stores data and indexes off the Java heap, the Java heap is still used for objects generated by
queries and operations executed by your applications.
Thus, certain JVM and garbage collection (GC) optimizations should be considered.

[NOTE]
====
[discrete]
Refer to the link:perf-and-troubleshooting/persistence-tuning[persistence tuning] article for disk-related
optimization practices.
====

== Tune Swappiness Setting

An operating system starts swapping pages from RAM to disk when overall RAM usage hits a certain threshold.
Swapping can significantly degrade Ignite cluster performance.
You can adjust the operating system's settings to prevent this from happening.
On Linux, the best option is to decrease the `vm.swappiness` parameter to `10`, or to set it to `0` if native persistence is enabled:

[source,shell]
----
sysctl -w vm.swappiness=0
----

This setting can also prolong GC pauses. For instance, if your GC logs show `low user time, high
system time, long GC pause` records, the cause might be Java heap pages being swapped in and out. To
address this, use the `vm.swappiness` settings above.

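For example, to check the current value, apply the recommended setting, and persist it across reboots (the `sysctl.d` file name below is an arbitrary choice):

```shell
# Show the current swappiness value
sysctl vm.swappiness

# Apply the new value immediately (requires root)
sysctl -w vm.swappiness=10

# Persist the setting across reboots
echo 'vm.swappiness=10' | tee /etc/sysctl.d/99-ignite.conf
```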
== Share RAM with OS and Apps

An individual machine's RAM is shared among the operating system, Ignite, and other applications.
As a general recommendation, if an Ignite cluster is deployed in pure in-memory mode (native
persistence is disabled), you should not allocate more than 90% of the machine's RAM to Ignite nodes.

If native persistence is used, the OS requires extra RAM for its page cache in order to sync data to disk optimally.
Unless the page cache is disabled, you should not give more than 70% of the server's RAM to Ignite.

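As a quick illustration of the 70% guideline, here is a small shell calculation; the 64 GiB server size is just an example:

```shell
#!/bin/sh
# Rough memory budget for an Ignite node on a server with native
# persistence enabled, following the 70% guideline above.
total_gib=64                             # example server RAM in GiB
ignite_gib=$((total_gib * 70 / 100))     # heap + off-heap data regions
os_gib=$((total_gib - ignite_gib))       # OS page cache and other apps

echo "Ignite (heap + off-heap): ${ignite_gib} GiB"
echo "OS page cache and apps:   ${os_gib} GiB"
```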
Refer to link:memory-configuration/data-regions[memory configuration] for configuration examples.

In addition, because native persistence can cause high page cache utilization, the `kswapd` daemon, which performs background page reclamation for the page cache, might not keep up.
As a result, the system can fall back to direct page reclamation, causing high latencies and long GC pauses.

To work around the effects caused by page memory reclamation on Linux, add extra bytes between `wmark_min` and `wmark_low` with `/proc/sys/vm/extra_free_kbytes` (note that this parameter is not available in all kernels):

[source,shell]
----
sysctl -w vm.extra_free_kbytes=1240000
----

Refer to link:https://events.static.linuxfound.org/sites/events/files/lcjp13_moriya.pdf[this resource, window=_blank]
for more insight into the relationship between page cache settings, high latencies, and long GC pauses.

== Java Heap and GC Tuning

Even though Ignite keeps data in its own off-heap memory regions, invisible to Java garbage collectors, the Java
heap is still used for objects generated by your applications' workloads.
For instance, whenever you run SQL queries against an Ignite cluster, the queries access data and indexes stored in
off-heap memory, while the result sets of such queries are kept in the Java heap until your application reads them.
Thus, depending on the throughput and the type of operations, the Java heap can still be utilized heavily, and this might require
JVM and GC tuning for your workloads.

We've included some common recommendations and best practices below.
Feel free to start with them and make further adjustments as necessary, depending on the specifics of your applications.

[NOTE]
====
[discrete]
Refer to the link:perf-and-troubleshooting/troubleshooting#debugging-gc-issues[GC debugging techniques] section for best
practices on collecting GC logs and heap dumps.
====

=== Generic GC Settings

Below are example JVM configurations for applications that utilize the Java heap on server nodes heavily, thus
triggering long, or frequent short, stop-the-world GC pauses.

For JDK 8 and later deployments, use the G1 garbage collector.
The settings below are a good starting point if a 10 GB heap is enough for your server nodes:

[source,shell]
----
-server
-Xms10g
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseG1GC
-XX:+ScavengeBeforeFullGC
-XX:+DisableExplicitGC
----

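One way to apply these flags, assuming you start nodes with the bundled `ignite.sh` script, is the `JVM_OPTS` environment variable that the script reads; the configuration file path below is a placeholder:

```shell
# Export the GC flags before starting the node; ignite.sh picks up JVM_OPTS.
export JVM_OPTS="-server -Xms10g -Xmx10g -XX:+AlwaysPreTouch \
-XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC"

# Start the node (replace the config path with your own file)
./bin/ignite.sh config/ignite-config.xml
```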
If G1 does not work for you, consider using the CMS collector and starting with the following settings.
Note that CMS was deprecated in JDK 9 and removed in JDK 14, so this option applies only to older JDKs.
The 10 GB heap is used as an example, and a smaller heap can be enough for your use case:

[source,shell]
----
-server
-Xms10g
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenSweepingEnabled
-XX:+ScavengeBeforeFullGC
-XX:+CMSScavengeBeforeRemark
-XX:+DisableExplicitGC
----

[NOTE]
====
[discrete]
If you use link:persistence/native-persistence[Ignite native persistence], we recommend that you set the
`MaxDirectMemorySize` JVM parameter to `walSegmentSize * 4`.
With the default WAL settings, this value is equal to 256MB.
====

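The 256 MB figure follows from a default WAL segment size of 64 MB (an assumption; verify it against your WAL configuration), as this quick calculation shows:

```shell
#!/bin/sh
# Derive MaxDirectMemorySize from the WAL segment size.
wal_segment_mb=64                      # assumed default walSegmentSize in MB
max_direct_mb=$((wal_segment_mb * 4))  # recommended multiplier from above

echo "-XX:MaxDirectMemorySize=${max_direct_mb}m"
```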
=== Advanced Memory Tuning

In Linux and Unix environments, an application can face long GC pauses or reduced performance because of
I/O or memory starvation caused by kernel-specific settings.
This section provides some guidelines on how to modify kernel settings in order to overcome long GC pauses.

[WARNING]
====
[discrete]
All the shell commands given below were tested on RedHat 7.
They may differ for your Linux distribution.
Before changing the kernel settings, make sure to check the system statistics and logs to confirm that you really have a problem.
Consult your IT department before making changes at the Linux kernel level in production.
====

If GC logs show `low user time, high system time, long GC pause` records, then memory constraints are most likely triggering swapping or free-memory scanning:

* Check and adjust the link:perf-and-troubleshooting/memory-tuning#tune-swappiness-setting[swappiness settings].
* Add `-XX:+AlwaysPreTouch` to the JVM settings on startup.
* Disable NUMA zone-reclaim optimization.
+
[source,shell]
----
sysctl -w vm.zone_reclaim_mode=0
----

* Turn off Transparent Huge Pages if a RedHat distribution is used.
+
[source,shell]
----
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
----

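To check whether Transparent Huge Pages are currently active, inspect the kernel's reported mode; the path below is the mainline kernel location, while older RedHat kernels use the `redhat_transparent_hugepage` path shown above:

```shell
# The active mode is printed in brackets, e.g. "always madvise [never]"
cat /sys/kernel/mm/transparent_hugepage/enabled
```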
=== Advanced I/O Tuning

If GC logs show `low user time, low system time, long GC pause` records, then GC threads might be spending too much time in kernel space, blocked by various I/O activities.
For instance, this can be caused by journal commits, gzip compression, or log rollover procedures.

As a solution, you can try changing the page flushing interval from the default 30 seconds to 5 seconds:

[source,shell]
----
sysctl -w vm.dirty_writeback_centisecs=500
sysctl -w vm.dirty_expire_centisecs=500
----

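To make these values survive a reboot, you can also write them to a sysctl configuration file; the file name below is an arbitrary example:

```shell
# 500 centiseconds = 5 seconds
printf 'vm.dirty_writeback_centisecs=500\nvm.dirty_expire_centisecs=500\n' \
  | tee /etc/sysctl.d/99-ignite-io.conf

# Load the file immediately (requires root)
sysctl -p /etc/sysctl.d/99-ignite-io.conf
```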
[NOTE]
====
[discrete]
Refer to the link:perf-and-troubleshooting/persistence-tuning[persistence tuning] section for disk-related optimizations.
Those optimizations can have a positive impact on GC.
====