<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Apache Accumulo™</title>
<description>The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system.
</description>
<link>https://accumulo.apache.org/</link>
<atom:link href="https://accumulo.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
<pubDate>Mon, 22 Apr 2024 18:30:52 +0000</pubDate>
<lastBuildDate>Mon, 22 Apr 2024 18:30:52 +0000</lastBuildDate>
<generator>Jekyll v4.3.2</generator>
<item>
<title>Does a compactor process return memory to the OS?</title>
<description>&lt;h2 id=&quot;goal&quot;&gt;Goal&lt;/h2&gt;
&lt;p&gt;The goal of the project was to determine if, once an Accumulo process is finished using memory, the JVM would release this unused memory back to the operating system. This was specifically observed in a Compactor process during the tests, but the findings should apply to any Accumulo Server process. We looked at the memory usage of the compactor process specifically to help understand if oversubscribing compactors on a machine is a viable option.&lt;/p&gt;
&lt;p&gt;As background, it’s important to note that modern JVMs are expected to release memory back to the operating system, rather than just growing from the initial heap size (-Xms) to the maximum heap size (-Xmx) and never releasing it. For G1, prompt release of unused committed memory was introduced in Java 12 through &lt;a href=&quot;https://openjdk.org/jeps/346&quot;&gt;JEP 346: Promptly Return Unused Committed Memory from G1&lt;/a&gt;. This feature improves the efficiency of memory usage by actively returning Java heap memory to the operating system when the application is idle.&lt;/p&gt;
&lt;h3 id=&quot;test-scenario&quot;&gt;Test Scenario&lt;/h3&gt;
&lt;p&gt;There could be a scenario where the amount of memory on a machine limits the number of compactors that can be run. For example, on a machine with 32GB of memory, if each compactor process uses 6GB of memory, we can only “fit” 5 compactors on that machine (32/6=5.333). Since each compactor process only runs on a single core, we would only be utilizing 5 cores on that machine when we would like to utilize as many as we can.&lt;/p&gt;
&lt;p&gt;If the compactor process does not return the memory to the OS, then we are stuck with only using the following number of compactor processes:
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(total memory)/(memory per compactor)&lt;/code&gt;.
If the compactor processes return the memory to the OS, i.e. do not stay at the maximum 6GB once they reach it, then we can oversubscribe the memory, allowing us to run more compactor processes on that machine.&lt;/p&gt;
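The break-even arithmetic above can be sketched as follows. The 32GB machine and 6GB-per-compactor figures are the example numbers from this section; the 2x oversubscription factor is an assumption purely for illustration:

```shell
# Illustrative arithmetic only: the machine and compactor sizes are the
# example figures from this section, not measured values.
TOTAL_MEM_GB=32
MEM_PER_COMPACTOR_GB=6

# Without memory release, each compactor permanently holds its maximum,
# so the fit is (total memory)/(memory per compactor), floored:
FIT=$((TOTAL_MEM_GB / MEM_PER_COMPACTOR_GB))
echo "Compactors without oversubscription: $FIT"   # prints 5

# If memory is returned to the OS, an oversubscription factor (assumed
# here to be 2x) lets more compactor processes share the same machine:
echo "Compactors with 2x oversubscription: $((FIT * 2))"   # prints 10
```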
&lt;p&gt;It should be noted that there is an inherent risk with oversubscribing that the user must be willing to accept if they choose to oversubscribe. In this case, there is the possibility that all compactors run at the same time, which might use all the memory on the machine. This could cause one or more of the compactor processes to be killed by the OOM killer.&lt;/p&gt;
&lt;h2 id=&quot;test-setup&quot;&gt;Test Setup&lt;/h2&gt;
&lt;h3 id=&quot;environment-prerequisites&quot;&gt;Environment Prerequisites&lt;/h3&gt;
&lt;p&gt;The machines used for testing were running Pop!_OS 22.04, a Debian-based OS. The following package installation and usage steps may vary if one were to try to repeat them.&lt;/p&gt;
&lt;h4 id=&quot;install-gnuplot&quot;&gt;Install gnuplot&lt;/h4&gt;
&lt;p&gt;This was used for plotting the memory usage of the compactor over time from the perspective of the OS.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sudo apt install gnuplot&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;gnuplot was started with the command &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gnuplot&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h4 id=&quot;install-visualvm&quot;&gt;Install VisualVM&lt;/h4&gt;
&lt;p&gt;This was used for plotting the memory usage of the compactor over time from the perspective of the JVM.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Downloaded the zip from &lt;a href=&quot;https://visualvm.github.io/&quot;&gt;visualvm.github.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Extracted with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unzip visualvm_218.zip&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;VisualVM was started with the command &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./path/to/visualvm_218/bin/visualvm&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h4 id=&quot;configure-and-start-accumulo&quot;&gt;Configure and start Accumulo&lt;/h4&gt;
&lt;p&gt;Accumulo 2.1 was used for experimentation. To stand up a single node instance, &lt;a href=&quot;https://github.com/apache/fluo-uno&quot;&gt;fluo-uno&lt;/a&gt; was used.&lt;/p&gt;
&lt;p&gt;Steps taken to configure Accumulo to start compactors:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Uncommented the lines in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fluo-uno/install/accumulo-2.1.2/conf/cluster.yaml&lt;/code&gt; for the compaction coordinator and the compactor queue q1, which allows the external compaction processes to start up. A single compactor process, q1, was used.&lt;/li&gt;
&lt;li&gt;Configured the Java args for the compactor process in “accumulo-env.sh.” Line:
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compactor) JAVA_OPTS=('-Xmx256m' '-Xms256m' &quot;${JAVA_OPTS[@]}&quot;) ;;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Started accumulo with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uno start accumulo&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h4 id=&quot;install-java-versions&quot;&gt;Install Java versions&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;Installed Java versions 11, 17, and 21. For example, Java 17 was installed with:
&lt;ol&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sudo apt install openjdk-17-jdk&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sudo update-alternatives --config java&lt;/code&gt; and select the intended version before starting the accumulo instance&lt;/li&gt;
&lt;li&gt;Ensured &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JAVA_HOME&lt;/code&gt; was set to the intended version of java before each test run&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
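Since each test run depends on which JDK is active, a quick check before starting Accumulo can confirm the right version is in place. This is a sketch: the `java_major` helper name is ours, and it assumes the post-Java-9 `version "17..."` output format rather than the old `1.8.0` style:

```shell
# Hypothetical helper (not from the original setup): extract the major
# version number from the first line of `java -version` output.
java_major() {
  # Example first line: openjdk version "17.0.9" 2023-10-17
  echo "$1" | sed -n 's/.*version "\([0-9]*\).*/\1/p'
}

sample='openjdk version "17.0.9" 2023-10-17'
java_major "$sample"   # prints 17
```

In practice the input would come from `java -version 2>&1 | head -n1`, and the result compared against the version intended for that run before `uno start accumulo`.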
&lt;h2 id=&quot;running-the-test&quot;&gt;Running the test&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Started accumulo using &lt;a href=&quot;https://github.com/apache/fluo-uno&quot;&gt;fluo-uno&lt;/a&gt; (after changing the mentioned configuration)
&lt;ul&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uno start accumulo&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Opened VisualVM and selected the running compactor q1 process, taking note of the PID&lt;/li&gt;
&lt;li&gt;Ran &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mem_usage_script.sh &amp;lt;compactor process PID&amp;gt;&lt;/code&gt;. This collected measurements of memory used by the compactor process over time from the perspective of the OS. We let this continue to run while the compaction script was running.&lt;/li&gt;
&lt;li&gt;Configured the external compaction script as needed and executed:
&lt;ul&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uno jshell experiment.jsh&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Memory usage was monitored from the perspective of the JVM (using VisualVM) and from the perspective of the OS (using our collection script).
Navigated to the “Monitor” tab of the compactor in VisualVM to see the graph of memory usage from JVM perspective.
Followed the info given in the &lt;a href=&quot;#os-memory-data-collection-script&quot;&gt;OS Memory Data Collection Script&lt;/a&gt; section to plot the memory usage from OS perspective.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Helpful resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://accumulo.apache.org/blog/2021/07/08/external-compactions.html&quot;&gt;External Compactions accumulo blog post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/en/java/javase/21/gctuning/z-garbage-collector.html#GUID-8637B158-4F35-4E2D-8E7B-9DAEF15BB3CD&quot;&gt;Z garbage collector heap size docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/en/java/javase/21/gctuning/garbage-collector-implementation.html#GUID-71D796B3-CBAB-4D80-B5C3-2620E45F6E5D&quot;&gt;Generational Garbage Collection docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/en/java/javase/21/gctuning/garbage-first-g1-garbage-collector1.html#GUID-ED3AB6D3-FD9B-4447-9EDF-983ED2F7A573&quot;&gt;G1 garbage collector docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://thomas.preissler.me/blog/2021/05/02/release-memory-back-to-the-os-with-java-11&quot;&gt;Java 11 and memory release article&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;external-compaction-test-script&quot;&gt;External compaction test script&lt;/h3&gt;
&lt;p&gt;Initiates an external compaction of 700MB of data (20 files of size 35MB) on Compactor q1.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;referred to as experiment.jsh in the test setup section&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;org.apache.accumulo.core.conf.Property&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataSize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;35_000_000&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataSize&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;];&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;Arrays&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;65&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;testTable&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ingestAndCompact&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;TableNotFoundException&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// ignore &lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Creating table &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// This is done to avoid system compactions, we want to initiate the compactions manually &lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Property&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;TABLE_MAJC_RATIO&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getKey&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;1000&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Configure for external compaction &lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;instanceOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;tserver.compaction.major.service.cs1.planner&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;instanceOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;tserver.compaction.major.service.cs1.planner.opts.executors&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;[{\&quot;name\&quot;:\&quot;large\&quot;,\&quot;type\&quot;:\&quot;external\&quot;,\&quot;queue\&quot;:\&quot;q1\&quot;}]&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;table.compaction.dispatcher&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;org.apache.accumulo.core.spi.compaction.SimpleCompactionDispatcher&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;table.compaction.dispatcher.opts.service&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;cs1&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numFiles&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;writer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;createBatchWriter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;numFiles&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;Mutation&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Mutation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;r&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mut&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;at&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;family&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;cf&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;qualifier&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;cq&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;writer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addMutation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mut&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;writer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;flush&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Writing &quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataSize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot; bytes to a single value&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;flush&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Compacting table&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;tableOperations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;compact&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tableName&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;CompactionConfig&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;setWait&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Finished table compaction&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ingestAndCompact&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Optionally sleep and ingestAndCompact() again, or just execute the script again.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;h3 id=&quot;os-memory-data-collection-script&quot;&gt;OS Memory Data Collection Script&lt;/h3&gt;
&lt;p&gt;Tracks the Resident Set Size (RSS) of the given PID over time, outputting the data to output_mem_usage.log.
Data is taken every 5 seconds for an hour or until stopped.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;referred to as mem_usage_script.sh in the test setup section&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/bash &lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;PID&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$1&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Tracking PID: &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PID&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;DURATION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;3600 &lt;span class=&quot;c&quot;&gt;# for 1 hour &lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;INTERVAL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;5 &lt;span class=&quot;c&quot;&gt;# every 5 seconds &lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; output_mem_usage.log &lt;span class=&quot;c&quot;&gt;# -f avoids an error on the first run &lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$DURATION&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-gt&lt;/span&gt; 0 &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do
&lt;/span&gt;ps &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; %mem,rss &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$PID&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;tail&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; +2 &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; output_mem_usage.log
&lt;span class=&quot;nb&quot;&gt;sleep&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$INTERVAL&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;DURATION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;$((&lt;/span&gt;DURATION &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; INTERVAL&lt;span class=&quot;k&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;After compactions have completed, plot the data using gnuplot:&lt;/p&gt;
&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gnuplot
&lt;span class=&quot;nb&quot;&gt;set &lt;/span&gt;title &lt;span class=&quot;s2&quot;&gt;&quot;Resident Set Size (RSS) Memory usage&quot;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;set &lt;/span&gt;xlabel &lt;span class=&quot;s2&quot;&gt;&quot;Time&quot;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;set &lt;/span&gt;ylabel &lt;span class=&quot;s2&quot;&gt;&quot;Mem usage in kilobytes&quot;&lt;/span&gt;
plot &lt;span class=&quot;s2&quot;&gt;&quot;output_mem_usage.log&quot;&lt;/span&gt; using &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$0&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;5&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;:2 with lines title &lt;span class=&quot;s1&quot;&gt;'Mem usage'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
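Beyond the plot, the collected samples can be summarized numerically. This is a sketch assuming the two-column `%mem rss` lines that mem_usage_script.sh appends to output_mem_usage.log:

```shell
# Summarize the RSS samples (column 2, in kilobytes) gathered by
# mem_usage_script.sh. Assumes the "%mem rss" two-column format.
awk '{
  rss = $2
  if (NR == 1 || rss < min) min = rss
  if (rss > max) max = rss
  sum += rss
}
END {
  if (NR > 0)
    printf "samples=%d min_kb=%d max_kb=%d avg_kb=%.0f\n", NR, min, max, sum / NR
}' output_mem_usage.log
```

The minimum is the useful number here: if RSS drops back toward the initial footprint between compactions, the process is returning memory to the OS.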
&lt;h2 id=&quot;data&quot;&gt;Data&lt;/h2&gt;
&lt;p&gt;Important Notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ZGC and G1PeriodicGCInterval are not available with Java 11, so they could not be tested there&lt;/li&gt;
&lt;li&gt;ZGenerational for ZGC is only available in Java 21, so it could not be tested in Java 17&lt;/li&gt;
&lt;li&gt;G1 GC is the default GC in Java 11, 17, and 21 (doesn’t need to be specified in java args)&lt;/li&gt;
&lt;/ul&gt;
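Each row in the table below was produced by editing the accumulo-env.sh compactor line shown in the setup section. As one illustration (a sketch; the exact line is our assumption, not a copy of the configuration used), the Java 21 ZGC + ZGenerational row would look something like:

```shell
# Sketch of the accumulo-env.sh compactor case for a Java 21 ZGC run,
# with the flag combination taken from the corresponding table row.
process="compactor"   # accumulo-env.sh switches on the process name
case "$process" in
  compactor) JAVA_OPTS=('-Xmx2g' '-Xms256m' '-XX:+UseZGC' '-XX:+ZGenerational' "${JAVA_OPTS[@]}") ;;
esac
echo "${JAVA_OPTS[@]}"
```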
&lt;p&gt;All Experiments Performed:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Java Version&lt;/th&gt;
&lt;th&gt;Manual Compaction&lt;/th&gt;
&lt;th&gt;Xmx=1G&lt;/th&gt;
&lt;th&gt;Xmx=2G&lt;/th&gt;
&lt;th&gt;Xms=256m&lt;/th&gt;
&lt;th&gt;XX:G1PeriodicGCInterval=60000&lt;/th&gt;
&lt;th&gt;XX:-G1PeriodicGCInvokesConcurrent&lt;/th&gt;
&lt;th&gt;XX:+UseShenandoahGC&lt;/th&gt;
&lt;th&gt;XX:+UseZGC&lt;/th&gt;
&lt;th&gt;XX:ZUncommitDelay=120&lt;/th&gt;
&lt;th&gt;XX:+ZGenerational&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt;🗸&lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;td&gt; &lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&quot;java-11-g1-gc-with-manual-gc-via-visualvm-every-minute-java-args--xmx1g--xms256m&quot;&gt;Java 11 G1 GC with manual GC (via VisualVM) every minute. Java args: -Xmx1G -Xms256m&lt;/h3&gt;
&lt;!-- creates a styled box with two images side by side --&gt;
&lt;!-- accepts two URLs relative to the project root and two alt text strings --&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_OS_manualeverymin.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_OS_manualeverymin.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_VM_manualeverymin.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_VM_manualeverymin.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-11-g1-gc-with-manual-gc-via-visualvm-after-each-compaction-java-args--xmx1g--xms256m&quot;&gt;Java 11 G1 GC with manual GC (via VisualVM) after each compaction. Java args: -Xmx1G -Xms256m&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_OS_manualaftercomp.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_OS_manualaftercomp.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_VM_manualaftercomp.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_G1_x1_s256_VM_manualaftercomp.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-11-g1-gc-java-args--xmx2g--xms256&quot;&gt;Java 11 G1 GC. Java args: -Xmx2G -Xms256m&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_G1_x2_s256_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_G1_x2_s256_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_G1_x2_s256_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_G1_x2_s256_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-11-shenandoah-gc-java-args--xmx2g--xms256--xxuseshenandoahgc&quot;&gt;Java 11 Shenandoah GC. Java args: -Xmx2G -Xms256m -XX:+UseShenandoahGC&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_UseShenandoah_x2_s256_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_UseShenandoah_x2_s256_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_11_UseShenandoah_x2_s256_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_11_UseShenandoah_x2_s256_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-17-g1-gc-java-args--xmx1g--xms256m--xxg1periodicgcinterval60000&quot;&gt;Java 17 G1 GC. Java args: -Xmx1G -Xms256m -XX:G1PeriodicGCInterval=60000&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-17-g1-gc-java-args--xmx2g--xms256m--xxg1periodicgcinterval60000&quot;&gt;Java 17 G1 GC. Java args: -Xmx2G -Xms256m -XX:G1PeriodicGCInterval=60000&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x2_s256_periodic60000_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x2_s256_periodic60000_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x2_s256_periodic60000_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x2_s256_periodic60000_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-17-g1-gc-java-args--xmx1g--xms256m--xxg1periodicgcinterval60000--xx-g1periodicgcinvokesconcurrent&quot;&gt;Java 17 G1 GC. Java args: -Xmx1G -Xms256m -XX:G1PeriodicGCInterval=60000 -XX:-G1PeriodicGCInvokesConcurrent&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_concurrent_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_concurrent_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_concurrent_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_concurrent_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-17-zgc-java-args--xmx2g--xms256m--xxusezgc--xxzuncommitdelay120&quot;&gt;Java 17 ZGC. Java args: -Xmx2G -Xms256m -XX:+UseZGC -XX:ZUncommitDelay=120&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_ZGC_x2_s256_UseZGC_uncommit_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_ZGC_x2_s256_UseZGC_uncommit_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_ZGC_x2_s256_UseZGC_uncommit_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_ZGC_x2_s256_UseZGC_uncommit_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-17-shenandoah-gc-java-args--xmx1g--xms256m--xxuseshenandoahgc&quot;&gt;Java 17 Shenandoah GC. Java args: -Xmx1G -Xms256m -XX:+UseShenandoahGC&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x1_s256_UseShenandoah_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x1_s256_UseShenandoah_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x1_s256_UseShenandoah_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x1_s256_UseShenandoah_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-17-shenandoah-gc-java-args--xmx2g--xms256m--xxuseshenandoahgc&quot;&gt;Java 17 Shenandoah GC. Java args: -Xmx2G -Xms256m -XX:+UseShenandoahGC&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x2_s256_UseShenandoah_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x2_s256_UseShenandoah_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x2_s256_UseShenandoah_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_shenandoah_x2_s256_UseShenandoah_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-21-g1-gc-java-args--xmx2g--xms256m--xxg1periodicgcinterval60000&quot;&gt;Java 21 G1 GC. Java args: -Xmx2G -Xms256m -XX:G1PeriodicGCInterval=60000&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_G1_x2_s256_periodic60000_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_G1_x2_s256_periodic60000_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_G1_x2_s256_periodic60000_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_G1_x2_s256_periodic60000_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-21-zgc-java-args--xmx2g--xms256m--xxusezgc--xxzgenerational--xxzuncommitdelay120&quot;&gt;Java 21 ZGC. Java args: -Xmx2G -Xms256m -XX:+UseZGC -XX:+ZGenerational -XX:ZUncommitDelay=120&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_generational_uncommit_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_generational_uncommit_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_generational_uncommit_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_generational_uncommit_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-21-zgc-java-args--xmx2g--xms256m--xxusezgc--xxzuncommitdelay120&quot;&gt;Java 21 ZGC. Java args: -Xmx2G -Xms256m -XX:+UseZGC -XX:ZUncommitDelay=120&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_uncommit_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_uncommit_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_uncommit_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_uncommit_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-21-shenandoah-gc-java-args--xmx1g--xms256m--xxuseshenandoahgc&quot;&gt;Java 21 Shenandoah GC. Java args: -Xmx1G -Xms256m -XX:+UseShenandoahGC&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x1_s256_UseShenandoah_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x1_s256_UseShenandoah_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x1_s256_UseShenandoah_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x1_s256_UseShenandoah_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h3 id=&quot;java-21-shenandoah-gc-java-args--xmx2g--xms256m--xxuseshenandoahgc&quot;&gt;Java 21 Shenandoah GC. Java args: -Xmx2G -Xms256m -XX:+UseShenandoahGC&lt;/h3&gt;
&lt;div class=&quot;p-3 border rounded d-flex&quot;&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x2_s256_UseShenandoah_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x2_s256_UseShenandoah_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;
&lt;a href=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x2_s256_UseShenandoah_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_shenandoah_x2_s256_UseShenandoah_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;
&lt;/div&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;All the garbage collectors tested (G1 GC, Shenandoah GC, and ZGC) on all the Java versions tested (11, 17, 21) will release memory that a compactor no longer uses back to the OS*. Regardless of which GC is used, after an external compaction completes, most (but usually not all) memory is eventually released back to the OS, and all memory is released back to the JVM. Although a comparable amount of memory is returned to the OS in each case, the time it takes for the memory to be returned and the amount of memory used during a compaction depend on which garbage collector is used and which parameters are set for the Java process.&lt;/p&gt;
&lt;p&gt;The amount that is never released back to the OS appears to be minimal and may only be present with G1 GC and Shenandoah GC. In the following graph with Java 17 using G1 GC, the baseline OS memory usage before any compactions are done is a bit less than 400MB; after a compaction completes and garbage collection runs, this baseline settles at about 500MB.&lt;/p&gt;
&lt;p&gt;&lt;a class=&quot;p-3 border rounded d-block&quot; href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;On the same test run, the JVM perspective (pictured in the graph below) shows that all memory is returned (memory usage drops back down to Xms=256m after garbage collection occurs).&lt;/p&gt;
&lt;p&gt;&lt;a class=&quot;p-3 border rounded d-block&quot; href=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_VM.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_17_G1_x1_s256_periodic60000_VM.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the JVM perspective&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The roughly 100MB of unreturned memory is also present with Shenandoah GC on Java 17 and Java 21, but does not appear to be present with Java 11. With ZGC, however, several runs returned nearly all the memory used during a compaction to the OS (the graph below is from a run using ZGC with Java 21). These findings regarding unreturned memory may not be significant and may simply reflect run-to-run variance; more testing would be needed to confirm them.&lt;/p&gt;
&lt;p&gt;&lt;a class=&quot;p-3 border rounded d-block&quot; href=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_generational_uncommit_OS.png&quot;&gt;
&lt;img src=&quot;/images/blog/202404_compactor_memory/java_21_ZGC_x2_s256_UseZGC_generational_uncommit_OS.png&quot; class=&quot;img-fluid rounded&quot; alt=&quot;Graph showing memory usage from the OS perspective&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Another interesting finding was that the processes use more memory when more is allocated. These results were obtained by initiating a compaction of 700MB of data (see the experiment.jsh script). For example, setting a 2GB rather than 1GB max heap for the compactor process results in higher peak memory usage: when allocated only 1GB of heap space, a compaction does not completely utilize the max heap, but when allocated 2GB, compactions exceed 1GB of heap space used. G1 GC and ZGC appear to use the least heap space during a compaction (maxing out around 1.5GB, or around 1.7GB when using ZGC with ZGenerational in Java 21). Shenandoah GC appears to use the most heap space during a compaction, maxing out around 1.9GB (for Java 11, 17, and 21). However, these differences might be due to outside factors that varied between runs, and more testing would be needed to confirm them.&lt;/p&gt;
&lt;p&gt;Another difference between the GCs tested was that Shenandoah GC sometimes required two garbage collection cycles after a compaction completed to clean up the memory. In our experiments, when the larger max heap size was allocated (2GB vs 1GB), the first garbage collection cleaned up only about half of the now-unused memory, and a second collection had to occur for the rest to be cleaned up. This was not the case with a 1GB max heap: almost all of the unused memory was cleaned up on the first garbage collection, with the rest cleaned up on the next. G1 GC and ZGC always cleaned up the majority of the memory on the first garbage collection.&lt;/p&gt;
&lt;p&gt;*Note: When using the default GC (G1 GC), garbage collection does not automatically occur while the process is idle unless further garbage collection settings are specified (e.g., G1PeriodicGCInterval).&lt;/p&gt;
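&lt;p&gt;For reference, the flag combinations exercised above can be collected into a launch wrapper. This is only a sketch: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JAVA_OPTS&lt;/code&gt; variable name is illustrative, while the flags themselves are exactly those listed in the test headings.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# G1 GC: periodic idle collections are needed for G1 to return memory unprompted
JAVA_OPTS=&quot;-Xmx2G -Xms256m -XX:G1PeriodicGCInterval=60000&quot;

# ZGC: uncommit unused memory after 120 seconds
JAVA_OPTS=&quot;-Xmx2G -Xms256m -XX:+UseZGC -XX:ZUncommitDelay=120&quot;

# Shenandoah: uncommits idle heap by default, no extra flags required
JAVA_OPTS=&quot;-Xmx2G -Xms256m -XX:+UseShenandoahGC&quot;
&lt;/code&gt;&lt;/pre&gt;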
</description>
<pubDate>Tue, 09 Apr 2024 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/blog/2024/04/09/does-a-compactor-return-memory-to-OS.html</link>
<guid isPermaLink="true">https://accumulo.apache.org/blog/2024/04/09/does-a-compactor-return-memory-to-OS.html</guid>
<category>blog</category>
</item>
<item>
<title>Apache Accumulo 1.10.4</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 1.10.4 is the final bug fix release of the 1.10 LTM release
line. As of this release, the 1.10 release line is considered end-of-life.
This means that fixes for bugs found in this version will not be applied and
released as a new 1.10 patch version; instead, they will be applied and
released to the currently active release lines, if they apply to those
versions.&lt;/p&gt;
&lt;p&gt;These release notes are highlights of the changes since 1.10.3. The full
detailed changes can be seen in the git history. If anything important is
missing from this list, please &lt;a href=&quot;/contact-us&quot;&gt;contact&lt;/a&gt; us to have it included.&lt;/p&gt;
&lt;p&gt;Users of any 1.10 version are encouraged to upgrade to the next LTM release,
which is 2.1 at the time of this writing. This patch release is provided as a
final release, with all the patches the developers have made to 1.10, for
anybody who must remain on 1.10 but wants to upgrade from an earlier 1.x
version.&lt;/p&gt;
&lt;h2 id=&quot;known-issues&quot;&gt;Known Issues&lt;/h2&gt;
&lt;p&gt;Apache Commons VFS was upgraded in &lt;a href=&quot;https://github.com/apache/accumulo/issues/1295&quot;&gt;#1295&lt;/a&gt; for 1.10.0 and some users have reported
issues similar to &lt;a href=&quot;https://issues.apache.org/jira/projects/VFS/issues/VFS-683&quot;&gt;VFS-683&lt;/a&gt;. Possible solutions are discussed in &lt;a href=&quot;https://github.com/apache/accumulo/issues/2775&quot;&gt;#2775&lt;/a&gt;.
This issue is applicable to all 1.10 versions.&lt;/p&gt;
&lt;h2 id=&quot;major-improvements&quot;&gt;Major Improvements&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3391&quot;&gt;#3391&lt;/a&gt; Drop support for the MapFile file format as an alternative to
RFile; the use of MapFiles was already broken, and had been for a long time,
so this change was made to cause an explicit and detectable failure rather
than allow a silent one to occur if a MapFile was used.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3703&quot;&gt;#3703&lt;/a&gt; Add verification checks to improve the reliability of the
accumulo-gc, ensuring that a full row for a tablet was seen when a file
deletion candidate is checked&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;other-improvements&quot;&gt;Other Improvements&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3300&quot;&gt;#3300&lt;/a&gt; Fix the documentation about iterator teardown in the user manual&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3343&quot;&gt;#3343&lt;/a&gt; Fix errors in the javadoc for Range&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;note-about-jdk-15&quot;&gt;Note About JDK 15&lt;/h2&gt;
&lt;p&gt;See the note in the 1.10.1 release notes about the use of JDK 15 or later, as
the information pertaining to the use of the CMS garbage collector remains
applicable to all 1.10 releases.&lt;/p&gt;
&lt;h2 id=&quot;useful-links&quot;&gt;Useful Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/compare/rel/1.10.3...apache:rel/1.10.4&quot;&gt;All Changes since 1.10.3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues?q=%20project%3Aapache%2Faccumulo%2F27&quot;&gt;GitHub&lt;/a&gt; - List of issues tracked on GitHub corresponding to this release&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Thu, 16 Nov 2023 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-1.10.4/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.10.4/</guid>
<category>release</category>
</item>
<item>
<title>Apache Accumulo 3.0.0</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 3.0.0 is a non-LTM major version release. While it
primarily contains the 2.1 codebase, including all patches through
2.1.2, it has also removed a substantial number of deprecated features
and code, in an attempt to clean up several years of accrued technical
debt, and lower the maintenance burden to make way for future
improvements. It also contains a few other minor improvements.&lt;/p&gt;
&lt;h2 id=&quot;notable-removals&quot;&gt;Notable Removals&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1328&quot;&gt;#1328&lt;/a&gt; The FileSystem monitor has been removed and will no
longer watch for problems with local file systems and self-terminate.
System administrators are encouraged to use whatever system health
monitoring is appropriate for their deployments, rather than depend on
Accumulo to monitor these.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2443&quot;&gt;#2443&lt;/a&gt; The MapReduce APIs embedded in the accumulo-core module
were removed. The separate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accumulo-hadoop-mapreduce&lt;/code&gt; jar is their
replacement.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3073&quot;&gt;#3073&lt;/a&gt; The legacy Connector and Instance client classes were removed.
The AccumuloClient is their replacement.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3080&quot;&gt;#3080&lt;/a&gt; The cross-data center replication feature was removed without
replacement due to being unmaintained, having numerous outstanding unfixed
issues with no volunteer to maintain it since its deprecation, and its
substantial code complexity. The built-in replication table it used for
tracking replication metadata will be removed on upgrade.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3114&quot;&gt;#3114&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3115&quot;&gt;#3115&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3116&quot;&gt;#3116&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3117&quot;&gt;#3117&lt;/a&gt; Removed
deprecated VolumeChooser, TabletBalancer, Constraint, and other APIs, in
favor of their SPI replacements.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3106&quot;&gt;#3106&lt;/a&gt; Remove deprecated configuration properties (see 2.1 property
documentation for which ones were deprecated)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3112&quot;&gt;#3112&lt;/a&gt; Remove CompactionStrategy class in favor of CompactionSelector
and CompactionConfigurer.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3160&quot;&gt;#3160&lt;/a&gt; Remove upgrade code for versions prior to 2.1 (the minimum
version to upgrade from is now 2.1).&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3192&quot;&gt;#3192&lt;/a&gt; Arguments to server processes (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-a&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-g&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-q&lt;/code&gt;, etc.) were removed in favor of configuration properties that can be
specified in the Accumulo configuration files or supplied on a per-process
basis using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-o&lt;/code&gt; argument. The provided cluster management reference
scripts were updated in &lt;a href=&quot;https://github.com/apache/accumulo/issues/3197&quot;&gt;#3197&lt;/a&gt; to use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-o&lt;/code&gt; method.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3136&quot;&gt;#3136&lt;/a&gt; Remove the built-in VFS classloader support. To use a custom
classloader, users must now set the ContextClassLoaderFactory implementation
in the properties. The default is now the URLContextClassLoaderFactory.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3318&quot;&gt;#3318&lt;/a&gt; Remove the old bulk import implementation, replaced by the new
bulk import API added &lt;a href=&quot;https://accumulo.apache.org/release/accumulo-2.0.0/#new-bulk-import-api&quot;&gt;in 2.0.0&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3265&quot;&gt;#3265&lt;/a&gt; Remove scan interpreter and scan formatter from the shell&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3361&quot;&gt;#3361&lt;/a&gt; Remove all remaining references to the old “master” service
(renamed to “manager”).&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3360&quot;&gt;#3360&lt;/a&gt; Remove checks and code related to the old password hashing
mechanism in Accumulo. This discontinues warnings about users’ passwords
that are still out of date; instead, those outdated passwords will simply
become invalid. If a user authenticated to Accumulo at any time prior to
upgrading, their password will have been converted, so this only affects
accounts that were never used with 2.1 at all. As mitigation, such users can
have their password reset by the root user. If the root user never
authenticated (and neither had another admin user) while on 2.1 (very
unlikely), an administrator can reset the entire user database through the
normal init step to reset security.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3378&quot;&gt;#3378&lt;/a&gt; Remove broken support for old map files. (RFiles have been in
use for a long time, so this should not impact any users; if users had been
trying to use map files, they would have found that they were broken anyway)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;notable-additions&quot;&gt;Notable Additions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3088&quot;&gt;#3088&lt;/a&gt; New methods were added to compaction-related APIs to share
information about the tablet currently being compacted with user code&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3107&quot;&gt;#3107&lt;/a&gt; Decompose internal Thrift services by function to make RPC
functionality more modular across server instances&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3189&quot;&gt;#3189&lt;/a&gt; Standardized server lock data structure in ZooKeeper&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3206&quot;&gt;#3206&lt;/a&gt; Internal caches now use Caffeine instead of Guava’s Cache&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3161&quot;&gt;#3161&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3288&quot;&gt;#3288&lt;/a&gt; The internal service (renamed from
GarbageCollectionLogger to LowMemoryDetector) that was previously used only
to report low memory in servers was made configurable to allow pausing
certain operations, such as scans, minor compactions, or major compactions,
when memory is low. See the server properties for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.low.mem.*&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
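&lt;p&gt;As an illustrative sketch only (the exact property names beyond the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.low.mem.*&lt;/code&gt; prefix are assumptions here and should be confirmed against the server properties documentation), pausing scans and compactions under memory pressure might look like:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;general.low.mem.detector.interval=5s
general.low.mem.detector.threshold=0.05
general.low.mem.protection.scan.enabled=true
general.low.mem.protection.compaction.enabled=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;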
&lt;h2 id=&quot;upgrading&quot;&gt;Upgrading&lt;/h2&gt;
&lt;p&gt;View the &lt;a href=&quot;/docs/2.x/administration/upgrading&quot;&gt;Upgrading Accumulo documentation&lt;/a&gt; for guidance.&lt;/p&gt;
&lt;h2 id=&quot;300-github-project&quot;&gt;3.0.0 GitHub Project&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/projects/11&quot;&gt;All tickets related to 3.0.0.&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Mon, 21 Aug 2023 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-3.0.0/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-3.0.0/</guid>
<category>release</category>
</item>
<item>
<title>Apache Accumulo 2.1.2</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 2.1.2 is a patch release of the 2.1 LTM line. It contains bug
fixes and minor enhancements. This version supersedes 2.1.1. Users upgrading to
2.1 should upgrade directly to this version instead of 2.1.1.&lt;/p&gt;
&lt;p&gt;Included here are some highlights of the most interesting bugs fixed and
features added in 2.1.2. For the full set of changes, please see the commit
history or issue tracker.&lt;/p&gt;
&lt;h3 id=&quot;notable-improvements&quot;&gt;Notable Improvements&lt;/h3&gt;
&lt;p&gt;Improvements that affect performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3499&quot;&gt;#3499&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3543&quot;&gt;#3543&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3549&quot;&gt;#3549&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3500&quot;&gt;#3500&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3509&quot;&gt;#3509&lt;/a&gt;
Made some optimizations around the processing of file references in the
accumulo-gc code, including optimizing a constructor in a class called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TabletFile&lt;/code&gt; used to track file references.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3541&quot;&gt;#3541&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3542&quot;&gt;#3542&lt;/a&gt; Added a new property,
&lt;a href=&quot;/docs/2.x/configuration/server-properties#manager_tablet_watcher_interval&quot;&gt;manager.tablet.watcher.interval&lt;/a&gt;, to make configurable the time to wait
between scans of the metadata table for outstanding tablet actions (such as
assigning tablets).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Improvements that help with administration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3678&quot;&gt;#3678&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3683&quot;&gt;#3683&lt;/a&gt; Added extra validation of property
&lt;a href=&quot;/docs/2.x/configuration/server-properties#table_class_loader_context&quot;&gt;table.class.loader.context&lt;/a&gt; at the time it is set, to prevent
invalid contexts from being set on a table.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3548&quot;&gt;#3548&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3561&quot;&gt;#3561&lt;/a&gt; Added a banner to the manager page in the
Monitor that displays the manager state and goal state when they are not
normal.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3383&quot;&gt;#3383&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3680&quot;&gt;#3680&lt;/a&gt; Prompt the user for confirmation when they
attempt to set a deprecated property in the shell, encouraging use of the
non-deprecated replacement property.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3233&quot;&gt;#3233&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3562&quot;&gt;#3562&lt;/a&gt; Added a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--exclude-parent&lt;/code&gt; option to allow
creating a table or namespace in the shell initialized with only the
properties set directly on another table or namespace, excluding those the
other table or namespace inherited from its parent.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3600&quot;&gt;#3600&lt;/a&gt; Normalized metric labels and structure.&lt;/li&gt;
&lt;/ul&gt;
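&lt;p&gt;As a hypothetical shell session (the placement of the new option alongside the existing copy-configuration flag is an assumption; verify against the shell help), the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--exclude-parent&lt;/code&gt; option described above might be used like:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;root@uno&amp;gt; createtable newtable -cc sourcetable --exclude-parent
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;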
&lt;h3 id=&quot;notable-bug-fixes&quot;&gt;Notable Bug Fixes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3488&quot;&gt;#3488&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3612&quot;&gt;#3612&lt;/a&gt; Fixed sorting of some columns on the monitor&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3674&quot;&gt;#3674&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3677&quot;&gt;#3677&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3685&quot;&gt;#3685&lt;/a&gt; Prevent an invalid table
context and other errors from killing the minor compaction thread, which could
prevent a tablet from being closed and the server from shutting down normally.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3630&quot;&gt;#3630&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3631&quot;&gt;#3631&lt;/a&gt; Fix a bug where BatchWriter latency and
timeout values were converted to the wrong time unit.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3617&quot;&gt;#3617&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3622&quot;&gt;#3622&lt;/a&gt; Close LocalityGroupReader when IOException is
thrown to release reference to a possibly corrupted stream in a cached block
file.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3570&quot;&gt;#3570&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3571&quot;&gt;#3571&lt;/a&gt; Fixed the TabletGroupWatcher shutdown order.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3569&quot;&gt;#3569&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3579&quot;&gt;#3579&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3644&quot;&gt;#3644&lt;/a&gt; Changes to ensure that scan
sessions are cleaned up.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3553&quot;&gt;#3553&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3555&quot;&gt;#3555&lt;/a&gt; Fixed a bug where a failed user compaction
would hang instead of retrying.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;other-notable-changes&quot;&gt;Other Notable Changes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3550&quot;&gt;#3550&lt;/a&gt; The contents of the contrib directory have been moved to more
appropriate locations for build-related resources&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;upgrading&quot;&gt;Upgrading&lt;/h2&gt;
&lt;p&gt;View the &lt;a href=&quot;/docs/2.x/administration/upgrading&quot;&gt;Upgrading Accumulo documentation&lt;/a&gt; for guidance.&lt;/p&gt;
&lt;h2 id=&quot;212-github-project&quot;&gt;2.1.2 GitHub Project&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/projects/29&quot;&gt;All tickets related to 2.1.2.&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Mon, 21 Aug 2023 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-2.1.2/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-2.1.2/</guid>
<category>release</category>
</item>
<item>
<title>Apache Accumulo 2.1.1</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 2.1.1 is a patch release of the 2.1 LTM line. It contains
many bug fixes and minor enhancements, including a critical fix. This version
supersedes 2.1.0. Users upgrading to 2.1 should upgrade directly to this
version instead of 2.1.0.&lt;/p&gt;
&lt;p&gt;Included here are some highlights of the most interesting bugs fixed and
features added in 2.1.1. Several trivial bugs were also fixed that related to the
presentation of information on the monitor, or to avoid spammy/excessive
logging, but are too numerous to list here. For the full set of bug fixes,
please see the commit history or issue tracker.&lt;/p&gt;
&lt;p&gt;NOTE: This 2.1 release also includes any applicable bug fixes and improvements
that occurred in 1.10.3 and earlier.&lt;/p&gt;
&lt;h3 id=&quot;critical-fixes&quot;&gt;Critical Fixes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.cve.org/CVERecord?id=CVE-2023-34340&quot;&gt;CVE-2023-34340&lt;/a&gt; Fixed a critical issue that improperly allowed a user under
some conditions to authenticate to Accumulo using an invalid password.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;notable-improvements&quot;&gt;Notable Improvements&lt;/h3&gt;
&lt;p&gt;Improvements that add capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3180&quot;&gt;#3180&lt;/a&gt; Enable users to provide per-volume Hadoop Filesystem
configuration overrides via the Accumulo configuration. Hadoop Filesystem
objects are configured by the standard Hadoop mechanisms (default
configuration, core-site.xml, hdfs-site.xml, etc.), but these configuration
files don’t allow for the same property to be specified with different values
for different namespaces. This change allows users to specify different
property values for different Accumulo volumes, which will be applied to the
Hadoop Filesystem object created for each Accumulo volume&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1169&quot;&gt;#1169&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3142&quot;&gt;#3142&lt;/a&gt; Add configuration option for users to select
how the last location field is used, so users have better control over
initial assignments on restarts&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3400&quot;&gt;#3400&lt;/a&gt; Inject the environment into the ContextClassLoaderFactory SPI
so implementations can read and make use of Accumulo’s own configuration&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Improvements that affect performance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3175&quot;&gt;#3175&lt;/a&gt; Reset the number of locks in SynchronousLoadingBlockCache from
2017 back to 5003, the value it was in 1.10. &lt;a href=&quot;https://github.com/apache/accumulo/issues/3226&quot;&gt;#3226&lt;/a&gt; Also
modified the lock to be fair, which allows the different scan threads in the
server to make progress more equitably when they need to load a block into
the cache&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3077&quot;&gt;#3077&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3079&quot;&gt;#3079&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3083&quot;&gt;#3083&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3123&quot;&gt;#3123&lt;/a&gt; Avoid filling
OS page cache by calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setDropBehind&lt;/code&gt; on the FS data stream when
performing likely one-time file accesses, as with WAL and compaction input
and output files. This should allow files that might benefit more from
caching to stay in the cache longer. &lt;a href=&quot;https://github.com/apache/accumulo/issues/3083&quot;&gt;#3083&lt;/a&gt; and &lt;a href=&quot;https://github.com/apache/accumulo/issues/3123&quot;&gt;#3123&lt;/a&gt;
introduce new properties, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table.compaction.major.output.drop.cache&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table.compaction.minor.output.drop.cache&lt;/code&gt;, for dropping pages from the OS page
cache for compaction output files. These changes will only have an impact on
HDFS FileSystem implementations and operating systems that support the
underlying OS system call. See the associated issue, &lt;a href=&quot;https://issues.apache.org/jira/browse/HDFS-16864&quot;&gt;HDFS-16864&lt;/a&gt;, which will
improve the underlying implementation when resolved.&lt;/li&gt;
&lt;/ul&gt;
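&lt;p&gt;As a sketch of how the drop-cache properties above could be enabled for a hypothetical table named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mytable&lt;/code&gt; (check the property reference for defaults before relying on this):&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;root@uno&amp;gt; config -t mytable -s table.compaction.major.output.drop.cache=true
root@uno&amp;gt; config -t mytable -s table.compaction.minor.output.drop.cache=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;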
&lt;p&gt;Improvements that help with administration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3445&quot;&gt;#3445&lt;/a&gt; Add emergency maintenance utility to edit properties in
ZooKeeper while the Accumulo cluster is shut down&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3118&quot;&gt;#3118&lt;/a&gt; Added option to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;admin zoo-info-viewer&lt;/code&gt; command to dump
the ACLs on ZooKeeper nodes. This information can be used to fix znodes with
incorrect ACLs during the upgrade process&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Other notable changes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3126&quot;&gt;#3126&lt;/a&gt; Remove unintentionally bundled htrace4 from our packaging;
users will need to provide that for themselves if they require it on their
classpath&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3436&quot;&gt;#3436&lt;/a&gt; Deprecate the gc.trash.ignore property. Hadoop itself can be
configured to ignore the trash, either entirely or for only specific files
(and this has been tested with recent versions of Hadoop). In version 3.0,
this property will be removed, and it will no longer be possible to ignore
the trash by changing this property&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;notable-bug-fixes&quot;&gt;Notable Bug Fixes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3134&quot;&gt;#3134&lt;/a&gt; Fixed Thrift issues due to incorrect setting of maxMessageSize&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3144&quot;&gt;#3144&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3150&quot;&gt;#3150&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3164&quot;&gt;#3164&lt;/a&gt; Fixed bugs in ScanServer that
prevented a tablet from being scanned when some transient failures occurred&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3346&quot;&gt;#3346&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3366&quot;&gt;#3366&lt;/a&gt; Fixed tablet metadata verification task so it
doesn’t unintentionally cause the server to halt&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3479&quot;&gt;#3479&lt;/a&gt; Fixed issue preventing servers from shutting down because they
were still receiving assignments&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3492&quot;&gt;#3492&lt;/a&gt; Fixed a bug where bulk imports could cause compactions to hang&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;upgrading&quot;&gt;Upgrading&lt;/h2&gt;
&lt;p&gt;View the &lt;a href=&quot;/docs/2.x/administration/upgrading&quot;&gt;Upgrading Accumulo documentation&lt;/a&gt; for guidance.&lt;/p&gt;
&lt;h2 id=&quot;211-github-project&quot;&gt;2.1.1 GitHub Project&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/projects/25&quot;&gt;All tickets related to 2.1.1.&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Mon, 19 Jun 2023 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-2.1.1/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-2.1.1/</guid>
<category>release</category>
</item>
<item>
<title>Apache Accumulo 1.10.3</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 1.10.3 is a bug fix release of the 1.10 LTM release line.&lt;/p&gt;
&lt;p&gt;These release notes are highlights of the changes since 1.10.2. The full
detailed changes can be seen in the git history. If anything important is
missing from this list, please &lt;a href=&quot;/contact-us&quot;&gt;contact&lt;/a&gt; us to have it included.&lt;/p&gt;
&lt;p&gt;Users of 1.10.2 or earlier are encouraged to upgrade to 1.10.3, as this is a
continuation of the 1.10 LTM release line with bug fixes and improvements, and
it supersedes any prior 1.x version. Users are also encouraged to consider
migrating to a 2.x version when one that is suitable for their needs becomes
available.&lt;/p&gt;
&lt;h2 id=&quot;known-issues&quot;&gt;Known Issues&lt;/h2&gt;
&lt;p&gt;Apache Commons VFS was upgraded in &lt;a href=&quot;https://github.com/apache/accumulo/issues/1295&quot;&gt;#1295&lt;/a&gt; for 1.10.0 and some users have reported
issues similar to &lt;a href=&quot;https://issues.apache.org/jira/projects/VFS/issues/VFS-683&quot;&gt;VFS-683&lt;/a&gt;. Possible solutions are discussed in &lt;a href=&quot;https://github.com/apache/accumulo/issues/2775&quot;&gt;#2775&lt;/a&gt;.
This issue is applicable to all 1.10 versions.&lt;/p&gt;
&lt;h2 id=&quot;major-improvements&quot;&gt;Major Improvements&lt;/h2&gt;
&lt;p&gt;None&lt;/p&gt;
&lt;h3 id=&quot;other-improvements&quot;&gt;Other Improvements&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2708&quot;&gt;#2708&lt;/a&gt; Disabled merging minor-compactions by default&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3226&quot;&gt;#3226&lt;/a&gt; Change scan thread resource management to use a “fair”
semaphore to avoid resource starvation in some situations&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3221&quot;&gt;#3221&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3249&quot;&gt;#3249&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3261&quot;&gt;#3261&lt;/a&gt; Improve performance by
refining split point calculations&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3276&quot;&gt;#3276&lt;/a&gt; Improve performance by optimizing internal data structures in
frequently used Authorizations object&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;other-bug-fixes&quot;&gt;Other Bug Fixes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3069&quot;&gt;#3069&lt;/a&gt; Fix a minor bug with VFS on newer Java versions due to
MIME-type changes&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3176&quot;&gt;#3176&lt;/a&gt; Fixed bug in client scanner code that was not using the
correct timeout variable in some places&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3168&quot;&gt;#3168&lt;/a&gt; Fixed bug in TabletLocator that could cause the BatchScanner
to return duplicate data&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3231&quot;&gt;#3231&lt;/a&gt;, &lt;a href=&quot;https://github.com/apache/accumulo/issues/3235&quot;&gt;#3235&lt;/a&gt; Fix wait timeout logic when waiting for
minimum number of available tservers during startup&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;note-about-jdk-15&quot;&gt;Note About JDK 15&lt;/h2&gt;
&lt;p&gt;See the note in the 1.10.1 release notes about the use of JDK 15 or later, as
the information pertaining to the use of the CMS garbage collector remains
applicable to all 1.10 releases.&lt;/p&gt;
&lt;h2 id=&quot;useful-links&quot;&gt;Useful Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://lists.apache.org/thread/zl8xoogzqnbcw75vcmvqmwlrf8djfcb5&quot;&gt;Release VOTE email thread&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/compare/rel/1.10.2...apache:rel/1.10.3&quot;&gt;All Changes since 1.10.2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues?q=%20project%3Aapache%2Faccumulo%2F23&quot;&gt;GitHub&lt;/a&gt; - List of issues tracked on GitHub corresponding to this release&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Thu, 13 Apr 2023 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-1.10.3/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.10.3/</guid>
<category>release</category>
</item>
<item>
<title>Apache Accumulo 2.1.0</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 2.1.0 brings many new features and updates since 1.10 and 2.0. The 2.1 release
series is an LTM series, and as such, is expected to receive stability-improving bugfixes, as
needed. This makes this series suitable for production environments where stability is preferable
to new features that might appear in subsequent non-LTM releases.&lt;/p&gt;
&lt;p&gt;This release has received more than 1200 commits from over 50 contributors, including numerous
bugfixes, updates, and features.&lt;/p&gt;
&lt;h2 id=&quot;minimum-requirements&quot;&gt;Minimum Requirements&lt;/h2&gt;
&lt;p&gt;This version of Accumulo requires at least Java 11 to run. Various Java 11 versions from different
distributors were used throughout its testing and development, so we expect it to work with any
standard OpenJDK-based Java distribution.&lt;/p&gt;
&lt;p&gt;At least Hadoop 3 is required, though it is recommended to use a more recent version. Version 3.3
was used extensively during testing, but we have no specific knowledge that an earlier version of
Hadoop 3 will not work. Whichever major/minor version you use, it is recommended to use the latest
bugfix/patch version available. By default, our POM depends on 3.3.4.&lt;/p&gt;
&lt;p&gt;During much of this release’s development, ZooKeeper 3.5 was used as a minimum. However, that
version reached its end-of-life during development, and we do not recommend using end-of-life
versions of ZooKeeper. The latest bugfix version of 3.6, 3.7, or 3.8 should work fine. By default,
our POM depends on 3.8.0.&lt;/p&gt;
&lt;h2 id=&quot;binary-incompatibility&quot;&gt;Binary Incompatibility&lt;/h2&gt;
&lt;p&gt;This release is known to be incompatible with prior versions of the client libraries. That is, the
2.0.0 or 2.0.1 version of the client libraries will not be able to communicate with a 2.1.0 or later
installation of Accumulo, nor will the 2.1.0 or later version of the client libraries communicate
with a 2.0.1 or earlier installation.&lt;/p&gt;
&lt;h2 id=&quot;major-new-features&quot;&gt;Major New Features&lt;/h2&gt;
&lt;h3 id=&quot;overhaul-of-table-compactions&quot;&gt;Overhaul of Table Compactions&lt;/h3&gt;
&lt;p&gt;Significant changes were made to how Accumulo compacts files in this release. See
&lt;a href=&quot;/docs/2.x/administration/compaction&quot;&gt;compaction &lt;/a&gt; for details; below are some highlights.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Multiple concurrent compactions per tablet on disjoint sets of files are now supported.
Previously only a single compaction could run on a tablet. This allows tablets that are running
long compactions on large files to concurrently compact new smaller files as they arrive.&lt;/li&gt;
&lt;li&gt;Multiple compaction thread pools per tablet server are now supported. Previously only a single
thread pool existed within a tablet server for compactions. With a single thread pool, quick
compactions can be starved when all threads are working on long compactions. Now compactions with
little data can be processed by dedicated thread pools.&lt;/li&gt;
&lt;li&gt;Accumulo’s default algorithm for selecting files to compact was modified to select the smallest
set of files that meet the compaction ratio criteria instead of the largest set. This change
makes tablets more aggressive about reducing their number of files while still doing logarithmic
compaction work. This change also enables efficiently compacting new small files that arrive
during a long running compaction.&lt;/li&gt;
&lt;li&gt;Having dedicated compaction thread pools for tables is now supported through configuration. The
default configuration for Accumulo sets up dedicated thread pools for compacting the Accumulo
metadata table.&lt;/li&gt;
&lt;li&gt;Merging minor compactions were dropped. These were added to Accumulo to address the problem of
new files arriving during a long-running compaction. Merging minor compactions could
cause O(N^2) compaction work. The new compaction changes in this release can satisfy this use
case while doing a logarithmic amount of work.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;CompactionStrategy was deprecated in favor of new public APIs. CompactionStrategy was never public
API, as it used internal types, and one of these types, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FileRef&lt;/code&gt;, was removed in 2.1. Users who have
written a CompactionStrategy can replace &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FileRef&lt;/code&gt; with its replacement internal type
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StoredTabletFile&lt;/code&gt;, but this is not recommended. Since it is very likely that CompactionStrategy will
be removed in a future release, any work put into rewriting a CompactionStrategy will be lost. It is
recommended that users implement CompactionSelector, CompactionConfigurer, and CompactionPlanner
instead. The new compaction changes in 2.1 introduce new algorithms for optimally scheduling
compactions across multiple thread pools; configuring a deprecated compaction strategy may result in
missing out on the benefits of these new algorithms.&lt;/p&gt;
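&lt;p&gt;As a rough configuration sketch of the multi-pool scheduling described above (the property names and executor options shown are assumptions and should be verified against the compaction documentation), a compaction service with separate executors for small and large compactions might be defined as:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tserver.compaction.major.service.default.planner=org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner
tserver.compaction.major.service.default.planner.opts.executors=\
  [{&quot;name&quot;:&quot;small&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;128M&quot;,&quot;numThreads&quot;:2},\
   {&quot;name&quot;:&quot;large&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;numThreads&quot;:1}]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;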
&lt;p&gt;See the &lt;a href=&quot;https://static.javadoc.io/org.apache.accumulo/accumulo-tserver/2.1.2/org/apache/accumulo/tserver/compaction/CompactionStrategy.html&quot;&gt;javadoc&lt;/a&gt; for more
information.&lt;/p&gt;
&lt;p&gt;GitHub tickets related to these changes: &lt;a href=&quot;https://github.com/apache/accumulo/issues/564&quot;&gt;#564&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1605&quot;&gt;#1605&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1609&quot;&gt;#1609&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1649&quot;&gt;#1649&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;external-compactions-experimental&quot;&gt;External Compactions (experimental)&lt;/h3&gt;
&lt;p&gt;This feature includes two new optional server components, CompactionCoordinator and Compactor, that
enable the user to run major compactions outside of the TabletServer. See &lt;a href=&quot;/docs/2.x/getting-started/design&quot;&gt;design &lt;/a&gt;, &lt;a href=&quot;/docs/2.x/administration/compaction&quot;&gt;compaction &lt;/a&gt;, and the External Compaction &lt;a href=&quot;/blog/2021/07/08/external-compactions.html&quot;&gt;blog
post&lt;/a&gt; for more information. This work was completed over many tickets, see the GitHub
&lt;a href=&quot;https://github.com/apache/accumulo/projects/20&quot;&gt;project&lt;/a&gt; for the related issues. &lt;a href=&quot;https://github.com/apache/accumulo/issues/2096&quot;&gt;#2096&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;scan-servers-experimental&quot;&gt;Scan Servers (experimental)&lt;/h3&gt;
&lt;p&gt;This feature includes a new optional server component, Scan Server, that enables the user to run
scans outside of the TabletServer. See &lt;a href=&quot;/docs/2.x/getting-started/design&quot;&gt;design &lt;/a&gt;,
&lt;a href=&quot;https://github.com/apache/accumulo/issues/2411&quot;&gt;#2411&lt;/a&gt;, and &lt;a href=&quot;https://github.com/apache/accumulo/issues/2665&quot;&gt;#2665&lt;/a&gt; for more information. Importantly, users can utilize this
feature to avoid bogging down the TabletServer with long-running scans, slow iterators, etc.,
provided they are willing to tolerate eventual consistency.&lt;/p&gt;
&lt;h3 id=&quot;new-per-table-on-disk-encryption-experimental&quot;&gt;New Per-Table On-Disk Encryption (experimental)&lt;/h3&gt;
&lt;p&gt;On-disk encryption can now be configured on a per table basis as well as for the entire instance
(all tables). See &lt;a href=&quot;/docs/2.x/security/on-disk-encryption&quot;&gt;on-disk-encryption &lt;/a&gt; for more information.&lt;/p&gt;
&lt;h3 id=&quot;new-jshell-entry-point&quot;&gt;New jshell entry point&lt;/h3&gt;
&lt;p&gt;Created a new “jshell” convenience entry point. Run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bin/accumulo jshell&lt;/code&gt; to start up jshell
with Accumulo classes already imported and an instance of AccumuloClient already created for
you to connect to Accumulo (assuming you have a client properties file on the class path) &lt;a href=&quot;https://github.com/apache/accumulo/issues/1870&quot;&gt;#1870&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1910&quot;&gt;#1910&lt;/a&gt;&lt;/p&gt;
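&lt;p&gt;A hypothetical session (the name of the preconfigured AccumuloClient variable is an assumption here; check the jshell welcome message for the actual name):&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ bin/accumulo jshell
jshell&amp;gt; client.tableOperations().list()
jshell&amp;gt; client.close()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;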
&lt;h2 id=&quot;major-improvements&quot;&gt;Major Improvements&lt;/h2&gt;
&lt;h3 id=&quot;fixed-gc-metadata-hotspots&quot;&gt;Fixed GC Metadata hotspots&lt;/h3&gt;
&lt;p&gt;Prior to this release, Accumulo stored GC file candidates in the metadata table using rows of the
form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~del&amp;lt;URI&amp;gt;&lt;/code&gt;. This row schema led to uneven load on the metadata table and metadata tablets
that eventually went unused. In &lt;a href=&quot;https://github.com/apache/accumulo/issues/1043&quot;&gt;#1043&lt;/a&gt; / &lt;a href=&quot;https://github.com/apache/accumulo/issues/1344&quot;&gt;#1344&lt;/a&gt;, the row format was changed to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~del&amp;lt;hash(URI)&amp;gt;&amp;lt;URI&amp;gt;&lt;/code&gt; resulting in even load on the metadata table and even data spread in the
tablets. After upgrading, there may still be splits in the metadata table using the old row format.
These splits can be merged away as shown in the example below, which starts off with splits generated
from the old and new row schema. The old splits with the prefix &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~delhdfs&lt;/code&gt; are merged away.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;root@uno&amp;gt; getsplits -t accumulo.metadata
2&amp;lt;
~
~del55
~dela7
~delhdfs://localhost:8020/accumulo/tables/2/default_tablet/F00000a0.rf
~delhdfs://localhost:8020/accumulo/tables/2/default_tablet/F00000kb.rf
root@uno&amp;gt; merge -t accumulo.metadata -b ~delhdfs -e ~delhdfs~
root@uno&amp;gt; getsplits -t accumulo.metadata
2&amp;lt;
~
~del55
~dela7
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
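&lt;p&gt;Conceptually, the new schema prefixes each candidate row with a hash of its URI so that rows spread evenly across the ~del range. The sketch below is illustrative only; the hash function, prefix length, and encoding Accumulo actually uses may differ:&lt;/p&gt;

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DelRowSketch {
    // Illustrative only: the hash and encoding Accumulo uses may differ.
    static String delRow(String uri) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(uri.getBytes(StandardCharsets.UTF_8));
            // A short hex prefix of the digest spreads rows evenly over the ~del range.
            String prefix = String.format("%02x%02x", digest[0], digest[1]);
            return "~del" + prefix + uri;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Similar URIs no longer sort next to each other in the metadata table.
        System.out.println(delRow("hdfs://localhost:8020/accumulo/tables/2/default_tablet/F00000a0.rf"));
        System.out.println(delRow("hdfs://localhost:8020/accumulo/tables/2/default_tablet/F00000kb.rf"));
    }
}
```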
&lt;h3 id=&quot;master-renamed-to-manager&quot;&gt;Master Renamed to Manager&lt;/h3&gt;
&lt;p&gt;In order to use more inclusive language in our code, the Accumulo team has renamed all references to
the word “master” to “manager” (with the exception of deprecated classes and packages retained for
compatibility). This change includes the master server process, configuration properties with master
in the name, utilities with master in the name, and packages/classes in the code base. Where these
changes affect the public API, the deprecated “master” name will still be supported until Accumulo
3.0.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;
One particular change to be aware of is that certain state for the manager process is stored in
ZooKeeper, previously under a directory named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;masters&lt;/code&gt;. This directory has been renamed to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;managers&lt;/code&gt;, and the upgrade will happen automatically if you launch Accumulo using the provided
scripts. However, if you do not use the built-in scripts (e.g., accumulo-cluster or
accumulo-service), then you will need to perform a one-time upgrade of the ZooKeeper state by
executing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RenameMasterDirInZK&lt;/code&gt; utility:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; ${ACCUMULO_HOME}/bin/accumulo org.apache.accumulo.manager.upgrade.RenameMasterDirInZK
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/blockquote&gt;
&lt;p&gt;Some other specific examples of these changes include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;All configuration properties starting with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master.&lt;/code&gt; have been renamed to start with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;manager.&lt;/code&gt;
instead. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master.*&lt;/code&gt; property names in the site configuration file (or passed on the
command-line) are converted internally to the new name, and a warning is printed. However, the old
name can still be used until at least the 3.0 release of Accumulo. Any &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master.*&lt;/code&gt; properties that
have been set in ZooKeeper will be automatically converted to the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;manager.*&lt;/code&gt; name when
Accumulo is upgraded. The old property names can still be used by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config&lt;/code&gt; shell command or
via the methods accessible via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AccumuloClient&lt;/code&gt;, but a warning will be generated when the old
names are used. You are encouraged to update all references to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt; in your site configuration
files to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;manager&lt;/code&gt; when installing Accumulo 2.1.&lt;/li&gt;
&lt;li&gt;The tablet balancers in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.master.balancer&lt;/code&gt; package have all been
relocated to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.manager.balancer&lt;/code&gt;. DefaultLoadBalancer has also been
renamed to SimpleLoadBalancer as part of the move. The default balancer has been updated from
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.master.balancer.TableLoadBalancer&lt;/code&gt; to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.manager.balancer.TableLoadBalancer&lt;/code&gt;, and the default per-table
balancer has been updated from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.master.balancer.DefaultLoadBalancer&lt;/code&gt; to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.manager.balancer.SimpleLoadBalancer&lt;/code&gt;. If you have customized the
tablet balancer configuration, you are strongly encouraged to update your configuration to
reference the updated balancer names. If you have written a custom tablet balancer, it should be
updated to implement the new interface
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.manager.balancer.TabletBalancer&lt;/code&gt; rather than extending the deprecated
abstract &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.master.balancer.TabletBalancer&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The configuration file &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;masters&lt;/code&gt; for identifying the manager host(s) has been deprecated. If this
file is found, a warning will be printed. The replacement file &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;managers&lt;/code&gt; should be used instead (i.e.,
rename your masters file to managers).&lt;/li&gt;
&lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt; argument to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accumulo-service&lt;/code&gt; script has been deprecated, and the replacement
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;manager&lt;/code&gt; argument should be used instead.&lt;/li&gt;
&lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-master&lt;/code&gt; argument to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.server.util.ZooZap&lt;/code&gt; utility has been deprecated
and the replacement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-manager&lt;/code&gt; argument should be used instead.&lt;/li&gt;
&lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GetMasterStats&lt;/code&gt; utility has been renamed to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GetManagerStats&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.master.state.SetGoalState&lt;/code&gt; is deprecated, and any custom scripts that invoke
this utility should be updated to call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.apache.accumulo.manager.state.SetGoalState&lt;/code&gt; instead.&lt;/li&gt;
&lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;masterMemory&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minicluster.properties&lt;/code&gt; has been deprecated and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;managerMemory&lt;/code&gt; should be used
instead in any &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minicluster.properties&lt;/code&gt; files you have configured.&lt;/li&gt;
&lt;li&gt;See also &lt;a href=&quot;https://github.com/apache/accumulo/issues/1640&quot;&gt;#1640&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1642&quot;&gt;#1642&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1703&quot;&gt;#1703&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1704&quot;&gt;#1704&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1873&quot;&gt;#1873&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1907&quot;&gt;#1907&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
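&lt;p&gt;The internal conversion of deprecated property names can be pictured as a simple prefix rewrite. This is an illustrative sketch, not Accumulo’s actual implementation:&lt;/p&gt;

```java
import java.util.HashMap;
import java.util.Map;

public class PropertyRenameSketch {
    // Illustrative sketch: rewrite deprecated master.* keys to manager.*,
    // as the 2.1.0 upgrade does internally (warning on old names not shown).
    static Map<String, String> renameMasterProps(Map<String, String> props) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            String key = e.getKey();
            if (key.startsWith("master.")) {
                key = "manager." + key.substring("master.".length());
            }
            out.put(key, e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("master.port.client", "9999");
        props.put("tserver.port.client", "9997");
        System.out.println(renameMasterProps(props));
    }
}
```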
&lt;h3 id=&quot;new-tracing-facility&quot;&gt;New Tracing Facility&lt;/h3&gt;
&lt;p&gt;HTrace support was removed in this release and has been replaced with &lt;a href=&quot;https://opentelemetry.io/&quot;&gt;OpenTelemetry&lt;/a&gt;. Trace information will not be shown in the monitor. See comments in &lt;a href=&quot;https://github.com/apache/accumulo/issues/2259&quot;&gt;#2259&lt;/a&gt; for an example of how to configure Accumulo to emit traces to supported OpenTelemetry sinks.
&lt;a href=&quot;https://github.com/apache/accumulo/issues/2257&quot;&gt;#2257&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;new-metrics-implementation&quot;&gt;New Metrics Implementation&lt;/h3&gt;
&lt;p&gt;The Hadoop Metrics2 framework is no longer being used to emit metrics from Accumulo. Accumulo is now
using the &lt;a href=&quot;https://micrometer.io/&quot;&gt;Micrometer&lt;/a&gt; framework. Metric name and type changes have been
documented in org.apache.accumulo.core.metrics.MetricsProducer; see the &lt;a href=&quot;https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.2/org/apache/accumulo/core/metrics/MetricsProducer.html&quot;&gt;javadoc&lt;/a&gt; for more information. See comments in &lt;a href=&quot;https://github.com/apache/accumulo/issues/2305&quot;&gt;#2305&lt;/a&gt; for an example of how to configure Accumulo to emit metrics to supported Micrometer sinks.
&lt;a href=&quot;https://github.com/apache/accumulo/issues/1134&quot;&gt;#1134&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;new-spi-package&quot;&gt;New SPI Package&lt;/h3&gt;
&lt;p&gt;A new Service Plugin Interface (SPI) package was created in the accumulo-core jar, at
&lt;a href=&quot;https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.1.2/org/apache/accumulo/core/spi/package-summary.html&quot;&gt;org.apache.accumulo.core.spi&lt;/a&gt;, under which exists interfaces for the various pluggable
components. See &lt;a href=&quot;https://github.com/apache/accumulo/issues/1900&quot;&gt;#1900&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1905&quot;&gt;#1905&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1880&quot;&gt;#1880&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1891&quot;&gt;#1891&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1426&quot;&gt;#1426&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;minor-improvements&quot;&gt;Minor Improvements&lt;/h2&gt;
&lt;h3 id=&quot;new-listtablets-shell-command&quot;&gt;New listtablets Shell Command&lt;/h3&gt;
&lt;p&gt;A new debugging command, listtablets, was created to show detailed tablet information
on a single line. This command aggregates data about each tablet, such as status, location, size, number
of entries, and HDFS directory name. It also shows the start and end rows of tablets, displaying them
in the same sorted order in which they are stored in the metadata table. See the example command output below. &lt;a href=&quot;https://github.com/apache/accumulo/issues/1317&quot;&gt;#1317&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1821&quot;&gt;#1821&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;root@uno&amp;gt; listtablets -t test_ingest -h
2021-01-04T15:12:47,663 [Shell.audit] INFO : root@uno&amp;gt; listtablets -t test_ingest -h
NUM TABLET_DIR FILES WALS ENTRIES SIZE STATUS LOCATION ID START (Exclusive) END
TABLE: test_ingest
1 t-0000007 1 0 60 552 HOSTED CURRENT:ip-10-113-12-25:9997 2 -INF row_0000000005
2 t-0000006 1 0 500 2.71K HOSTED CURRENT:ip-10-113-12-25:9997 2 row_0000000005 row_0000000055
3 t-0000008 1 0 5.00K 24.74K HOSTED CURRENT:ip-10-113-12-25:9997 2 row_0000000055 row_0000000555
4 default_tablet 1 0 4.44K 22.01K HOSTED CURRENT:ip-10-113-12-25:9997 2 row_0000000555 +INF
root@uno&amp;gt; listtablets -t accumulo.metadata
2021-01-04T15:13:21,750 [Shell.audit] INFO : root@uno&amp;gt; listtablets -t accumulo.metadata
NUM TABLET_DIR FILES WALS ENTRIES SIZE STATUS LOCATION ID START (Exclusive) END
TABLE: accumulo.metadata
1 table_info 2 0 7 524 HOSTED CURRENT:ip-10-113-12-25:9997 !0 -INF ~
2 default_tablet 0 0 0 0 HOSTED CURRENT:ip-10-113-12-25:9997 !0 ~ +INF
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;h3 id=&quot;new-utility-for-generating-splits&quot;&gt;New Utility for Generating Splits&lt;/h3&gt;
&lt;p&gt;A new command-line utility was created to generate split points from one or more RFiles. One or more
HDFS directories can be given as well. The utility iterates over all of the files provided and
determines split points based on either the target split size or the number of splits requested. It uses Apache
DataSketches to compute the split points from the data. &lt;a href=&quot;https://github.com/apache/accumulo/issues/2361&quot;&gt;#2361&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2368&quot;&gt;#2368&lt;/a&gt;&lt;/p&gt;
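&lt;p&gt;Conceptually, picking split points by count amounts to selecting evenly spaced quantiles from the sorted rows. The sketch below illustrates the idea on an in-memory sample; the real utility reads RFiles and uses Apache DataSketches quantile sketches rather than sorting all rows:&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SplitPointSketch {
    // Conceptual sketch of quantile-based split selection: pick numSplits
    // evenly spaced rows from the sorted input.
    static List<String> splitPoints(List<String> rows, int numSplits) {
        List<String> sorted = new ArrayList<>(rows);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        for (int i = 1; i <= numSplits; i++) {
            int idx = i * sorted.size() / (numSplits + 1);
            splits.add(sorted.get(idx));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add(String.format("row_%03d", i));
        }
        System.out.println(splitPoints(rows, 3)); // [row_025, row_050, row_075]
    }
}
```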
&lt;h3 id=&quot;new-option-for-cloning-offline&quot;&gt;New Option for Cloning Offline&lt;/h3&gt;
&lt;p&gt;Added option to leave cloned tables offline &lt;a href=&quot;https://github.com/apache/accumulo/issues/1474&quot;&gt;#1474&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1475&quot;&gt;#1475&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;new-max-tablets-option-in-bulk-import&quot;&gt;New Max Tablets Option in Bulk Import&lt;/h3&gt;
&lt;p&gt;The property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table.bulk.max.tablets&lt;/code&gt; was added for the new bulk import mechanism. This property acts
as a cluster performance failsafe to prevent a single ingested file from being distributed across
too much of a cluster. The value is enforced by the new bulk import mechanism and is the maximum
number of tablets allowed for one bulk import file. When this property is set, an error will be
thrown when the value is exceeded during a bulk import. &lt;a href=&quot;https://github.com/apache/accumulo/issues/1614&quot;&gt;#1614&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;new-health-check-thread-in-tabletserver&quot;&gt;New Health Check Thread in TabletServer&lt;/h3&gt;
&lt;p&gt;A new thread was added to the tablet server to periodically verify tablet metadata. &lt;a href=&quot;https://github.com/apache/accumulo/issues/2320&quot;&gt;#2320&lt;/a&gt;
This thread also prints to the debug log how long it takes the tserver to scan the metadata table.
The property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tserver.health.check.interval&lt;/code&gt; was added to control the frequency at which this health
check takes place. &lt;a href=&quot;https://github.com/apache/accumulo/issues/2583&quot;&gt;#2583&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;new-ability-for-user-to-define-context-classloaders&quot;&gt;New ability for user to define context classloaders&lt;/h3&gt;
&lt;p&gt;Deprecated the existing VFS ClassLoader for eventual removal and created a new mechanism for users
to load their own classloader implementations. The new VFS classloader and VFS context classloaders
are in a new &lt;a href=&quot;https://github.com/apache/accumulo-classloaders/tree/main/modules/vfs-class-loader&quot;&gt;repo&lt;/a&gt; and can now be specified using Java’s own system
properties. Alternatively, users can still set their own classloader, as has always been possible. &lt;a href=&quot;https://github.com/apache/accumulo/issues/1747&quot;&gt;#1747&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1715&quot;&gt;#1715&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Please reference the Known Issues section of the 2.0.1 release notes for an issue affecting the
VFSClassLoader.&lt;/p&gt;
&lt;h3 id=&quot;change-in-uncaught-exceptionerror-handling-in-server-side-threads&quot;&gt;Change in uncaught Exception/Error handling in server-side threads&lt;/h3&gt;
&lt;p&gt;Consolidated and normalized thread pool and thread creation. Every thread created through this code
path has an UncaughtExceptionHandler attached to it that logs the fact that the thread
encountered an uncaught Exception and is now dead. When an Error is encountered in a server process,
the process will attempt to print a message to stderr and then terminate the VM using Runtime.halt. On the client
side, the default UncaughtExceptionHandler will only log the Exception/Error in the client and does
not terminate the VM. Additionally, the user has the ability to set their own
UncaughtExceptionHandler implementation on the client. &lt;a href=&quot;https://github.com/apache/accumulo/issues/1808&quot;&gt;#1808&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1818&quot;&gt;#1818&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2554&quot;&gt;#2554&lt;/a&gt;&lt;/p&gt;
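&lt;p&gt;The handler pattern described above can be demonstrated with plain JDK threads; this sketch records the uncaught exception the way a server-side handler would log it:&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicReference;

public class HandlerSketch {
    // Sketch of the pattern described above: the thread gets an
    // UncaughtExceptionHandler that records (server side: logs) the failure.
    static Throwable runAndCapture(Runnable task) {
        AtomicReference<Throwable> captured = new AtomicReference<>();
        Thread t = new Thread(task, "worker");
        t.setUncaughtExceptionHandler((thread, ex) -> captured.set(ex));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return captured.get();
    }

    public static void main(String[] args) {
        Throwable ex = runAndCapture(() -> {
            throw new IllegalStateException("boom");
        });
        // The worker thread died, but the handler saw the exception.
        System.out.println("caught: " + ex.getMessage());
    }
}
```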
&lt;h3 id=&quot;updated-hash-algorithm&quot;&gt;Updated hash algorithm&lt;/h3&gt;
&lt;p&gt;With the default password Authenticator, Accumulo previously stored password hashes using SHA-256 with
custom code to add a salt. In this release, we now use Apache commons-codec to store password
hashes in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;crypt(3)&lt;/code&gt; standard format. With this change, we’ve also defaulted to using the
stronger SHA-512. Existing stored password hashes (if upgrading from an earlier version of Accumulo)
will automatically be upgraded when users authenticate or change their passwords, and Accumulo will
log a warning if it detects any passwords have not been upgraded. &lt;a href=&quot;https://github.com/apache/accumulo/issues/1787&quot;&gt;#1787&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1788&quot;&gt;#1788&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1798&quot;&gt;#1798&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1810&quot;&gt;#1810&lt;/a&gt;&lt;/p&gt;
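&lt;p&gt;In the crypt(3) standard format, the identifier between the first two dollar signs names the algorithm, with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$6$&lt;/code&gt; denoting SHA-512. A minimal sketch of distinguishing the new format from a legacy hex hash (the sample values below are made up):&lt;/p&gt;

```java
public class CryptFormatSketch {
    // Sketch: in the crypt(3) standard format, the id between the first two
    // '$' characters identifies the algorithm; "6" means SHA-512.
    static boolean isSha512Crypt(String stored) {
        return stored.startsWith("$6$");
    }

    public static void main(String[] args) {
        // A hash in the new format (salt and digest here are made up).
        System.out.println(isSha512Crypt("$6$somesalt$somedigest")); // true
        // A legacy hex hash from an older Accumulo would not match.
        System.out.println(isSha512Crypt("0a1b2c3d"));               // false
    }
}
```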
&lt;h3 id=&quot;various-performance-improvements-when-deleting-tables&quot;&gt;Various Performance improvements when deleting tables&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Make delete table operations cancel user compactions &lt;a href=&quot;https://github.com/apache/accumulo/issues/2030&quot;&gt;#2030&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2169&quot;&gt;#2169&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Prevent compactions from starting when delete table is called &lt;a href=&quot;https://github.com/apache/accumulo/issues/2182&quot;&gt;#2182&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2240&quot;&gt;#2240&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Added check to not flush when table is being deleted &lt;a href=&quot;https://github.com/apache/accumulo/issues/1887&quot;&gt;#1887&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Added log message before waiting for deletes to finish &lt;a href=&quot;https://github.com/apache/accumulo/issues/1881&quot;&gt;#1881&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Added code to stop user flush if table is being deleted &lt;a href=&quot;https://github.com/apache/accumulo/issues/1931&quot;&gt;#1931&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;new-monitor-pages-improvements--features&quot;&gt;New Monitor Pages, Improvements &amp;amp; Features&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;A page was added to the Monitor that lists the active compactions and the longest running active
compaction. As an optimization, this page will only fetch data if a user loads the page and will
only do so a maximum of once a minute. This optimization was also added for the Active Scans page,
along with the addition of a “Fetched” column indicating when the data was retrieved.&lt;/li&gt;
&lt;li&gt;A new feature was added to the TabletServer page to help users identify which tservers are in
recovery mode. When a tserver is recovering, its corresponding row in the TabletServer Status
table will be highlighted.&lt;/li&gt;
&lt;li&gt;A new page was also created for External Compactions that allows users to see the progress of
compactions and other details about ongoing compactions (see below).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2283&quot;&gt;#2283&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2294&quot;&gt;#2294&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2358&quot;&gt;#2358&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2663&quot;&gt;#2663&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/release/ec-running2.png&quot; alt=&quot;External Compactions&quot; style=&quot;width:85%&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/release/ec-running-details.png&quot; alt=&quot;External Compactions Details&quot; style=&quot;width:85%&quot; /&gt;&lt;/p&gt;
&lt;h3 id=&quot;new-tserver-scan-timeout-property&quot;&gt;New tserver scan timeout property&lt;/h3&gt;
&lt;p&gt;The new property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tserver.scan.results.max.timeout&lt;/code&gt; was added to make this timeout configurable: it is the maximum time the
thrift client handler will wait for scan results before timing out. A bug was discovered where tservers were running out of memory, partially because this timeout was
so short. The default value remains 1 second, but it can now be increased. &lt;a href=&quot;https://github.com/apache/accumulo/issues/2599&quot;&gt;#2599&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2598&quot;&gt;#2598&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&quot;always-choose-volumes-for-new-tablet-files&quot;&gt;Always choose volumes for new tablet files&lt;/h3&gt;
&lt;p&gt;In &lt;a href=&quot;https://github.com/apache/accumulo/issues/1389&quot;&gt;#1389&lt;/a&gt;, we changed the behavior of the VolumeChooser. It now runs any time a new file is
created, which means VolumeChooser decisions are no longer “sticky” for tablets. This allows tablets
to balance their files across multiple HDFS volumes, instead of always using the volume first selected. Now, only the
directory name is “sticky” for a tablet; the volume is not. New files will appear in
identically named directories on whichever volumes the VolumeChooser selects.&lt;/p&gt;
&lt;h3 id=&quot;iterators-package-is-now-public-api&quot;&gt;Iterators package is now public API&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1390&quot;&gt;#1390&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1400&quot;&gt;#1400&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1411&quot;&gt;#1411&lt;/a&gt; We declared that the core.iterators package is public
API, so it will now follow the semver rules for public API.&lt;/p&gt;
&lt;h3 id=&quot;better-accumulo-gc-memory-usage&quot;&gt;Better accumulo-gc memory usage&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1543&quot;&gt;#1543&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1650&quot;&gt;#1650&lt;/a&gt; Switched from batching file deletion candidates based on the amount of
available memory to a fixed-size batching strategy. This allows the accumulo-gc to run
consistently, using a batch size that is configurable by the user. The user is responsible for
ensuring the process is given enough memory to accommodate the configured batch size, but this
makes the process much more consistent and predictable.&lt;/p&gt;
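&lt;p&gt;Fixed-size batching itself is simple; this sketch shows the idea of walking a candidate list in bounded chunks (the real accumulo-gc logic is more involved):&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Sketch of fixed-size batching: candidates are processed in batches of a
    // configurable size, so memory use is predictable regardless of how many
    // deletion candidates exist.
    static List<List<String>> batches(List<String> candidates, int batchSize) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < candidates.size(); i += batchSize) {
            out.add(new ArrayList<>(
                candidates.subList(i, Math.min(i + batchSize, candidates.size()))));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            files.add("file" + i + ".rf");
        }
        System.out.println(batches(files, 3).size()); // 3 batches: 3 + 3 + 1
    }
}
```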
&lt;h3 id=&quot;log4j2&quot;&gt;Log4j2&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1528&quot;&gt;#1528&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1514&quot;&gt;#1514&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1515&quot;&gt;#1515&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1516&quot;&gt;#1516&lt;/a&gt; While we still use slf4j, we have
upgraded the default logger binding to log4j2, which brings features such as dynamic
reconfiguration, colorized console logging, and more.&lt;/p&gt;
&lt;h3 id=&quot;added-foreach-method-to-scanner&quot;&gt;Added forEach method to Scanner&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1742&quot;&gt;#1742&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1765&quot;&gt;#1765&lt;/a&gt; We added a forEach method to Scanner objects, so you can easily
iterate over the results using a lambda / BiConsumer that accepts a key-value pair.&lt;/p&gt;
&lt;h3 id=&quot;new-public-api-to-set-multiple-properties-atomically&quot;&gt;New public API to set multiple properties atomically&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2692&quot;&gt;#2692&lt;/a&gt; We added a new public API to support setting multiple properties at once
atomically using a read-modify-write pattern. This is available for table, namespace, and system
properties, and is called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;modifyProperties()&lt;/code&gt;. This builds on a related change that allows us to
store properties more efficiently in ZooKeeper, which also results in fewer ZooKeeper watches.&lt;/p&gt;
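&lt;p&gt;The read-modify-write pattern behind &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;modifyProperties()&lt;/code&gt; can be sketched with an in-memory compare-and-swap loop; the real API applies the same idea against properties stored in ZooKeeper:&lt;/p&gt;

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class ModifyPropsSketch {
    // Conceptual sketch of an atomic read-modify-write: read the current
    // property map, apply the caller's mutation to a copy, and only install
    // the copy if no concurrent writer got there first (retry otherwise).
    static final AtomicReference<Map<String, String>> STORE =
        new AtomicReference<>(new HashMap<>());

    static void modifyProperties(Consumer<Map<String, String>> mutator) {
        while (true) {
            Map<String, String> current = STORE.get();
            Map<String, String> updated = new HashMap<>(current);
            mutator.accept(updated);
            if (STORE.compareAndSet(current, updated)) {
                return;
            }
            // Another writer won the race; re-read and try again.
        }
    }

    public static void main(String[] args) {
        modifyProperties(props -> {
            props.put("table.file.max", "20");
            props.put("table.bulk.max.tablets", "100");
        });
        System.out.println(STORE.get());
    }
}
```

&lt;p&gt;Because the mutator may be re-applied on a conflict, it should be free of side effects beyond modifying the map it is given.&lt;/p&gt;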
&lt;h3 id=&quot;simplified-cluster-configuration&quot;&gt;Simplified cluster configuration&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2138&quot;&gt;#2138&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2903&quot;&gt;#2903&lt;/a&gt; Modified the accumulo-cluster script to read the server locations from a single
file, cluster.yaml, in the conf directory instead of multiple files (tserver, manager, gc, etc.). Starting the new scan server and compactor server types is supported using this new file. It also contains options for starting multiple Tablet and Scan Servers per host.&lt;/p&gt;
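&lt;p&gt;A minimal cluster.yaml might look like the sketch below. This is an illustrative example only; consult the template shipped in the conf directory for the exact keys supported by your version:&lt;/p&gt;

```yaml
# Illustrative sketch of a minimal cluster.yaml; entries for compactors,
# scan servers, and per-host server counts can also be configured.
manager:
  - localhost
monitor:
  - localhost
gc:
  - localhost
tserver:
  - localhost
```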
&lt;h3 id=&quot;other-notable-changes&quot;&gt;Other notable changes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1174&quot;&gt;#1174&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/816&quot;&gt;#816&lt;/a&gt; Abstract metadata and change root metadata schema&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1309&quot;&gt;#1309&lt;/a&gt; Explicitly prevent cloning metadata table to prevent poor user experience&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1313&quot;&gt;#1313&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/936&quot;&gt;#936&lt;/a&gt; Store Root Tablet list of files in Zookeeper&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1294&quot;&gt;#1294&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1299&quot;&gt;#1299&lt;/a&gt; Add optional -t tablename to importdirectory shell command.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1332&quot;&gt;#1332&lt;/a&gt; Disable FileSystemMonitor checks of /proc by default (to be removed in future)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1345&quot;&gt;#1345&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1352&quot;&gt;#1352&lt;/a&gt; Optionally disable gc-initiated compactions/flushes&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1397&quot;&gt;#1397&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1461&quot;&gt;#1461&lt;/a&gt; Replace relative paths in the metadata tables on upgrade.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1456&quot;&gt;#1456&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1457&quot;&gt;#1457&lt;/a&gt; Prevent catastrophic tserver shutdown by rate limiting the shutdown&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1053&quot;&gt;#1053&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1060&quot;&gt;#1060&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1576&quot;&gt;#1576&lt;/a&gt; Support multiple volumes in import table&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1568&quot;&gt;#1568&lt;/a&gt; Support multiple tservers / node in accumulo-service&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1644&quot;&gt;#1644&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1645&quot;&gt;#1645&lt;/a&gt; Fix issue with minor compaction not retrying&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1660&quot;&gt;#1660&lt;/a&gt; Dropped unused MemoryManager property&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1764&quot;&gt;#1764&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/1783&quot;&gt;#1783&lt;/a&gt; Parallelize listcompactions in shell&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1797&quot;&gt;#1797&lt;/a&gt; Add table option to shell delete command.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2039&quot;&gt;#2039&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2045&quot;&gt;#2045&lt;/a&gt; Add bulk import option to ignore empty dirs&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2117&quot;&gt;#2117&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2236&quot;&gt;#2236&lt;/a&gt; Make sorted recovery write to RFiles. New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tserver.wal.sort.file.&lt;/code&gt;
property to configure&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2076&quot;&gt;#2076&lt;/a&gt; Sorted recovery files can now be encrypted&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2441&quot;&gt;#2441&lt;/a&gt; Upgraded to Junit 5&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2462&quot;&gt;#2462&lt;/a&gt; Added SUBMITTED FaTE status to differentiate between things submitted vs. running&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2467&quot;&gt;#2467&lt;/a&gt; Added fate shell command option to cancel FaTE operations that are NEW or SUBMITTED&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2807&quot;&gt;#2807&lt;/a&gt; Added several troubleshooting utilities to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accumulo admin&lt;/code&gt; command.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2820&quot;&gt;#2820&lt;/a&gt; &lt;a href=&quot;https://github.com/apache/accumulo/issues/2900&quot;&gt;#2900&lt;/a&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;du&lt;/code&gt; command performance improved by using the metadata table for
computation instead of HDFS&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2966&quot;&gt;#2966&lt;/a&gt; Upgrade Thrift to 0.17.0&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;upgrading&quot;&gt;Upgrading&lt;/h2&gt;
&lt;p&gt;View the &lt;a href=&quot;/docs/2.x/administration/upgrading&quot;&gt;Upgrading Accumulo documentation&lt;/a&gt; for guidance.&lt;/p&gt;
&lt;h2 id=&quot;210-github-project&quot;&gt;2.1.0 GitHub Project&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/apache/accumulo/projects/3&quot;&gt;All tickets related to 2.1.0.&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;known-issues&quot;&gt;Known Issues&lt;/h2&gt;
&lt;p&gt;At the time of release, the following issues were known:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3045&quot;&gt;#3045&lt;/a&gt; - External compactions may appear stuck until the coordinator is restarted&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3048&quot;&gt;#3048&lt;/a&gt; - The monitor may not show times in the correct format for the user’s locale&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3053&quot;&gt;#3053&lt;/a&gt; - ThreadPool creation is a bit spammy by default in the debug logs&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/3057&quot;&gt;#3057&lt;/a&gt; - The monitor may have an annoying popup on the external compactions page if the
coordinator is offline&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Tue, 01 Nov 2022 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-2.1.0/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-2.1.0/</guid>
<category>release</category>
</item>
<item>
<title>2.1.0 Metrics and Tracing Changes</title>
<description>&lt;p&gt;Metrics and Tracing changed in 2.1.0. This post explains the new implementations and provides examples of how to configure them.&lt;/p&gt;
&lt;h1 id=&quot;metrics&quot;&gt;Metrics&lt;/h1&gt;
&lt;p&gt;Accumulo was &lt;a href=&quot;https://issues.apache.org/jira/browse/ACCUMULO-1817&quot;&gt;modified&lt;/a&gt; in version 1.7.0 (2015) to use the Hadoop Metrics2 framework for capturing and emitting internal Accumulo metrics. &lt;a href=&quot;https://micrometer.io/&quot;&gt;Micrometer&lt;/a&gt;, a newer metrics framework, supports sending metrics to many popular &lt;a href=&quot;https://micrometer.io/docs/concepts#_supported_monitoring_systems&quot;&gt;monitoring systems&lt;/a&gt;. In Accumulo 2.1.0 support for the Hadoop Metrics2 framework has been removed in favor of using Micrometer. Metrics are disabled by default.&lt;/p&gt;
&lt;p&gt;Micrometer has the concept of a &lt;a href=&quot;https://micrometer.io/docs/concepts#_registry&quot;&gt;MeterRegistry&lt;/a&gt;, which is used to create and emit metrics to the supported monitoring systems. Additionally, Micrometer supports sending metrics to multiple monitoring systems concurrently. Configuring Micrometer in Accumulo requires you to write a small piece of code to provide the MeterRegistry configuration. Specifically, you will need to create a class that implements &lt;a href=&quot;https://github.com/apache/accumulo/blob/main/core/src/main/java/org/apache/accumulo/core/metrics/MeterRegistryFactory.java&quot;&gt;MeterRegistryFactory&lt;/a&gt;. Your implementation will need to create and configure the appropriate MeterRegistry. Additionally, you will need to add the MeterRegistry jar file and the jar file containing your MeterRegistryFactory implementation to Accumulo’s classpath. The page for each monitoring system that Micrometer supports contains instructions on how to configure the registry and which jar file is required.&lt;/p&gt;
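As a concrete illustration, a minimal factory wiring up a Micrometer StatsD registry might look like the sketch below. This is a hypothetical example, not code shipped with Accumulo: the class name and system property names are invented, the exact MeterRegistryFactory method signature should be checked against your Accumulo version, and the micrometer-registry-statsd jar must be on the classpath.

```java
import io.micrometer.core.instrument.Clock;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.statsd.StatsdConfig;
import io.micrometer.statsd.StatsdMeterRegistry;
import org.apache.accumulo.core.metrics.MeterRegistryFactory;

// Hypothetical example; Accumulo instantiates this class when
// general.micrometer.factory names it and the jars are on the classpath.
public class ExampleStatsDRegistryFactory implements MeterRegistryFactory {
  @Override
  public MeterRegistry create() {
    StatsdConfig config = new StatsdConfig() {
      @Override
      public String get(String key) {
        return null; // fall back to Micrometer defaults for any unset key
      }

      @Override
      public String host() {
        return System.getProperty("statsd.host", "127.0.0.1");
      }

      @Override
      public int port() {
        return Integer.getInteger("statsd.port", 8125);
      }
    };
    return new StatsdMeterRegistry(config, Clock.SYSTEM);
  }
}
```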
&lt;p&gt;Accumulo’s metrics integration test uses a &lt;a href=&quot;https://github.com/apache/accumulo/blob/main/test/src/main/java/org/apache/accumulo/test/metrics/TestStatsDRegistryFactory.java&quot;&gt;TestStatsDRegistryFactory&lt;/a&gt; to create and configure a &lt;a href=&quot;https://micrometer.io/docs/registry/statsD&quot;&gt;StatsD Meter Registry&lt;/a&gt;. The instructions below provide an example of how to use this class to emit Accumulo’s metrics to a Telegraf - InfluxDB - Grafana monitoring stack.&lt;/p&gt;
&lt;h2 id=&quot;metrics-example&quot;&gt;Metrics Example&lt;/h2&gt;
&lt;p&gt;This example uses a Docker container that bundles the Telegraf-InfluxDB-Grafana (TIG) stack. We will configure Accumulo to send metrics to the &lt;a href=&quot;https://www.influxdata.com/time-series-platform/telegraf/&quot;&gt;Telegraf&lt;/a&gt; component running in the Docker image. Telegraf will persist the metrics in &lt;a href=&quot;https://www.influxdata.com/products/influxdb-overview/&quot;&gt;InfluxDB&lt;/a&gt; and then we will visualize the metrics using &lt;a href=&quot;https://grafana.com/&quot;&gt;Grafana&lt;/a&gt;. This example assumes that you have installed Docker (or an equivalent engine) and have an Accumulo instance already installed and initialized. We will be installing some things, modifying the Accumulo configuration, and starting Accumulo.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download the Telegraf-Influx-Grafana (TIG) Docker image
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker pull artlov/docker-telegraf-influxdb-grafana:latest
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Create directories for the Docker container
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mkdir -p /tmp/metrics/influxdb
chmod 777 /tmp/metrics/influxdb
mkdir /tmp/metrics/grafana
mkdir /tmp/metrics/grafana-dashboards
mkdir -p /tmp/metrics/telegraf/conf
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Download Telegraf configuration and Grafana dashboard
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cd /tmp/metrics/telegraf/conf
wget https://raw.githubusercontent.com/apache/accumulo-testing/main/contrib/terraform-testing-infrastructure/modules/config-files/templates/telegraf.conf.tftpl
cat telegraf.conf.tftpl | sed &quot;s/\${manager_ip}/localhost/&quot; &amp;gt; telegraf.conf
cd /tmp/metrics/grafana-dashboards
wget https://raw.githubusercontent.com/apache/accumulo-testing/main/contrib/terraform-testing-infrastructure/modules/config-files/files/grafana_dashboards/accumulo-dashboard.json
wget https://raw.githubusercontent.com/apache/accumulo-testing/main/contrib/terraform-testing-infrastructure/modules/config-files/files/grafana_dashboards/accumulo-dashboard.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Start the TIG Docker container
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run --ulimit nofile=66000:66000 -d --rm \
--name tig-stack \
-p 3003:3003 \
-p 3004:8888 \
-p 8086:8086 \
-p 22022:22 \
-p 8125:8125/udp \
-v /tmp/metrics/influxdb:/var/lib/influxdb \
-v /tmp/metrics/grafana:/var/lib/grafana \
-v /tmp/metrics/telegraf/conf:/etc/telegraf \
-v /tmp/metrics/grafana-dashboards:/etc/grafana/provisioning/dashboards \
artlov/docker-telegraf-influxdb-grafana:latest
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Download Micrometer StatsD Meter Registry jar
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;wget -O micrometer-registry-statsd-1.9.1.jar https://search.maven.org/remotecontent?filepath=io/micrometer/micrometer-registry-statsd/1.9.1/micrometer-registry-statsd-1.9.1.jar
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;At a minimum, you need to enable metrics using the property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.micrometer.enabled&lt;/code&gt; and supply the name of the MeterRegistryFactory class using the property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.micrometer.factory&lt;/code&gt;. To enable &lt;a href=&quot;https://micrometer.io/docs/ref/jvm&quot;&gt;JVM&lt;/a&gt; metrics, use the property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.micrometer.jvm.metrics.enabled&lt;/code&gt;. Modify the accumulo.properties configuration file by adding the properties below.
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Micrometer settings
general.micrometer.enabled=true
general.micrometer.jvm.metrics.enabled=true
general.micrometer.factory=org.apache.accumulo.test.metrics.TestStatsDRegistryFactory
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Copy the micrometer-registry-statsd-1.9.1.jar and accumulo-test.jar into the Accumulo lib directory&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;The TestStatsDRegistryFactory uses system properties to determine the host and port of the StatsD server. In this example the Telegraf component started in step 4 above contains a StatsD server listening on localhost:8125. Configure the TestStatsDRegistryFactory by adding the following system properties to the JAVA_OPTS variable in accumulo-env.sh.
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&quot;-Dtest.meter.registry.host=127.0.0.1&quot;
&quot;-Dtest.meter.registry.port=8125&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Start Accumulo. You should see the following statement in the server log files
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[metrics.MetricsUtil] INFO : initializing metrics, enabled:true, class:org.apache.accumulo.test.metrics.TestStatsDRegistryFactory
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Log into Grafana (http://localhost:3003/) using the default credentials (root/root). Click the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Home&lt;/code&gt; icon at the top, then click the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Accumulo Micrometer Test Dashboard&lt;/code&gt;. If everything is working correctly, then you should see something like the image below.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202206_metrics_and_tracing/Grafana_Screenshot.png&quot; alt=&quot;Grafana Screenshot&quot; /&gt;&lt;/p&gt;
&lt;h1 id=&quot;tracing&quot;&gt;Tracing&lt;/h1&gt;
&lt;p&gt;With the retirement of HTrace, Accumulo has replaced its tracing functionality with &lt;a href=&quot;https://opentelemetry.io/&quot;&gt;OpenTelemetry&lt;/a&gt; in version 2.1.0. Hadoop appears to be on the same &lt;a href=&quot;https://issues.apache.org/jira/browse/HADOOP-15566&quot;&gt;path&lt;/a&gt; which, when finished, should provide better insight into Accumulo’s use of HDFS. OpenTelemetry supports exporting trace information to several different systems, including &lt;a href=&quot;https://www.jaegertracing.io/&quot;&gt;Jaeger&lt;/a&gt;, &lt;a href=&quot;https://zipkin.io/&quot;&gt;Zipkin&lt;/a&gt;, and others. The HTrace trace spans in the Accumulo source code have been updated to use OpenTelemetry trace spans. If tracing is enabled, then Accumulo will use the OpenTelemetry implementation registered with the &lt;a href=&quot;https://github.com/open-telemetry/opentelemetry-java/blob/main/api/all/src/main/java/io/opentelemetry/api/GlobalOpenTelemetry.java&quot;&gt;GlobalOpenTelemetry&lt;/a&gt; object. Tracing is disabled by default and a no-op OpenTelemetry implementation is used.&lt;/p&gt;
&lt;h2 id=&quot;tracing-example&quot;&gt;Tracing Example&lt;/h2&gt;
&lt;p&gt;This example uses the OpenTelemetry Java Agent jar file to configure and export trace information to Jaeger. The OpenTelemetry Java Agent jar file bundles together the supported Java exporters, provides a way to &lt;a href=&quot;https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/autoconfigure&quot;&gt;configure&lt;/a&gt; them, and registers them with the GlobalOpenTelemetry singleton that is used by Accumulo. An alternative to the Java Agent jar file for supplying the OpenTelemetry dependencies is to create a shaded jar with the OpenTelemetry &lt;a href=&quot;https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/autoconfigure&quot;&gt;autoconfigure&lt;/a&gt; module and its runtime dependencies and place the resulting shaded jar on the classpath. An example Maven pom.xml file to create the shaded jar is &lt;a href=&quot;https://github.com/apache/accumulo/pull/2259#issuecomment-965571339&quot;&gt;here&lt;/a&gt;. When using this alternate method, you can skip step 2 and the uncommenting of the Java Agent in step 5 below.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download Jaeger all-in-one Docker image
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; docker pull jaegertracing/all-in-one:1.35
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Download OpenTelemetry Java Agent (https://github.com/open-telemetry/opentelemetry-java-instrumentation)
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; wget -O opentelemetry-javaagent-1.15.0.jar https://search.maven.org/remotecontent?filepath=io/opentelemetry/javaagent/opentelemetry-javaagent/1.15.0/opentelemetry-javaagent-1.15.0.jar
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;To enable tracing, you need to set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.opentelemetry.enabled&lt;/code&gt; property. Modify the accumulo.properties configuration file and add the following property.
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# OpenTelemetry settings
general.opentelemetry.enabled=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;To enable tracing in the shell, set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;general.opentelemetry.enabled&lt;/code&gt; property in the accumulo-client.properties configuration file.
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# OpenTelemetry settings
general.opentelemetry.enabled=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Configure the OpenTelemetry JavaAgent in accumulo-env.sh by uncommenting the following and updating the path to the java agent jar:
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; ## Optionally setup OpenTelemetry SDK AutoConfigure
## See https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/autoconfigure
#JAVA_OPTS=('-Dotel.traces.exporter=jaeger' '-Dotel.metrics.exporter=none' '-Dotel.logs.exporter=none' &quot;${JAVA_OPTS[@]}&quot;)
## Optionally setup OpenTelemetry Java Agent
## See https://github.com/open-telemetry/opentelemetry-java-instrumentation for more options
#JAVA_OPTS=('-javaagent:path/to/opentelemetry-javaagent.jar' &quot;${JAVA_OPTS[@]}&quot;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Start Jaeger Docker container
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run -d --rm --name jaeger \
-e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
-p 5775:5775/udp \
-p 6831:6831/udp \
-p 6832:6832/udp \
-p 5778:5778 \
-p 16686:16686 \
-p 14268:14268 \
-p 14250:14250 \
-p 9411:9411 jaegertracing/all-in-one:1.35
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;Start Accumulo. You should see the following statement in the server log files
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[trace.TraceUtil] INFO : Trace enabled in Accumulo: yes, OpenTelemetry instance: class io.opentelemetry.javaagent.instrumentation.opentelemetryapi.v1_10.ApplicationOpenTelemetry110, Tracer instance: class io.opentelemetry.javaagent.instrumentation.opentelemetryapi.trace.ApplicationTracer
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;View traces in Jaeger UI at http://localhost:16686. You can select the service name on the left panel and click &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Find Traces&lt;/code&gt; to view the trace information. If everything is working correctly, then you should see something like the image below.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202206_metrics_and_tracing/Jaeger_Screenshot.png&quot; alt=&quot;Jaeger Screenshot&quot; /&gt;&lt;/p&gt;
</description>
<pubDate>Wed, 22 Jun 2022 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/blog/2022/06/22/2.1.0-metrics-and-tracing.html</link>
<guid isPermaLink="true">https://accumulo.apache.org/blog/2022/06/22/2.1.0-metrics-and-tracing.html</guid>
<category>blog</category>
</item>
<item>
<title>Apache Accumulo 1.10.2</title>
<description>&lt;h2 id=&quot;about&quot;&gt;About&lt;/h2&gt;
&lt;p&gt;Apache Accumulo 1.10.2 is a bug fix release of the 1.10 LTM release line.&lt;/p&gt;
&lt;p&gt;These release notes are highlights of the changes since 1.10.1. The full
detailed changes can be seen in the git history. If anything important is
missing from this list, please &lt;a href=&quot;/contact-us&quot;&gt;contact&lt;/a&gt; us to have it included.&lt;/p&gt;
&lt;p&gt;Users of 1.10.1 or earlier are encouraged to upgrade to 1.10.2, as this is a
continuation of the 1.10 LTM release line with bug fixes and improvements, and
it supersedes any prior 1.x version. Users are also encouraged to consider
migrating to a 2.x version when one that is suitable for their needs becomes
available.&lt;/p&gt;
&lt;h2 id=&quot;known-issues&quot;&gt;Known Issues&lt;/h2&gt;
&lt;p&gt;Apache Commons VFS was upgraded in &lt;a href=&quot;https://github.com/apache/accumulo/issues/1295&quot;&gt;#1295&lt;/a&gt; and some users have reported
issues similar to &lt;a href=&quot;https://issues.apache.org/jira/projects/VFS/issues/VFS-683&quot;&gt;VFS-683&lt;/a&gt;. Possible solutions are discussed in &lt;a href=&quot;https://github.com/apache/accumulo/issues/2775&quot;&gt;#2775&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;major-improvements&quot;&gt;Major Improvements&lt;/h2&gt;
&lt;p&gt;This release bundles &lt;a href=&quot;https://reload4j.qos.ch/&quot;&gt;reload4j&lt;/a&gt; (&lt;a href=&quot;https://github.com/apache/accumulo/issues/2458&quot;&gt;#2458&lt;/a&gt;) in
the convenience binary and uses that instead of log4j 1.2. This is to make it
easier for users to avoid the many CVEs that apply to log4j 1.2, which is no
longer being maintained. Accumulo 2.x versions have already switched to the
latest log4j 2. However, doing so required making some breaking API
changes and other substantial changes, so that can’t be done for Accumulo 1.10.
Using reload4j instead was deemed to be a viable interim solution until
Accumulo 2.x.&lt;/p&gt;
&lt;h3 id=&quot;other-improvements&quot;&gt;Other Improvements&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1808&quot;&gt;#1808&lt;/a&gt; Re-throw exceptions in threads instead of merely logging them&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1863&quot;&gt;#1863&lt;/a&gt; Avoid unnecessary redundant log sorting&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1917&quot;&gt;#1917&lt;/a&gt; Ensure RFileWriterBuilder API validates filenames&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2006&quot;&gt;#2006&lt;/a&gt; Detect system config changes in HostRegexTableLoadBalancer without restarting master&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2464&quot;&gt;#2464&lt;/a&gt; Apply timeout to socket.connect()&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;other-bug-fixes&quot;&gt;Other Bug Fixes&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1775&quot;&gt;#1775&lt;/a&gt; Ensure monitor reports a dead tserver when it is killed&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/1858&quot;&gt;#1858&lt;/a&gt; Fix a bug in the monitor graphs due to use of int instead of long&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues/2370&quot;&gt;#2370&lt;/a&gt; Fix bug in getsplits command in the shell&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;note-about-jdk-15&quot;&gt;Note About JDK 15&lt;/h2&gt;
&lt;p&gt;See the note in the 1.10.1 release notes about the use of JDK 15 or later, as
the information pertaining to the use of the CMS garbage collector remains
applicable to this version.&lt;/p&gt;
&lt;h2 id=&quot;useful-links&quot;&gt;Useful Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://lists.apache.org/thread/bq424vnov27nwnkb471oxg5nd7m6xwn9&quot;&gt;Release VOTE email thread&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/compare/rel/1.10.1...apache:rel/1.10.2&quot;&gt;All Changes since 1.10.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/accumulo/issues?q=project%3Aapache%2Faccumulo%2F18&quot;&gt;GitHub&lt;/a&gt; - List of issues tracked on GitHub corresponding to this release&lt;/li&gt;
&lt;/ul&gt;
</description>
<pubDate>Sun, 13 Feb 2022 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/release/accumulo-1.10.2/</link>
<guid isPermaLink="true">https://accumulo.apache.org/release/accumulo-1.10.2/</guid>
<category>release</category>
</item>
<item>
<title>External Compactions</title>
<description>&lt;p&gt;External compactions are a new feature in Accumulo 2.1.0 which allows
compaction work to run outside of Tablet Servers.&lt;/p&gt;
&lt;h2 id=&quot;overview&quot;&gt;Overview&lt;/h2&gt;
&lt;p&gt;There are two types of &lt;a href=&quot;https://storage.googleapis.com/pub-tools-public-publication-data/pdf/68a74a85e1662fe02ff3967497f31fda7f32225c.pdf&quot;&gt;compactions&lt;/a&gt; in Accumulo - Minor and Major. Minor
compactions flush recently written data from memory to a new file. Major
compactions merge two or more Tablet files together into one new file. Starting
in 2.1 Tablet Servers can run multiple major compactions for a Tablet
concurrently; there is no longer a single thread pool per Tablet Server that
runs compactions. Major compactions can be resource intensive and may run for a
long time depending on several factors, including the number and size of the
input files and the iterators configured to run during major compaction.
Additionally, the Tablet Server does not currently have a mechanism in place to
stop a major compaction that is taking too long or using too many resources.
There is a mechanism to throttle the read and write speed of major compactions
as a way to reduce the resource contention on a Tablet Server where many
concurrent compactions are running. However, throttling compactions on a busy
system will just lead to an increasing amount of queued compactions. Finally,
major compaction work can be wasted in the event of an untimely death of the
Tablet Server or if a Tablet is migrated to another Tablet Server.&lt;/p&gt;
&lt;p&gt;An external compaction is a major compaction that occurs outside of a Tablet
Server. The external compaction feature is an extension of the major compaction
service in the Tablet Server and is configured as part of the system’s
compaction service configuration. Thus, it is an optional feature. The goal of
the external compaction feature is to overcome some of the drawbacks of
major compactions that happen inside the Tablet Server. Specifically, external
compactions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Allow major compactions to continue when the originating TabletServer dies&lt;/li&gt;
&lt;li&gt;Allow major compactions to occur while a Tablet migrates to a new Tablet Server&lt;/li&gt;
&lt;li&gt;Reduce the load on the TabletServer, giving it more cycles to insert mutations and respond to scans (assuming it’s running on different hosts). MapReduce jobs and compactions can lower the effectiveness of processor and page caches for scans, so moving compactions off the host can be beneficial.&lt;/li&gt;
&lt;li&gt;Allow major compactions to be scaled differently than the number of TabletServers, giving users more flexibility in allocating resources.&lt;/li&gt;
&lt;li&gt;Even out hotspots where a few Tablet Servers have a lot of compaction work. External compactions allow this work to spread much wider than previously possible.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The external compaction feature in Apache Accumulo version 2.1.0 adds two new
system-level processes and new configuration properties. The new system-level
processes are the Compactor and the Compaction Coordinator.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The Compactor is a process that is responsible for executing a major compaction. There can be many Compactors running on a system. The Compactor communicates with the Compaction Coordinator to get information about the next major compaction it will run and to report the completion state.&lt;/li&gt;
&lt;li&gt;The Compaction Coordinator is a single process like the Manager. It is responsible for communicating with the Tablet Servers to gather information about queued external compactions, to reserve a major compaction on the Compactor’s behalf, and to report the completion status of the reserved major compaction. For external compactions that complete when the Tablet is offline, the Compaction Coordinator buffers this information and reports it later.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;details&quot;&gt;Details&lt;/h2&gt;
&lt;p&gt;Before we explain the implementation for external compactions, it’s probably
useful to explain the changes for major compactions that were made in the 2.1.0
branch before external compactions were added. This is most apparent in the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tserver.compaction.major.service&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table.compaction.dispatcher&lt;/code&gt; configuration
properties. The simplest way to explain this is that you can now define a
service for executing compactions and then assign that service to a table
(which implies you can have multiple services assigned to different tables).
This gives the flexibility to prevent one table’s compactions from impacting
another table. Each service has named thread pools with size thresholds.&lt;/p&gt;
&lt;h3 id=&quot;configuration&quot;&gt;Configuration&lt;/h3&gt;
&lt;p&gt;The configuration below defines a compaction service named cs1 using
the DefaultCompactionPlanner that is configured to have three named thread
pools (small, medium, and large). Each thread pool is configured with a number
of threads to run compactions and a size threshold. If the sum of the input
file sizes is less than 16MB, then the major compaction will be assigned to the
small pool, for example.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tserver.compaction.major.service.cs1.planner=org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner
tserver.compaction.major.service.cs1.planner.opts.executors=[
{&quot;name&quot;:&quot;small&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;16M&quot;,&quot;numThreads&quot;:8},
{&quot;name&quot;:&quot;medium&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;128M&quot;,&quot;numThreads&quot;:4},
{&quot;name&quot;:&quot;large&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;numThreads&quot;:2}]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
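The size-based routing described above can be sketched as follows. This is a hypothetical helper, not Accumulo's planner code; it simply maps the summed input file size onto the small/medium/large thresholds from the example configuration.

```java
public class PoolPicker {
    static final long MB = 1024 * 1024;

    // Hypothetical sketch of the size-based routing described above: the sum
    // of the input file sizes is compared against each pool's maxSize, and
    // the "large" pool (which has no maxSize) catches everything else.
    static String pickPool(long totalInputBytes) {
        if (totalInputBytes < 16 * MB) {
            return "small";
        }
        if (totalInputBytes < 128 * MB) {
            return "medium";
        }
        return "large";
    }

    public static void main(String[] args) {
        System.out.println(pickPool(4 * MB));   // small
        System.out.println(pickPool(64 * MB));  // medium
        System.out.println(pickPool(512 * MB)); // large
    }
}
```

For instance, a compaction whose input files total 4MB would be routed to the small pool, while one totaling 512MB would fall through to the unbounded large pool.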
&lt;p&gt;To assign compaction service cs1 to the table ci, you would use the following properties:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;config -t ci -s table.compaction.dispatcher=org.apache.accumulo.core.spi.compaction.SimpleCompactionDispatcher
config -t ci -s table.compaction.dispatcher.opts.service=cs1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;A small modification to the
tserver.compaction.major.service.cs1.planner.opts.executors property in the
example above would enable it to use external compactions. For example, if we
wanted all of the large compactions to be done externally, we would use this
configuration:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tserver.compaction.major.service.cs1.planner.opts.executors=[
{&quot;name&quot;:&quot;small&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;16M&quot;,&quot;numThreads&quot;:8},
{&quot;name&quot;:&quot;medium&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;128M&quot;,&quot;numThreads&quot;:4},
{&quot;name&quot;:&quot;large&quot;,&quot;type&quot;:&quot;external&quot;,&quot;queue&quot;:&quot;DCQ1&quot;}]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In this example the queue name DCQ1 is arbitrary and allows you to
define multiple pools of Compactors.&lt;/p&gt;
&lt;p&gt;Behind these new configurations in 2.1 lies a new algorithm for choosing which
files to compact. This algorithm attempts to find the smallest set of files
that meets the compaction ratio criteria. Prior to 2.1, Accumulo looked for the
largest set of files that met the criteria. Both algorithms do logarithmic
amounts of work. The new algorithm better utilizes multiple thread pools
available for running compactions of different sizes.&lt;/p&gt;
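As a simplified illustration of that selection (a hypothetical sketch, not the planner's actual implementation): a candidate set of files meets the compaction ratio criteria when the sum of its file sizes is at least the ratio times the largest file in the set, and the 2.1 planner looks for the smallest such set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RatioSketch {
    // A candidate set meets the compaction ratio criteria when the sum of its
    // file sizes is at least `ratio` times the largest file in the set.
    static boolean meetsRatio(List<Long> sizes, double ratio) {
        long sum = 0;
        long max = 0;
        for (long s : sizes) {
            sum += s;
            max = Math.max(max, s);
        }
        return sum >= max * ratio;
    }

    // Simplified sketch of the 2.1 behavior described above: scan the sorted
    // sizes from smallest up and return the smallest prefix meeting the ratio.
    static List<Long> smallestQualifyingSet(List<Long> sizes, double ratio) {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted);
        for (int end = 2; end <= sorted.size(); end++) {
            List<Long> candidate = sorted.subList(0, end);
            if (meetsRatio(candidate, ratio)) {
                return new ArrayList<>(candidate);
            }
        }
        return Collections.emptyList(); // no subset qualifies
    }
}
```

With sizes [10, 10, 100] and a ratio of 2, the two 10-byte files alone qualify (20 >= 2 x 10), so the large file is left out of that compaction.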
&lt;h3 id=&quot;compactor&quot;&gt;Compactor&lt;/h3&gt;
&lt;p&gt;A Compactor is started with the name of the queue for which it will complete
major compactions. You pass in the queue name when starting the Compactor, like
so:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bin/accumulo compactor -q DCQ1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Once started, the Compactor tries to find the location of the
Compaction Coordinator in ZooKeeper and connect to it. Then, it asks the
Compaction Coordinator for the next compaction job for the queue. The
Compaction Coordinator will return to the Compactor the necessary information to
run the major compaction, assuming there is work to be done. Note that the
class performing the major compaction in the Compactor is the same one used in
the Tablet Server, so we are just transferring all of the input parameters from
the Tablet Server to the Compactor. The Compactor communicates information back
to the Compaction Coordinator when the compaction has started, finished
(successfully or not), and during the compaction (progress updates).&lt;/p&gt;
&lt;h3 id=&quot;compaction-coordinator&quot;&gt;Compaction Coordinator&lt;/h3&gt;
&lt;p&gt;The Compaction Coordinator is a singleton process in the system like the
Manager. Also, like the Manager, it supports standby Compaction Coordinators
using locks in ZooKeeper. The Compaction Coordinator is started using the
command:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bin/accumulo compaction-coordinator
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;When running, the Compaction Coordinator polls the TabletServers for summary
information about their external compaction queues. It keeps track of the major
compaction priorities for each Tablet Server and queue. When a Compactor
requests the next major compaction job, the Compaction Coordinator finds the
Tablet Server with the highest priority major compaction for that queue and
communicates with that Tablet Server to reserve an external compaction. The
priority in this case is an integer value based on the number of input files
for the compaction. For system compactions, the number is negative starting at
-32768 and increasing to -1 and for user compactions it’s a non-negative number
starting at 0 and limited to 32767. When the Tablet Server reserves the
external compaction an entry is written into the metadata table row for the
Tablet with the address of the Compactor running the compaction and all of the
configuration information passed back from the Tablet Server. Below is an
example of the ecomp metadata column:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2;10ba2e8ba2e8ba5 ecomp:ECID:94db8374-8275-4f89-ba8b-4c6b3908bc50 [] {&quot;inputs&quot;:[&quot;hdfs://accucluster/accumulo/tables/2/t-00000ur/A00001y9.rf&quot;,&quot;hdfs://accucluster/accumulo/tables/2/t-00000ur/C00005lp.rf&quot;,&quot;hdfs://accucluster/accumulo/tables/2/t-00000ur/F0000dqm.rf&quot;,&quot;hdfs://accucluster/accumulo/tables/2/t-00000ur/F0000dq1.rf&quot;],&quot;nextFiles&quot;:[],&quot;tmp&quot;:&quot;hdfs://accucluster/accumulo/tables/2/t-00000ur/C0000dqs.rf_tmp&quot;,&quot;compactor&quot;:&quot;10.2.0.139:9133&quot;,&quot;kind&quot;:&quot;SYSTEM&quot;,&quot;executorId&quot;:&quot;DCQ1&quot;,&quot;priority&quot;:-32754,&quot;propDels&quot;:true,&quot;selectedAll&quot;:false}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;When the Compactor notifies the Compaction Coordinator that it has finished the
major compaction, the Compaction Coordinator attempts to notify the Tablet
Server and inserts an external compaction final state marker into the metadata
table. Below is an example of the final state marker:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;~ecompECID:de6afc1d-64ae-4abf-8bce-02ec0a79aa6c : [] {&quot;extent&quot;:{&quot;tableId&quot;:&quot;2&quot;},&quot;state&quot;:&quot;FINISHED&quot;,&quot;fileSize&quot;:12354,&quot;entries&quot;:100000}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;If the Compaction Coordinator is able to reach the Tablet Server and that Tablet
Server is still hosting the Tablet, then the compaction is committed and both
of the entries are removed from the metadata table. In the case that the Tablet
is offline when the compaction attempts to commit, there is a thread in the
Compaction Coordinator that looks for completed, but not yet committed, external
compactions and periodically attempts to contact the Tablet Server hosting the
Tablet to commit the compaction. The Compaction Coordinator periodically removes
the final state markers related to Tablets that no longer exist. In the case of
an external compaction failure the Compaction Coordinator notifies the Tablet
and the Tablet cleans up file reservations and removes the metadata entry.&lt;/p&gt;
&lt;h3 id=&quot;edge-cases&quot;&gt;Edge Cases&lt;/h3&gt;
&lt;p&gt;There are several situations involving external compactions that we tested as part of this feature. These are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Tablet migration&lt;/li&gt;
&lt;li&gt;When a user-initiated compaction is canceled&lt;/li&gt;
&lt;li&gt;When a Table is taken offline&lt;/li&gt;
&lt;li&gt;When a Tablet is split or merged&lt;/li&gt;
&lt;li&gt;Coordinator restart&lt;/li&gt;
&lt;li&gt;Tablet Server death&lt;/li&gt;
&lt;li&gt;Table deletion&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Compactors periodically check if the compaction they are running is related to
a deleted table, split/merged Tablet, or canceled user-initiated compaction. If
any of these cases happen the Compactor interrupts the compaction and notifies
the Compaction Coordinator. An external compaction continues in the case of
Tablet Server death, Tablet migration, Coordinator restart, and the Table being
taken offline.&lt;/p&gt;
&lt;h2 id=&quot;cluster-test&quot;&gt;Cluster Test&lt;/h2&gt;
&lt;p&gt;The following tests were run on a cluster to exercise this new feature.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ran continuous ingest for 24h with large compactions running externally in an autoscaled Kubernetes cluster.&lt;/li&gt;
&lt;li&gt;After ingest completed, started a full table compaction with all compactions running externally.&lt;/li&gt;
&lt;li&gt;Ran the continuous ingest verification process that looks for lost data.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&quot;setup&quot;&gt;Setup&lt;/h3&gt;
&lt;p&gt;For these tests Accumulo, Zookeeper, and HDFS were run on a cluster in Azure
set up by Muchos, and external compactions were run in a separate Kubernetes
cluster running in Azure. The Accumulo cluster had the following
configuration.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Centos 7&lt;/li&gt;
&lt;li&gt;Open JDK 11&lt;/li&gt;
&lt;li&gt;Zookeeper 3.6.2&lt;/li&gt;
&lt;li&gt;Hadoop 3.3.0&lt;/li&gt;
&lt;li&gt;Accumulo 2.1.0-SNAPSHOT &lt;a href=&quot;https://github.com/apache/accumulo/commit/dad7e01ae7d450064cba5d60a1e0770311ebdb64&quot;&gt;dad7e01&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;23 D16s_v4 VMs, each with 16x128G HDDs striped using LVM. 22 were workers.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following diagram shows how the two clusters were set up. The Muchos and
Kubernetes clusters were on the same private vnet, each with its own /16 subnet
in the 10.x.x.x IP address space. The Kubernetes cluster that ran external
compactions was backed by at least 3 D8s_v4 VMs, with VMs autoscaling with the
number of pods running.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/clusters-layout.png&quot; alt=&quot;Cluster Layout&quot; /&gt;&lt;/p&gt;
&lt;p&gt;One problem we ran into was communication between Compactors running inside
Kubernetes with processes like the Compaction Coordinator and DataNodes running
outside of Kubernetes in the Muchos cluster. For some insights into how these
problems were overcome, check out the comments in the &lt;a href=&quot;/images/blog/202107_ecomp/accumulo-compactor-muchos.yaml&quot;&gt;deployment
spec&lt;/a&gt; used.&lt;/p&gt;
&lt;h3 id=&quot;configuration-1&quot;&gt;Configuration&lt;/h3&gt;
&lt;p&gt;The following Accumulo shell commands set up a new compaction service named
cs1. This compaction service has an internal executor named small with 4
threads for compactions up to 32M, an internal executor named medium with 2
threads for compactions up to 128M, and an external compaction queue named
DCQ1 for all other compactions.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;config -s 'tserver.compaction.major.service.cs1.planner.opts.executors=[{&quot;name&quot;:&quot;small&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;32M&quot;,&quot;numThreads&quot;:4},{&quot;name&quot;:&quot;medium&quot;,&quot;type&quot;:&quot;internal&quot;,&quot;maxSize&quot;:&quot;128M&quot;,&quot;numThreads&quot;:2},{&quot;name&quot;:&quot;large&quot;,&quot;type&quot;:&quot;external&quot;,&quot;queue&quot;:&quot;DCQ1&quot;}]'
config -s tserver.compaction.major.service.cs1.planner=org.apache.accumulo.core.spi.compaction.DefaultCompactionPlanner
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
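&lt;p&gt;Once set, the properties can be inspected from the shell. The command below is a sketch; it assumes the shell’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-f&lt;/code&gt; filter option to list only the cs1 properties.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;config -f tserver.compaction.major.service.cs1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;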
&lt;p&gt;The continuous ingest table was configured to use the above compaction service.
The table’s compaction ratio was also lowered from the default of 3 to 2. A
lower compaction ratio results in less files per Tablet and more compaction
work.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;config -t ci -s table.compaction.dispatcher=org.apache.accumulo.core.spi.compaction.SimpleCompactionDispatcher
config -t ci -s table.compaction.dispatcher.opts.service=cs1
config -t ci -s table.compaction.major.ratio=2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The Compaction Coordinator was manually started on the Muchos VM where the
Accumulo Manager, Zookeeper server, and the Namenode were running. The
following command was used to do this.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;nohup accumulo compaction-coordinator &amp;gt;/var/data/logs/accumulo/compaction-coordinator.out 2&amp;gt;/var/data/logs/accumulo/compaction-coordinator.err &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;To start Compactors, Accumulo’s
&lt;a href=&quot;https://github.com/apache/accumulo-docker/tree/next-release&quot;&gt;docker&lt;/a&gt; image was
built from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;next-release&lt;/code&gt; branch by checking out the Apache Accumulo git
repo at commit &lt;a href=&quot;https://github.com/apache/accumulo/commit/dad7e01ae7d450064cba5d60a1e0770311ebdb64&quot;&gt;dad7e01&lt;/a&gt; and building the binary distribution using the
command &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mvn clean package -DskipTests&lt;/code&gt;. The resulting tar file was copied to
the accumulo-docker base directory and the image was built using the command:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker build --build-arg ACCUMULO_VERSION=2.1.0-SNAPSHOT --build-arg ACCUMULO_FILE=accumulo-2.1.0-SNAPSHOT-bin.tar.gz \
--build-arg HADOOP_FILE=hadoop-3.3.0.tar.gz \
--build-arg ZOOKEEPER_VERSION=3.6.2 --build-arg ZOOKEEPER_FILE=apache-zookeeper-3.6.2-bin.tar.gz \
-t accumulo .
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The Docker image was tagged and then pushed to a container registry accessible by
Kubernetes. Then the following commands were run to start the Compactors using
&lt;a href=&quot;/images/blog/202107_ecomp/accumulo-compactor-muchos.yaml&quot;&gt;accumulo-compactor-muchos.yaml&lt;/a&gt;.
The yaml file contains comments explaining issues related to IP addresses and DNS names.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl apply -f accumulo-compactor-muchos.yaml
kubectl autoscale deployment accumulo-compactor --cpu-percent=80 --min=10 --max=660
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
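&lt;p&gt;The state of the autoscaler and the Compactor pods can be checked at any time with standard kubectl commands. The label selector below is an assumption based on the deployment name and would need to match the labels in the deployment spec:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get hpa accumulo-compactor
kubectl get pods -l app=accumulo-compactor
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;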
&lt;p&gt;The autoscale command causes Compactors to scale between 10
and 660 pods based on CPU usage. When the pods’ average CPU is above 80%,
pods are added to meet the 80% goal. When it’s below 80%, pods
are stopped to meet the 80% goal, with 5 minutes between scale-down
events. This can sometimes lead to running compactions being
stopped. During the test there were ~537 dead compactions that were probably
caused by this (there were 44K successful external compactions). The max of 660
was chosen based on the number of datanodes in the Muchos cluster. There were
22 datanodes and 30x22=660, so this conceptually sets a limit of 30 external
compactions per datanode. This was well tolerated by the Muchos cluster. One
important lesson we learned is that external compactions can strain the HDFS
DataNodes, so it’s important to consider how many concurrent external
compactions will be running. The Muchos cluster had 22x16=352 cores on the
worker VMs, so the max of 660 exceeds what the Muchos cluster could run itself.&lt;/p&gt;
&lt;h3 id=&quot;ingesting-data&quot;&gt;Ingesting data&lt;/h3&gt;
&lt;p&gt;After starting Compactors, 22 continuous ingest clients (from
accumulo-testing) were started. The following plot shows the number of
compactions running in the three different compaction queues
configured. The executor cs1_small is for compactions &amp;lt;= 32M and it stayed
pretty busy as minor compactions constantly produce new small files. In 2.1.0
merging minor compactions were removed, so it’s important to ensure a
compaction queue is properly configured for new small files. The executor
cs1_medium was for compactions &amp;gt;32M and &amp;lt;=128M and it was not as busy, but did
have steady work. The external compaction queue DCQ1 processed all compactions
over 128M and had some spikes of work. These spikes are to be expected with
continuous ingest as all Tablets are written to evenly and eventually all of
the Tablets need to run large compactions around the same time.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-running.png&quot; alt=&quot;Compactions Running&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The following plot shows the number of pods running in Kubernetes. As
Compactors used more and less CPU the number of pods automatically scaled up
and down.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-pods-running.png&quot; alt=&quot;Pods Running&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The following plot shows the number of compactions queued. When the
compactions queued for cs1_small spiked above 750, it was adjusted from 4
threads per Tablet Server to 6 threads. This configuration change was made while
everything was running and the Tablet Servers saw it and reconfigured their thread
pools on the fly.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-queued.png&quot; alt=&quot;Pods Queued&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The metrics emitted by Accumulo for these plots had the following names.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;TabletServer1.tserver.compactionExecutors.e_DCQ1_queued&lt;/li&gt;
&lt;li&gt;TabletServer1.tserver.compactionExecutors.e_DCQ1_running&lt;/li&gt;
&lt;li&gt;TabletServer1.tserver.compactionExecutors.i_cs1_medium_queued&lt;/li&gt;
&lt;li&gt;TabletServer1.tserver.compactionExecutors.i_cs1_medium_running&lt;/li&gt;
&lt;li&gt;TabletServer1.tserver.compactionExecutors.i_cs1_small_queued&lt;/li&gt;
&lt;li&gt;TabletServer1.tserver.compactionExecutors.i_cs1_small_running&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Tablet servers emit metrics about queued and running compactions for every
compaction executor configured. Users can observe these metrics and tune
the configuration based on what they see, as was done in this test.&lt;/p&gt;
&lt;p&gt;The following plot shows the average files per Tablet during the
test. The numbers are what would be expected for a compaction ratio of 2 when
the system is keeping up with compaction work. Also, animated GIFs were created to
show a few tablets &lt;a href=&quot;/images/blog/202107_ecomp/files_over_time.html&quot;&gt;files over time&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-files-per-tablet.png&quot; alt=&quot;Files Per Tablet&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The following is a plot of the number of Tablets during the test.
Eventually there were 11.28K Tablets, around 512 per Tablet Server. The
Tablets were close to splitting again at the end of the test as each Tablet was
getting close to 1G.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-online-tablets.png&quot; alt=&quot;Online Tablets&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The following plot shows ingest rate over time. The rate goes down as the
number of Tablets per Tablet Server goes up, which is expected.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-ingest-rate.png&quot; alt=&quot;Ingest Rate&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The following plot shows the number of key/values in Accumulo during
the test. When ingest was stopped, there were 266 billion key values in the
continuous ingest table.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/ci-entries.png&quot; alt=&quot;Table Entries&quot; /&gt;&lt;/p&gt;
&lt;h3 id=&quot;full-table-compaction&quot;&gt;Full table compaction&lt;/h3&gt;
&lt;p&gt;After stopping ingest and letting things settle, a full table compaction was
kicked off. Since all of these compactions would be over 128M, all of them were
scheduled on the external queue DCQ1. The two plots below show compactions
running and queued for the ~2 hours it took to do the compaction. When the
compaction was initiated there were 10 Compactors running in pods. All 11K
Tablets were queued for compaction and because the pods were always running
high CPU Kubernetes kept adding pods until the max was reached resulting in 660
Compactors running until all the work was done.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/full-table-compaction-queued.png&quot; alt=&quot;Full Table Compactions Queued&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/images/blog/202107_ecomp/full-table-compaction-running.png&quot; alt=&quot;Full Table Compactions Running&quot; /&gt;&lt;/p&gt;
&lt;h3 id=&quot;verification&quot;&gt;Verification&lt;/h3&gt;
&lt;p&gt;After running everything mentioned above, the continuous ingest verification
map reduce job was run. This job looks for holes in the linked list produced
by continuous ingest which indicate Accumulo lost data. No holes were found.
The counts below were emitted by the job. If there were holes, a non-zero
UNDEFINED count would be present.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; org.apache.accumulo.testing.continuous.ContinuousVerify$Counts
REFERENCED=266225036149
UNREFERENCED=22010637
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;h2 id=&quot;hurdles&quot;&gt;Hurdles&lt;/h2&gt;
&lt;h3 id=&quot;how-to-scale-up&quot;&gt;How to Scale Up&lt;/h3&gt;
&lt;p&gt;We ran into several issues running the Compactors in Kubernetes. First, we knew
that we could use Kubernetes &lt;a href=&quot;https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/&quot;&gt;Horizontal Pod Autoscaler&lt;/a&gt; (HPA) to scale the
Compactors up and down based on load. But the question remained how to do that.
Probably the best metric to use for scaling the Compactors is the size of the
external compaction queue. Another possible solution is to take the DataNode
utilization into account somehow. We found that in scaling up the Compactors
based on their CPU usage, we could overload DataNodes. Once DataNodes were
overwhelmed, the Compactors’ CPU would drop and the number of pods would naturally
scale down.&lt;/p&gt;
&lt;p&gt;To use custom metrics, you would need to get the metrics from Accumulo into a
metrics store that has a &lt;a href=&quot;https://github.com/kubernetes/metrics/blob/master/IMPLEMENTATIONS.md#custom-metrics-api&quot;&gt;metrics adapter&lt;/a&gt;. One possible solution, available
in Hadoop 3.3.0, is to use Prometheus, the &lt;a href=&quot;https://github.com/kubernetes-sigs/prometheus-adapter&quot;&gt;Prometheus Adapter&lt;/a&gt;, and enable
the Hadoop PrometheusMetricsSink added in
&lt;a href=&quot;https://issues.apache.org/jira/browse/HADOOP-16398&quot;&gt;HADOOP-16398&lt;/a&gt; to expose the custom queue
size metrics. This seemed like the right solution, but it also seemed like a
lot of work that was outside the scope of this blog post. Ultimately we decided
to take the simplest approach: use the native Kubernetes metrics-server and
scale off CPU usage of the Compactors. As you can see in the “Compactions Queued”
and “Compactions Running” graphs above from the full table compaction, it took about
45 minutes for Kubernetes to scale up Compactors to the maximum configured (660). Compactors
likely would have been scaled up much faster if scaling was done off the queued compactions
instead of CPU usage.&lt;/p&gt;
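&lt;p&gt;For reference, scaling off queue depth instead of CPU might look something like the untested sketch below. The metric name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accumulo_compaction_queued&lt;/code&gt; is hypothetical and assumes the Prometheus Adapter is exposing the queue size metric to the Kubernetes metrics APIs:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: accumulo-compactor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: accumulo-compactor
  minReplicas: 10
  maxReplicas: 660
  metrics:
  - type: External
    external:
      metric:
        name: accumulo_compaction_queued
      target:
        type: Value
        value: &quot;10&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;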
&lt;h3 id=&quot;gracefully-scaling-down&quot;&gt;Gracefully Scaling Down&lt;/h3&gt;
&lt;p&gt;The Kubernetes Pod &lt;a href=&quot;https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination&quot;&gt;termination process&lt;/a&gt; provides a mechanism for the user to
define a pre-stop hook that will be called before the Pod is terminated.
Without this hook Kubernetes sends a SIGTERM to the Pod, followed by a
user-defined grace period, then a SIGKILL. For the purposes of this test we did
not define a pre-stop hook or a grace period. It’s likely possible to handle
this situation more gracefully, but for this test our Compactors were killed
and the compaction work lost when the HPA decided to scale down the Compactors.
It was a good test of how we handled failed Compactors. Investigation is
needed to determine if changes are needed in Accumulo to facilitate graceful
scale down.&lt;/p&gt;
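&lt;p&gt;A starting point for a more graceful scale down could be a pre-stop hook combined with a longer grace period, giving a Compactor a chance to finish or abandon its current compaction before the SIGKILL arrives. The fragment below is an untested sketch of what might be added to the deployment spec; the wait script is hypothetical:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;spec:
  terminationGracePeriodSeconds: 600
  containers:
  - name: accumulo-compactor
    lifecycle:
      preStop:
        exec:
          command: [&quot;/bin/sh&quot;, &quot;-c&quot;, &quot;/opt/wait-for-compaction.sh&quot;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;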
&lt;h3 id=&quot;how-to-connect&quot;&gt;How to Connect&lt;/h3&gt;
&lt;p&gt;The other major issue we ran into was connectivity between the Compactors and
the other server processes. The Compactor communicates with ZooKeeper and the
Compaction Coordinator, both of which were running outside of Kubernetes. There
is no common DNS between the Muchos and Kubernetes clusters, but the IPs were
visible to both. The Compactor connects to ZooKeeper to find the address of the
Compaction Coordinator so that it can connect to it and look for work. By
default the Accumulo server processes use the hostname as their address, which
would not work because those names would not resolve inside the Kubernetes cluster.
We had to start the Accumulo processes using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-a&lt;/code&gt; argument and set the
hostname to the IP address. Solving connectivity issues between components
running in Kubernetes and components external to Kubernetes depends on the capabilities
available in the environment and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-a&lt;/code&gt; option may be part of the solution.&lt;/p&gt;
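&lt;p&gt;For example, a Compactor can be started with an IP address instead of a hostname (the address below is illustrative):&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bin/accumulo compactor -q DCQ1 -a 10.2.0.139
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;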
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this blog post we introduced the concept and benefits of external
compactions, the new server processes and how to configure the compaction
service. We deployed a 23-node Accumulo cluster using Muchos along with a
variable-sized Kubernetes cluster that dynamically scaled Compactors from 10 to
660 instances across 3 to 100 compute nodes. We ran continuous ingest on the Accumulo
cluster to create compactions that were run both internal and external to the
Tablet Server and demonstrated external compactions completing successfully and
Compactors being killed.&lt;/p&gt;
&lt;p&gt;We also discussed running the following tests, but did not have time.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Agitating the Compaction Coordinator, Tablet Servers and Compactors while ingest was running.&lt;/li&gt;
&lt;li&gt;Comparing the impact on queries for internal vs external compactions.&lt;/li&gt;
&lt;li&gt;Having multiple external compaction queues, each with its own set of autoscaled Compactor pods.&lt;/li&gt;
&lt;li&gt;Forcing full table compactions while ingest was running.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The tests we ran show that the basic functionality works well; it would be
nice to stress the feature in other ways, though.&lt;/p&gt;
</description>
<pubDate>Thu, 08 Jul 2021 00:00:00 +0000</pubDate>
<link>https://accumulo.apache.org/blog/2021/07/08/external-compactions.html</link>
<guid isPermaLink="true">https://accumulo.apache.org/blog/2021/07/08/external-compactions.html</guid>
<category>blog</category>
</item>
</channel>
</rss>