| |
| |
| <!DOCTYPE html> |
| <!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> |
| <!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> |
| <head> |
| <meta charset="utf-8"> |
| |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| |
| <title>Compaction — Apache Cassandra Documentation v4.1</title> |
| |
| |
| |
| |
| |
| |
| |
| |
| <script type="text/javascript" src="../../_static/js/modernizr.min.js"></script> |
| |
| |
| <script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script> |
| <script type="text/javascript" src="../../_static/jquery.js"></script> |
| <script type="text/javascript" src="../../_static/underscore.js"></script> |
| <script type="text/javascript" src="../../_static/doctools.js"></script> |
| <script type="text/javascript" src="../../_static/language_data.js"></script> |
| <script async="async" type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-AMS-MML_HTMLorMML"></script> |
| |
| <script type="text/javascript" src="../../_static/js/theme.js"></script> |
| |
| |
| |
| |
| <link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" /> |
| <link rel="stylesheet" href="../../_static/extra.css" type="text/css" /> |
| <link rel="index" title="Index" href="../../genindex.html" /> |
| <link rel="search" title="Search" href="../../search.html" /> |
| <link rel="next" title="Bloom Filters" href="../bloom_filters.html" /> |
| <link rel="prev" title="Hints" href="../hints.html" /> |
| </head> |
| |
| <body class="wy-body-for-nav"> |
| |
| |
| <div class="wy-grid-for-nav"> |
| |
| <nav data-toggle="wy-nav-shift" class="wy-nav-side"> |
| <div class="wy-side-scroll"> |
| <div class="wy-side-nav-search" > |
| |
| |
| |
| <a href="../../index.html" class="icon icon-home"> Apache Cassandra |
| |
| |
| |
| </a> |
| |
| |
| |
| |
| <div class="version"> |
| 4.1 |
| </div> |
| |
| |
| |
| |
| <div role="search"> |
| <form id="rtd-search-form" class="wy-form" action="../../search.html" method="get"> |
| <input type="text" name="q" placeholder="Search docs" /> |
| <input type="hidden" name="check_keywords" value="yes" /> |
| <input type="hidden" name="area" value="default" /> |
| </form> |
| </div> |
| |
| |
| </div> |
| |
| <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation"> |
| |
| |
| |
| |
| |
| |
| <ul class="current"> |
| <li class="toctree-l1"><a class="reference internal" href="../../getting_started/index.html">Getting Started</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../new/index.html">New Features in Apache Cassandra 4.0</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../architecture/index.html">Architecture</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../cql/index.html">The Cassandra Query Language (CQL)</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../data_modeling/index.html">Data Modeling</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../configuration/index.html">Configuring Cassandra</a></li> |
| <li class="toctree-l1 current"><a class="reference internal" href="../index.html">Operating Cassandra</a><ul class="current"> |
| <li class="toctree-l2"><a class="reference internal" href="../snitch.html">Snitch</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../topo_changes.html">Adding, replacing, moving and removing nodes</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../repair.html">Repair</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../read_repair.html">Read repair</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../hints.html">Hints</a></li> |
| <li class="toctree-l2 current"><a class="current reference internal" href="#">Compaction</a><ul> |
| <li class="toctree-l3"><a class="reference internal" href="#strategies">Strategies</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#types-of-compaction">Types of compaction</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#when-is-a-minor-compaction-triggered">When is a minor compaction triggered?</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#merging-sstables">Merging sstables</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#tombstones-and-garbage-collection-gc-grace">Tombstones and Garbage Collection (GC) Grace</a><ul> |
| <li class="toctree-l4"><a class="reference internal" href="#why-tombstones">Why Tombstones</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#deletes-without-tombstones">Deletes without tombstones</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#deletes-with-tombstones">Deletes with Tombstones</a></li> |
| <li class="toctree-l4"><a class="reference internal" href="#the-gc-grace-seconds-parameter-and-tombstone-removal">The gc_grace_seconds parameter and Tombstone Removal</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l3"><a class="reference internal" href="#ttl">TTL</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#fully-expired-sstables">Fully expired sstables</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#repaired-unrepaired-data">Repaired/unrepaired data</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#data-directories">Data directories</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#single-sstable-tombstone-compaction">Single sstable tombstone compaction</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#common-options">Common options</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#compaction-nodetool-commands">Compaction nodetool commands</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#switching-the-compaction-strategy-and-options-using-jmx">Switching the compaction strategy and options using JMX</a></li> |
| <li class="toctree-l3"><a class="reference internal" href="#more-detailed-compaction-logging">More detailed compaction logging</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l2"><a class="reference internal" href="../bloom_filters.html">Bloom Filters</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../compression.html">Compression</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../cdc.html">Change Data Capture</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../backups.html">Backups</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../bulk_loading.html">Bulk Loading</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../metrics.html">Monitoring</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../security.html">Security</a></li> |
| <li class="toctree-l2"><a class="reference internal" href="../hardware.html">Hardware Choices</a></li> |
| </ul> |
| </li> |
| <li class="toctree-l1"><a class="reference internal" href="../../tools/index.html">Cassandra Tools</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../troubleshooting/index.html">Troubleshooting</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../development/index.html">Contributing to Cassandra</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../faq/index.html">Frequently Asked Questions</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../plugins/index.html">Third-Party Plugins</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../bugs.html">Reporting Bugs</a></li> |
| <li class="toctree-l1"><a class="reference internal" href="../../contactus.html">Contact us</a></li> |
| </ul> |
| |
| |
| |
| </div> |
| </div> |
| </nav> |
| |
| <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"> |
| |
| |
| <nav class="wy-nav-top" aria-label="top navigation"> |
| |
| <i data-toggle="wy-nav-top" class="fa fa-bars"></i> |
| <a href="../../index.html">Apache Cassandra</a> |
| |
| </nav> |
| |
| |
| <div class="wy-nav-content"> |
| |
| <div class="rst-content"> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| <div role="navigation" aria-label="breadcrumbs navigation"> |
| |
| <ul class="wy-breadcrumbs"> |
| |
| <li><a href="../../index.html">Docs</a> »</li> |
| |
| <li><a href="../index.html">Operating Cassandra</a> »</li> |
| |
| <li>Compaction</li> |
| |
| |
| <li class="wy-breadcrumbs-aside"> |
| |
| |
| <a href="../../_sources/operating/compaction/index.rst.txt" rel="nofollow"> View page source</a> |
| |
| |
| </li> |
| |
| </ul> |
| |
| |
| <hr/> |
| </div> |
| <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> |
| <div itemprop="articleBody"> |
| |
| <div class="section" id="compaction"> |
| <span id="id1"></span><h1>Compaction<a class="headerlink" href="#compaction" title="Permalink to this headline">¶</a></h1> |
| <div class="section" id="strategies"> |
| <h2>Strategies<a class="headerlink" href="#strategies" title="Permalink to this headline">¶</a></h2> |
| <p>Picking the right compaction strategy for your workload will ensure the best performance for both querying and for compaction itself.</p> |
| <dl class="simple"> |
| <dt><a class="reference internal" href="stcs.html#stcs"><span class="std std-ref">Size Tiered Compaction Strategy</span></a></dt><dd><p>The default compaction strategy. Useful as a fallback when other strategies don’t fit the workload. Most useful for |
| non pure time series workloads with spinning disks, or when the I/O from <a class="reference internal" href="lcs.html#lcs"><span class="std std-ref">LCS</span></a> is too high.</p> |
| </dd> |
| <dt><a class="reference internal" href="lcs.html#lcs"><span class="std std-ref">Leveled Compaction Strategy</span></a></dt><dd><p>Leveled Compaction Strategy (LCS) is optimized for read heavy workloads, or workloads with lots of updates and deletes. It is not a good choice for immutable time series data.</p> |
| </dd> |
| <dt><a class="reference internal" href="twcs.html#twcs"><span class="std std-ref">Time Window Compaction Strategy</span></a></dt><dd><p>Time Window Compaction Strategy is designed for TTL’ed, mostly immutable time series data.</p> |
| </dd> |
| </dl> |
| </div> |
| <div class="section" id="types-of-compaction"> |
| <h2>Types of compaction<a class="headerlink" href="#types-of-compaction" title="Permalink to this headline">¶</a></h2> |
| <p>The concept of compaction is used for different kinds of operations in Cassandra, the common thing about these |
| operations is that it takes one or more sstables and output new sstables. The types of compactions are;</p> |
| <dl class="simple"> |
| <dt>Minor compaction</dt><dd><p>triggered automatically in Cassandra.</p> |
| </dd> |
| <dt>Major compaction</dt><dd><p>a user executes a compaction over all sstables on the node.</p> |
| </dd> |
| <dt>User defined compaction</dt><dd><p>a user triggers a compaction on a given set of sstables.</p> |
| </dd> |
| <dt>Scrub</dt><dd><p>try to fix any broken sstables. This can actually remove valid data if that data is corrupted, if that happens you |
| will need to run a full repair on the node.</p> |
| </dd> |
| <dt>Upgradesstables</dt><dd><p>upgrade sstables to the latest version. Run this after upgrading to a new major version.</p> |
| </dd> |
| <dt>Cleanup</dt><dd><p>remove any ranges this node does not own anymore, typically triggered on neighbouring nodes after a node has been |
| bootstrapped since that node will take ownership of some ranges from those nodes.</p> |
| </dd> |
| <dt>Secondary index rebuild</dt><dd><p>rebuild the secondary indexes on the node.</p> |
| </dd> |
| <dt>Anticompaction</dt><dd><p>after repair the ranges that were actually repaired are split out of the sstables that existed when repair started.</p> |
| </dd> |
| <dt>Sub range compaction</dt><dd><p>It is possible to only compact a given sub range - this could be useful if you know a token that has been |
| misbehaving - either gathering many updates or many deletes. (<code class="docutils literal notranslate"><span class="pre">nodetool</span> <span class="pre">compact</span> <span class="pre">-st</span> <span class="pre">x</span> <span class="pre">-et</span> <span class="pre">y</span></code>) will pick |
| all sstables containing the range between x and y and issue a compaction for those sstables. For STCS this will |
| most likely include all sstables but with LCS it can issue the compaction for a subset of the sstables. With LCS |
| the resulting sstable will end up in L0.</p> |
| </dd> |
| </dl> |
| </div> |
| <div class="section" id="when-is-a-minor-compaction-triggered"> |
| <h2>When is a minor compaction triggered?<a class="headerlink" href="#when-is-a-minor-compaction-triggered" title="Permalink to this headline">¶</a></h2> |
| <p># When an sstable is added to the node through flushing/streaming etc. |
| # When autocompaction is enabled after being disabled (<code class="docutils literal notranslate"><span class="pre">nodetool</span> <span class="pre">enableautocompaction</span></code>) |
| # When compaction adds new sstables. |
| # A check for new minor compactions every 5 minutes.</p> |
| </div> |
| <div class="section" id="merging-sstables"> |
| <h2>Merging sstables<a class="headerlink" href="#merging-sstables" title="Permalink to this headline">¶</a></h2> |
| <p>Compaction is about merging sstables, since partitions in sstables are sorted based on the hash of the partition key it |
| is possible to efficiently merge separate sstables. Content of each partition is also sorted so each partition can be |
| merged efficiently.</p> |
| </div> |
| <div class="section" id="tombstones-and-garbage-collection-gc-grace"> |
| <h2>Tombstones and Garbage Collection (GC) Grace<a class="headerlink" href="#tombstones-and-garbage-collection-gc-grace" title="Permalink to this headline">¶</a></h2> |
| <div class="section" id="why-tombstones"> |
| <h3>Why Tombstones<a class="headerlink" href="#why-tombstones" title="Permalink to this headline">¶</a></h3> |
| <p>When a delete request is received by Cassandra it does not actually remove the data from the underlying store. Instead |
| it writes a special piece of data known as a tombstone. The Tombstone represents the delete and causes all values which |
| occurred before the tombstone to not appear in queries to the database. This approach is used instead of removing values |
| because of the distributed nature of Cassandra.</p> |
| </div> |
| <div class="section" id="deletes-without-tombstones"> |
| <h3>Deletes without tombstones<a class="headerlink" href="#deletes-without-tombstones" title="Permalink to this headline">¶</a></h3> |
| <p>Imagine a three node cluster which has the value [A] replicated to every node.:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>[A], [A], [A] |
| </pre></div> |
| </div> |
| <p>If one of the nodes fails and and our delete operation only removes existing values we can end up with a cluster that |
| looks like:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>[], [], [A] |
| </pre></div> |
| </div> |
| <p>Then a repair operation would replace the value of [A] back onto the two |
| nodes which are missing the value.:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>[A], [A], [A] |
| </pre></div> |
| </div> |
| <p>This would cause our data to be resurrected even though it had been |
| deleted.</p> |
| </div> |
| <div class="section" id="deletes-with-tombstones"> |
| <h3>Deletes with Tombstones<a class="headerlink" href="#deletes-with-tombstones" title="Permalink to this headline">¶</a></h3> |
| <p>Starting again with a three node cluster which has the value [A] replicated to every node.:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>[A], [A], [A] |
| </pre></div> |
| </div> |
| <p>If instead of removing data we add a tombstone record, our single node failure situation will look like this.:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>[A, Tombstone[A]], [A, Tombstone[A]], [A] |
| </pre></div> |
| </div> |
| <p>Now when we issue a repair the Tombstone will be copied to the replica, rather than the deleted data being |
| resurrected.:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>[A, Tombstone[A]], [A, Tombstone[A]], [A, Tombstone[A]] |
| </pre></div> |
| </div> |
| <p>Our repair operation will correctly put the state of the system to what we expect with the record [A] marked as deleted |
| on all nodes. This does mean we will end up accruing Tombstones which will permanently accumulate disk space. To avoid |
| keeping tombstones forever we have a parameter known as <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code> for every table in Cassandra.</p> |
| </div> |
| <div class="section" id="the-gc-grace-seconds-parameter-and-tombstone-removal"> |
| <h3>The gc_grace_seconds parameter and Tombstone Removal<a class="headerlink" href="#the-gc-grace-seconds-parameter-and-tombstone-removal" title="Permalink to this headline">¶</a></h3> |
| <p>The table level <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code> parameter controls how long Cassandra will retain tombstones through compaction |
| events before finally removing them. This duration should directly reflect the amount of time a user expects to allow |
| before recovering a failed node. After <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code> has expired the tombstone may be removed (meaning there will |
| no longer be any record that a certain piece of data was deleted), but as a tombstone can live in one sstable and the |
| data it covers in another, a compaction must also include both sstable for a tombstone to be removed. More precisely, to |
| be able to drop an actual tombstone the following needs to be true;</p> |
| <ul class="simple"> |
| <li><p>The tombstone must be older than <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code></p></li> |
| <li><p>If partition X contains the tombstone, the sstable containing the partition plus all sstables containing data older |
| than the tombstone containing X must be included in the same compaction. We don’t need to care if the partition is in |
| an sstable if we can guarantee that all data in that sstable is newer than the tombstone. If the tombstone is older |
| than the data it cannot shadow that data.</p></li> |
| <li><p>If the option <code class="docutils literal notranslate"><span class="pre">only_purge_repaired_tombstones</span></code> is enabled, tombstones are only removed if the data has also been |
| repaired.</p></li> |
| </ul> |
| <p>If a node remains down or disconnected for longer than <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code> it’s deleted data will be repaired back to |
| the other nodes and re-appear in the cluster. This is basically the same as in the “Deletes without Tombstones” section. |
| Note that tombstones will not be removed until a compaction event even if <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code> has elapsed.</p> |
| <p>The default value for <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code> is 864000 which is equivalent to 10 days. This can be set when creating or |
| altering a table using <code class="docutils literal notranslate"><span class="pre">WITH</span> <span class="pre">gc_grace_seconds</span></code>.</p> |
| </div> |
| </div> |
| <div class="section" id="ttl"> |
| <h2>TTL<a class="headerlink" href="#ttl" title="Permalink to this headline">¶</a></h2> |
| <p>Data in Cassandra can have an additional property called time to live - this is used to automatically drop data that has |
| expired once the time is reached. Once the TTL has expired the data is converted to a tombstone which stays around for |
| at least <code class="docutils literal notranslate"><span class="pre">gc_grace_seconds</span></code>. Note that if you mix data with TTL and data without TTL (or just different length of the |
| TTL) Cassandra will have a hard time dropping the tombstones created since the partition might span many sstables and |
| not all are compacted at once.</p> |
| </div> |
| <div class="section" id="fully-expired-sstables"> |
| <h2>Fully expired sstables<a class="headerlink" href="#fully-expired-sstables" title="Permalink to this headline">¶</a></h2> |
| <p>If an sstable contains only tombstones and it is guaranteed that that sstable is not shadowing data in any other sstable |
| compaction can drop that sstable. If you see sstables with only tombstones (note that TTL:ed data is considered |
| tombstones once the time to live has expired) but it is not being dropped by compaction, it is likely that other |
| sstables contain older data. There is a tool called <code class="docutils literal notranslate"><span class="pre">sstableexpiredblockers</span></code> that will list which sstables are |
| droppable and which are blocking them from being dropped. This is especially useful for time series compaction with |
| <code class="docutils literal notranslate"><span class="pre">TimeWindowCompactionStrategy</span></code> (and the deprecated <code class="docutils literal notranslate"><span class="pre">DateTieredCompactionStrategy</span></code>). With <code class="docutils literal notranslate"><span class="pre">TimeWindowCompactionStrategy</span></code> |
| it is possible to remove the guarantee (not check for shadowing data) by enabling <code class="docutils literal notranslate"><span class="pre">unsafe_aggressive_sstable_expiration</span></code>.</p> |
| </div> |
| <div class="section" id="repaired-unrepaired-data"> |
| <h2>Repaired/unrepaired data<a class="headerlink" href="#repaired-unrepaired-data" title="Permalink to this headline">¶</a></h2> |
| <p>With incremental repairs Cassandra must keep track of what data is repaired and what data is unrepaired. With |
| anticompaction repaired data is split out into repaired and unrepaired sstables. To avoid mixing up the data again |
| separate compaction strategy instances are run on the two sets of data, each instance only knowing about either the |
| repaired or the unrepaired sstables. This means that if you only run incremental repair once and then never again, you |
| might have very old data in the repaired sstables that block compaction from dropping tombstones in the unrepaired |
| (probably newer) sstables.</p> |
| </div> |
| <div class="section" id="data-directories"> |
| <h2>Data directories<a class="headerlink" href="#data-directories" title="Permalink to this headline">¶</a></h2> |
| <p>Since tombstones and data can live in different sstables it is important to realize that losing an sstable might lead to |
| data becoming live again - the most common way of losing sstables is to have a hard drive break down. To avoid making |
| data live tombstones and actual data are always in the same data directory. This way, if a disk is lost, all versions of |
| a partition are lost and no data can get undeleted. To achieve this a compaction strategy instance per data directory is |
| run in addition to the compaction strategy instances containing repaired/unrepaired data, this means that if you have 4 |
| data directories there will be 8 compaction strategy instances running. This has a few more benefits than just avoiding |
| data getting undeleted:</p> |
| <ul class="simple"> |
| <li><p>It is possible to run more compactions in parallel - leveled compaction will have several totally separate levelings |
| and each one can run compactions independently from the others.</p></li> |
| <li><p>Users can backup and restore a single data directory.</p></li> |
| <li><p>Note though that currently all data directories are considered equal, so if you have a tiny disk and a big disk |
| backing two data directories, the big one will be limited the by the small one. One work around to this is to create |
| more data directories backed by the big disk.</p></li> |
| </ul> |
| </div> |
| <div class="section" id="single-sstable-tombstone-compaction"> |
| <h2>Single sstable tombstone compaction<a class="headerlink" href="#single-sstable-tombstone-compaction" title="Permalink to this headline">¶</a></h2> |
| <p>When an sstable is written a histogram with the tombstone expiry times is created and this is used to try to find |
| sstables with very many tombstones and run single sstable compaction on that sstable in hope of being able to drop |
| tombstones in that sstable. Before starting this it is also checked how likely it is that any tombstones will actually |
| will be able to be dropped how much this sstable overlaps with other sstables. To avoid most of these checks the |
| compaction option <code class="docutils literal notranslate"><span class="pre">unchecked_tombstone_compaction</span></code> can be enabled.</p> |
| </div> |
| <div class="section" id="common-options"> |
| <span id="compaction-options"></span><h2>Common options<a class="headerlink" href="#common-options" title="Permalink to this headline">¶</a></h2> |
| <p>There is a number of common options for all the compaction strategies;</p> |
| <dl class="simple"> |
| <dt><code class="docutils literal notranslate"><span class="pre">enabled</span></code> (default: true)</dt><dd><p>Whether minor compactions should run. Note that you can have ‘enabled’: true as a compaction option and then do |
| ‘nodetool enableautocompaction’ to start running compactions.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">tombstone_threshold</span></code> (default: 0.2)</dt><dd><p>How much of the sstable should be tombstones for us to consider doing a single sstable compaction of that sstable.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">tombstone_compaction_interval</span></code> (default: 86400s (1 day))</dt><dd><p>Since it might not be possible to drop any tombstones when doing a single sstable compaction we need to make sure |
| that one sstable is not constantly getting recompacted - this option states how often we should try for a given |
| sstable.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">log_all</span></code> (default: false)</dt><dd><p>New detailed compaction logging, see <a class="reference internal" href="#detailed-compaction-logging"><span class="std std-ref">below</span></a>.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">unchecked_tombstone_compaction</span></code> (default: false)</dt><dd><p>The single sstable compaction has quite strict checks for whether it should be started, this option disables those |
| checks and for some usecases this might be needed. Note that this does not change anything for the actual |
| compaction, tombstones are only dropped if it is safe to do so - it might just rewrite an sstable without being able |
| to drop any tombstones.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">only_purge_repaired_tombstone</span></code> (default: false)</dt><dd><p>Option to enable the extra safety of making sure that tombstones are only dropped if the data has been repaired.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">min_threshold</span></code> (default: 4)</dt><dd><p>Lower limit of number of sstables before a compaction is triggered. Not used for <code class="docutils literal notranslate"><span class="pre">LeveledCompactionStrategy</span></code>.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">max_threshold</span></code> (default: 32)</dt><dd><p>Upper limit of number of sstables before a compaction is triggered. Not used for <code class="docutils literal notranslate"><span class="pre">LeveledCompactionStrategy</span></code>.</p> |
| </dd> |
| </dl> |
| <p>Further, see the section on each strategy for specific additional options.</p> |
| </div> |
| <div class="section" id="compaction-nodetool-commands"> |
| <h2>Compaction nodetool commands<a class="headerlink" href="#compaction-nodetool-commands" title="Permalink to this headline">¶</a></h2> |
| <p>The <span class="xref std std-ref">nodetool</span> utility provides a number of commands related to compaction:</p> |
| <dl class="simple"> |
| <dt><code class="docutils literal notranslate"><span class="pre">enableautocompaction</span></code></dt><dd><p>Enable compaction.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">disableautocompaction</span></code></dt><dd><p>Disable compaction.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">setcompactionthroughput</span></code></dt><dd><p>How fast compaction should run at most - defaults to 16MB/s, but note that it is likely not possible to reach this |
| throughput.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">compactionstats</span></code></dt><dd><p>Statistics about current and pending compactions.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">compactionhistory</span></code></dt><dd><p>List details about the last compactions.</p> |
| </dd> |
| <dt><code class="docutils literal notranslate"><span class="pre">setcompactionthreshold</span></code></dt><dd><p>Set the min/max sstable count for when to trigger compaction, defaults to 4/32.</p> |
| </dd> |
| </dl> |
| </div> |
| <div class="section" id="switching-the-compaction-strategy-and-options-using-jmx"> |
| <h2>Switching the compaction strategy and options using JMX<a class="headerlink" href="#switching-the-compaction-strategy-and-options-using-jmx" title="Permalink to this headline">¶</a></h2> |
| <p>It is possible to switch compaction strategies and its options on just a single node using JMX, this is a great way to |
| experiment with settings without affecting the whole cluster. The mbean is:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>org.apache.cassandra.db:type=ColumnFamilies,keyspace=<keyspace_name>,columnfamily=<table_name> |
| </pre></div> |
| </div> |
| <p>and the attribute to change is <code class="docutils literal notranslate"><span class="pre">CompactionParameters</span></code> or <code class="docutils literal notranslate"><span class="pre">CompactionParametersJson</span></code> if you use jconsole or jmc. The |
| syntax for the json version is the same as you would use in an <a class="reference internal" href="../../cql/ddl.html#alter-table-statement"><span class="std std-ref">ALTER TABLE</span></a> statement - |
| for example:</p> |
| <div class="highlight-none notranslate"><div class="highlight"><pre><span></span>{ 'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 123, 'fanout_size': 10} |
| </pre></div> |
| </div> |
| <p>The setting is kept until someone executes an <a class="reference internal" href="../../cql/ddl.html#alter-table-statement"><span class="std std-ref">ALTER TABLE</span></a> that touches the compaction |
| settings or restarts the node.</p> |
| </div> |
| <div class="section" id="more-detailed-compaction-logging"> |
| <span id="detailed-compaction-logging"></span><h2>More detailed compaction logging<a class="headerlink" href="#more-detailed-compaction-logging" title="Permalink to this headline">¶</a></h2> |
| <p>Enable with the compaction option <code class="docutils literal notranslate"><span class="pre">log_all</span></code> and a more detailed compaction log file will be produced in your log |
| directory.</p> |
| </div> |
| </div> |
| |
| |
| </div> |
| |
| </div> |
| <footer> |
| |
| <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation"> |
| |
| <a href="../bloom_filters.html" class="btn btn-neutral float-right" title="Bloom Filters" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a> |
| |
| |
| <a href="../hints.html" class="btn btn-neutral float-left" title="Hints" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left"></span> Previous</a> |
| |
| </div> |
| |
| |
| <hr/> |
| |
| <div role="contentinfo"> |
| <p> |
| © Copyright 2020, The Apache Cassandra team |
| |
| </p> |
| </div> |
| Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/rtfd/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>. |
| |
| </footer> |
| |
| </div> |
| </div> |
| |
| </section> |
| |
| </div> |
| |
| |
| |
| <script type="text/javascript"> |
| jQuery(function () { |
| SphinxRtdTheme.Navigation.enable(true); |
| }); |
| </script> |
| |
| |
| |
| |
| |
| |
| </body> |
| </html> |