| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| <meta charset="utf-8" /> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1" /> |
| <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> |
| <meta name="description" content="A new open source Apache Hadoop ecosystem project, Apache Kudu completes Hadoop's storage layer to enable fast analytics on fast data" /> |
| <meta name="author" content="Cloudera" /> |
| <title>Apache Kudu - Apache Kudu Background Maintenance Tasks</title> |
| <!-- Bootstrap core CSS --> |
| <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" |
| integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" |
| crossorigin="anonymous"> |
| |
| <!-- Custom styles for this template --> |
| <link href="/css/kudu.css" rel="stylesheet"/> |
| <link href="/css/asciidoc.css" rel="stylesheet"/> |
| <link rel="shortcut icon" href="/img/logo-favicon.ico" /> |
| <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.1/css/font-awesome.min.css" /> |
| |
| |
| |
| <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> |
| <!--[if lt IE 9]> |
| <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> |
| <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> |
| <![endif]--> |
| </head> |
| <body> |
| <div class="kudu-site container-fluid"> |
| <!-- Static navbar --> |
| <nav class="navbar navbar-default"> |
| <div class="container-fluid"> |
| <div class="navbar-header"> |
| <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false" aria-controls="navbar"> |
| <span class="sr-only">Toggle navigation</span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| <span class="icon-bar"></span> |
| </button> |
| |
| <a class="logo" href="/"><img |
| src="//d3dr9sfxru4sde.cloudfront.net/i/k/apachekudu_logo_0716_80px.png" |
| srcset="//d3dr9sfxru4sde.cloudfront.net/i/k/apachekudu_logo_0716_80px.png 1x, //d3dr9sfxru4sde.cloudfront.net/i/k/apachekudu_logo_0716_160px.png 2x" |
| alt="Apache Kudu"/></a> |
| |
| </div> |
| <div id="navbar" class="collapse navbar-collapse"> |
| <ul class="nav navbar-nav navbar-right"> |
| <li > |
| <a href="/">Home</a> |
| </li> |
| <li > |
| <a href="/overview.html">Overview</a> |
| </li> |
| <li class="active"> |
| <a href="/docs/">Documentation</a> |
| </li> |
| <li > |
| <a href="/releases/">Download</a> |
| </li> |
| <li > |
| <a href="/blog/">Blog</a> |
| </li> |
| <!-- NOTE: this dropdown menu does not appear on Mobile, so don't add anything here |
| that doesn't also appear elsewhere on the site. --> |
| <li class="dropdown"> |
| <a href="/community.html" role="button" aria-haspopup="true" aria-expanded="false">Community <span class="caret"></span></a> |
| <ul class="dropdown-menu"> |
| <li class="dropdown-header">GET IN TOUCH</li> |
| <li><a class="icon email" href="/community.html">Mailing Lists</a></li> |
| <li><a class="icon slack" href="https://getkudu-slack.herokuapp.com/">Slack Channel</a></li> |
| <li role="separator" class="divider"></li> |
| <li><a href="/community.html#meetups-user-groups-and-conference-presentations">Events and Meetups</a></li> |
| <li><a href="/committers.html">Project Committers</a></li> |
| <!--<li><a href="/roadmap.html">Roadmap</a></li>--> |
| <li><a href="/community.html#contributions">How to Contribute</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header">DEVELOPER RESOURCES</li> |
| <li><a class="icon github" href="https://github.com/apache/incubator-kudu">GitHub</a></li> |
| <li><a class="icon gerrit" href="http://gerrit.cloudera.org:8080/#/q/status:open+project:kudu">Gerrit Code Review</a></li> |
| <li><a class="icon jira" href="https://issues.apache.org/jira/browse/KUDU">JIRA Issue Tracker</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header">SOCIAL MEDIA</li> |
| <li><a class="icon twitter" href="https://twitter.com/ApacheKudu">Twitter</a></li> |
| <li><a href="https://www.reddit.com/r/kudu/">Reddit</a></li> |
| <li role="separator" class="divider"></li> |
| <li class="dropdown-header">APACHE SOFTWARE FOUNDATION</li> |
| <li><a href="https://www.apache.org/security/" target="_blank">Security</a></li> |
| <li><a href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Sponsorship</a></li> |
| <li><a href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a></li> |
| <li><a href="https://www.apache.org/licenses/" target="_blank">License</a></li> |
| </ul> |
| </li> |
| <li > |
| <a href="/faq.html">FAQ</a> |
| </li> |
| </ul><!-- /.nav --> |
| </div><!-- /#navbar --> |
| </div><!-- /.container-fluid --> |
| </nav> |
| |
| <!-- |
| |
| Licensed under the Apache License, Version 2.0 (the "License"); |
| you may not use this file except in compliance with the License. |
| You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| |
| <div class="container"> |
| <div class="row"> |
| <div class="col-md-9"> |
| |
| <h1>Apache Kudu Background Maintenance Tasks</h1> |
| <div id="preamble"> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>Kudu relies on running background tasks for many important automatic |
| maintenance activities. These tasks include flushing data from memory to disk, |
| compacting data to improve performance, freeing up disk space, and more.</p> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_maintenance_manager"><a class="link" href="#_maintenance_manager">Maintenance manager</a></h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>The maintenance manager schedules and runs background tasks. At any given point |
| in time, the maintenance manager is prioritizing the next task based on the |
| improvement needed at that moment, such as relieving memory pressure, improving |
| read performance, or freeing up disk space. The number of worker threads |
| dedicated to running background tasks can be controlled by setting |
| <code>--maintenance_manager_num_threads</code>.</p> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_flushing_data_to_disk"><a class="link" href="#_flushing_data_to_disk">Flushing data to disk</a></h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>Flushing data from memory to disk relieves memory pressure and can improve read |
| performance by switching from a write-optimized, row-oriented in-memory format |
| in the <code>MemRowSet</code> to a read-optimized, column-oriented format on disk. |
| Background tasks that flush data include <code>FlushMRSOp</code> and |
| <code>FlushDeltaMemStoresOp</code>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The metrics associated with these ops have the prefix <code>flush_mrs</code> and |
| <code>flush_dms</code>, respectively.</p> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_compacting_on_disk_data"><a class="link" href="#_compacting_on_disk_data">Compacting on-disk data</a></h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>Kudu constantly performs several types of compaction tasks in order to maintain |
| consistent read and write performance over time. A merging compaction, which combines |
| multiple <code>DiskRowSets</code> together into a single <code>DiskRowSet</code>, is run by |
| <code>CompactRowSetsOp</code>. There are two types of delta store compaction operations |
| that may be run as well: <code>MinorDeltaCompactionOp</code> and <code>MajorDeltaCompactionOp</code>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>For more information on what these different types of compaction operations do, |
| please see the |
| <a href="https://github.com/apache/kudu/blob/master/docs/design-docs/tablet.md">Kudu Tablet |
| design document</a>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The metrics associated with these tasks have the prefix <code>compact_rs</code>, |
| <code>delta_minor_compact_rs</code>, and <code>delta_major_compact_rs</code>, respectively.</p> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_write_ahead_log_gc"><a class="link" href="#_write_ahead_log_gc">Write-ahead log GC</a></h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>Kudu maintains a write-ahead log (WAL) per tablet that is split into discrete |
| fixed-size segments. A tablet periodically rolls the WAL to a new log segment |
| when the active segment reaches a configured size (controlled by |
| <code>--log_segment_size_mb</code>). In order to save disk space and decrease startup |
| time, a background task called <code>LogGCOp</code> attempts to garbage-collect (GC) old |
| WAL segments by deleting them from disk once it is determined that they are no |
| longer needed by the local node for durability.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The metrics associated with this background task have the prefix <code>log_gc</code>.</p> |
| </div> |
| </div> |
| </div> |
| <div class="sect1"> |
| <h2 id="_tablet_history_gc_and_the_ancient_history_mark"><a class="link" href="#_tablet_history_gc_and_the_ancient_history_mark">Tablet history GC and the ancient history mark</a></h2> |
| <div class="sectionbody"> |
| <div class="paragraph"> |
| <p>Because Kudu uses a multiversion concurrency control (MVCC) mechanism to |
| ensure that snapshot scans can proceeed isolated from new changes to a table, |
| periodically old historical data should be garbage-collected (removed) to free |
| up disk space. While Kudu never removes rows or data that are visible in the |
| latest version of the data, Kudu does remove records of old changes that are no |
| longer visible.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The point in time in the past beyond which historical MVCC data becomes |
| inaccessible and is free to be deleted is called the <em>ancient history mark</em> |
| (AHM). The AHM can be configured by setting <code>--tablet_history_max_age_sec</code>.</p> |
| </div> |
| <div class="paragraph"> |
| <p>There are two background tasks that GC historical MVCC data older than the AHM: |
| the one that runs the merging compaction, called <code>CompactRowSetsOp</code> (see |
| above), and a separate background task that deletes old undo delta blocks, |
| called <code>UndoDeltaBlockGCOp</code>. Running <code>UndoDeltaBlockGCOp</code> reduces disk space |
| usage in all workloads, but particularly in those with a higher volume of |
| updates or upserts.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The metrics associated with this background task have the prefix |
| <code>undo_delta_block</code>.</p> |
| </div> |
| </div> |
| </div> |
| </div> |
| <div class="col-md-3"> |
| |
| <div id="toc" data-spy="affix" data-offset-top="70"> |
| <ul> |
| |
| <li> |
| |
| <a href="index.html">Introducing Kudu</a> |
| </li> |
| <li> |
| |
| <a href="release_notes.html">Kudu Release Notes</a> |
| </li> |
| <li> |
| |
| <a href="installation.html">Installation Guide</a> |
| </li> |
| <li> |
| |
| <a href="configuration.html">Configuring Kudu</a> |
| </li> |
| <li> |
| |
| <a href="kudu_impala_integration.html">Using Impala with Kudu</a> |
| </li> |
| <li> |
| |
| <a href="administration.html">Administering Kudu</a> |
| </li> |
| <li> |
| |
| <a href="troubleshooting.html">Troubleshooting Kudu</a> |
| </li> |
| <li> |
| |
| <a href="developing.html">Developing Applications with Kudu</a> |
| </li> |
| <li> |
| |
| <a href="schema_design.html">Kudu Schema Design</a> |
| </li> |
| <li> |
| |
| <a href="scaling_guide.html">Kudu Scaling Guide</a> |
| </li> |
| <li> |
| |
| <a href="security.html">Kudu Security</a> |
| </li> |
| <li> |
| |
| <a href="transaction_semantics.html">Kudu Transaction Semantics</a> |
| </li> |
| <li> |
| <span class="active-toc">Background Maintenance Tasks</span> |
| <ul class="sectlevel1"> |
| <li><a href="#_maintenance_manager">Maintenance manager</a></li> |
| <li><a href="#_flushing_data_to_disk">Flushing data to disk</a></li> |
| <li><a href="#_compacting_on_disk_data">Compacting on-disk data</a></li> |
| <li><a href="#_write_ahead_log_gc">Write-ahead log GC</a></li> |
| <li><a href="#_tablet_history_gc_and_the_ancient_history_mark">Tablet history GC and the ancient history mark</a></li> |
| </ul> |
| </li> |
| <li> |
| |
| <a href="configuration_reference.html">Kudu Configuration Reference</a> |
| </li> |
| <li> |
| |
| <a href="command_line_tools_reference.html">Kudu Command Line Tools Reference</a> |
| </li> |
| <li> |
| |
| <a href="known_issues.html">Known Issues and Limitations</a> |
| </li> |
| <li> |
| |
| <a href="contributing.html">Contributing to Kudu</a> |
| </li> |
| <li> |
| |
| <a href="export_control.html">Export Control Notice</a> |
| </li> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </div> |
| <footer class="footer"> |
| <div class="row"> |
| <div class="col-md-9"> |
| <p class="small"> |
| Copyright © 2019 The Apache Software Foundation. Last updated 2019-03-12 04:38:06 UTC |
| </p> |
| <p class="small"> |
| Apache Kudu, Kudu, Apache, the Apache feather logo, and the Apache Kudu |
| project logo are either registered trademarks or trademarks of The |
| Apache Software Foundation in the United States and other countries. |
| </p> |
| </div> |
| <div class="col-md-3"> |
| <a class="pull-right" href="https://www.apache.org/events/current-event.html"> |
| <img src="https://www.apache.org/events/current-event-234x60.png"/> |
| </a> |
| </div> |
| </div> |
| </footer> |
| </div> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script> |
| <script> |
| // Try to detect touch-screen devices. Note: Many laptops have touch screens. |
| $(document).ready(function() { |
| if ("ontouchstart" in document.documentElement) { |
| $(document.documentElement).addClass("touch"); |
| } else { |
| $(document.documentElement).addClass("no-touch"); |
| } |
| }); |
| </script> |
| <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js" |
| integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" |
| crossorigin="anonymous"></script> |
| <script> |
| (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ |
| (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), |
| m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) |
| })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); |
| |
| ga('create', 'UA-68448017-1', 'auto'); |
| ga('send', 'pageview'); |
| </script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/anchor-js/3.1.0/anchor.js"></script> |
| <script> |
| anchors.options = { |
| placement: 'right', |
| visible: 'touch', |
| }; |
| anchors.add(); |
| </script> |
| </body> |
| </html> |
| |