releases/1.17.1/docs/scaling_guide.html - kudu - Git at Google

 ---
 title: Apache Kudu Scaling Guide
 layout: default
 active_nav: docs
 last_updated: 'Last updated 2024-10-28 17:38:51 -0700'
 ---
 <!--

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 -->


 <div class="container">
   <div class="row">
     <div class="col-md-9">

 <h1>Apache Kudu Scaling Guide</h1>
       <div id="preamble">
 <div class="sectionbody">
 <div class="paragraph">
 <p>This document describes in detail how Kudu scales with respect to various system resources,
 including memory, file descriptors, and threads. See the
 <a href="known_issues.html#_scale">scaling limits</a> for the maximum recommended parameters of a Kudu
 cluster. They can be used to estimate roughly the number of servers required for a given quantity
 of data.</p>
 </div>
 <div class="admonitionblock warning">
 <table>
 <tr>
 <td class="icon">
 <i class="fa icon-warning" title="Warning"></i>
 </td>
 <td class="content">
 The recommendations and conclusions here are only approximations. Appropriate numbers
 depend on use case. There is no substitute for measurement and monitoring of resources used during a
 representative workload.
 </td>
 </tr>
 </table>
 </div>
 </div>
 </div>
 <div class="sect1">
 <h2 id="_terms"><a class="link" href="#_terms">Terms</a></h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>We will use the following terms:</p>
 </div>
 <div class="ulist">
 <ul>
 <li>
 <p><strong>hot replica</strong>: A tablet replica that is continuously receiving writes. For example, in a time
 series use case, tablet replicas for the most recent range partition on a time column would be
 continuously receiving the latest data, and would be hot replicas.</p>
 </li>
 <li>
 <p><strong>cold replica</strong>: A tablet replica that is not hot, i.e. a replica that is not frequently receiving
 writes, for example, once every few minutes. A cold replica may be read from. For example, in a time
 series use case, tablet replicas for previous range partitions on a time column would not receive
 writes at all, or only occasionally receive late updates or additions, but may be constantly read.</p>
 </li>
 <li>
 <p><strong>data on disk</strong>: The total amount of data stored on a tablet server across all disks,
 post-replication, post-compression, and post-encoding.</p>
 </li>
 </ul>
 </div>
 </div>
 </div>
 <div class="sect1">
 <h2 id="_example_workload"><a class="link" href="#_example_workload">Example Workload</a></h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>The sections below perform sample calculations using the following parameters:</p>
 </div>
 <div class="ulist">
 <ul>
 <li>
 <p>200 hot replicas per tablet server</p>
 </li>
 <li>
 <p>1600 cold replicas per tablet server</p>
 </li>
 <li>
 <p>8TB of data on disk per tablet server (about 4.5GB/replica)</p>
 </li>
 <li>
 <p>512MB block cache</p>
 </li>
 <li>
 <p>40 cores per server</p>
 </li>
 <li>
 <p>limit of 32000 file descriptors per server</p>
 </li>
 <li>
 <p>a read workload with 1 frequently-scanned table with 40 columns</p>
 </li>
 </ul>
 </div>
 <div class="paragraph">
 <p>This workload resembles a time series use case, where the hot replicas correspond to the most recent
 range partition on time.</p>
 </div>
 </div>
 </div>
 <div class="sect1">
 <h2 id="memory"><a class="link" href="#memory">Memory</a></h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>The flag <code>--memory_limit_hard_bytes</code> determines the maximum amount of memory that a Kudu tablet
 server may use. The amount of memory used by a tablet server scales with data size, write workload,
 and read concurrency. The following table provides numbers that can be used to compute a rough
 estimate of memory usage.</p>
 </div>
 <table class="tableblock frame-all grid-all stretch">
 <caption class="title">Table 1. Tablet Server Memory Usage</caption>
 <colgroup>
 <col style="width: 33.3333%;">
 <col style="width: 33.3333%;">
 <col style="width: 33.3334%;">
 </colgroup>
 <thead>
 <tr>
 <th class="tableblock halign-left valign-top">Type</th>
 <th class="tableblock halign-left valign-top">Multiplier</th>
 <th class="tableblock halign-left valign-top">Description</th>
 </tr>
 </thead>
 <tbody>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Memory required per TB of data on disk</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">1.5GB per 1TB data on disk</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Amount of memory per unit of data on disk required for
 basic operation of the tablet server.</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Hot Replicas' MemRowSets and DeltaMemStores</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">minimum 128MB per hot replica</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Minimum amount of data
 to flush per MemRowSet flush. For most use cases, updates should be rare compared to inserts, so the
 DeltaMemStores should be very small.</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Scans</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">256KB per column per core for read-heavy tables</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Amount of memory used by scanners, and which
 will be constantly needed for tables which are constantly read.</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Block Cache</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Fixed by <code>--block_cache_capacity_mb</code> (default 512MB)</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Amount of memory reserved for use by the
 block cache.</p></td>
 </tr>
 </tbody>
 </table>
 <div class="paragraph">
 <p>Using this information for the example load gives the following breakdown of memory usage:</p>
 </div>
 <table class="tableblock frame-all grid-all stretch">
 <caption class="title">Table 2. Example Tablet Server Memory Usage</caption>
 <colgroup>
 <col style="width: 50%;">
 <col style="width: 50%;">
 </colgroup>
 <thead>
 <tr>
 <th class="tableblock halign-left valign-top">Type</th>
 <th class="tableblock halign-left valign-top">Amount</th>
 </tr>
 </thead>
 <tbody>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">8TB data on disk</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">8TB * 1.5GB / 1TB = 12GB</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">200 hot replicas</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">200 * 128MB = 25.6GB</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">1 40-column, frequently-scanned table</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">40 * 40 * 256KB = 409.6MB</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Block Cache</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock"><code>--block_cache_capacity_mb=512</code> = 512MB</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Expected memory usage</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">38.5GB</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Recommended hard limit</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">52GB</p></td>
 </tr>
 </tbody>
 </table>
 <div class="paragraph">
 <p>Using this as a rough estimate of Kudu&#8217;s memory usage, select a memory limit so that the expected
 memory usage of Kudu is around 50-75% of the hard limit.</p>
 </div>
 <div class="sect2">
 <h3 id="_verifying_if_a_memory_limit_is_sufficient"><a class="link" href="#_verifying_if_a_memory_limit_is_sufficient">Verifying if a Memory Limit is sufficient</a></h3>
 <div class="paragraph">
 <p>After configuring an appropriate memory limit with <code>--memory_limit_hard_bytes</code>, run a workload and
 monitor the Kudu tablet server process&#8217;s RAM usage. The memory usage should stay around 50-75% of
 the hard limit, with occasional spikes above 75% but below 100%. If the tablet server runs above 75%
 consistently, the memory limit should be increased.</p>
 </div>
 <div class="paragraph">
 <p>Additionally, it&#8217;s also useful to monitor the logs for memory rejections, which look like:</p>
 </div>
 <div class="listingblock">
 <div class="content">
 <pre>Service unavailable: Soft memory limit exceeded (at 96.35% of capacity)</pre>
 </div>
 </div>
 <div class="paragraph">
 <p>and watch the memory rejections metrics:</p>
 </div>
 <div class="ulist">
 <ul>
 <li>
 <p><code>leader_memory_pressure_rejections</code></p>
 </li>
 <li>
 <p><code>follower_memory_pressure_rejections</code></p>
 </li>
 <li>
 <p><code>transaction_memory_pressure_rejections</code></p>
 </li>
 </ul>
 </div>
 <div class="paragraph">
 <p>Occasional rejections due to memory pressure are fine and act as backpressure to clients. Clients
 will transparently retry operations. However, no operations should time out.</p>
 </div>
 </div>
 </div>
 </div>
 <div class="sect1">
 <h2 id="file_descriptors"><a class="link" href="#file_descriptors">File Descriptors</a></h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Processes are allotted a maximum number of open file descriptors (also referred to as fds). If a
 tablet server attempts to open too many fds, it will typically crash with a message saying something
 like "too many open files". The following table summarizes the sources of file descriptor usage in a
 Kudu tablet server process:</p>
 </div>
 <table class="tableblock frame-all grid-all stretch">
 <caption class="title">Table 3. Tablet Server File Descriptor Usage</caption>
 <colgroup>
 <col style="width: 33.3333%;">
 <col style="width: 33.3333%;">
 <col style="width: 33.3334%;">
 </colgroup>
 <thead>
 <tr>
 <th class="tableblock halign-left valign-top">Type</th>
 <th class="tableblock halign-left valign-top">Multiplier</th>
 <th class="tableblock halign-left valign-top">Description</th>
 </tr>
 </thead>
 <tbody>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">File cache</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Fixed by <code>--block_manager_max_open_files</code> (default 40% of process maximum)</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Maximum allowed open fds reserved for use by
 the file cache.</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Hot replicas</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">2 per WAL segment, 1 per WAL index</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Number of fds used by hot replicas. See below
 for more explanation.</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Cold replicas</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">3 per cold replica</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Number of fds used per cold replica: 2 for the single WAL
 segment and 1 for the single WAL index.</p></td>
 </tr>
 </tbody>
 </table>
 <div class="paragraph">
 <p>Every replica has at least one WAL segment and at least one WAL index, and should have the same
 number of segments and indices; however, the number of segments and indices can be greater for a
 replica if one of its peer replicas is falling behind. WAL segment and index fds are closed as WALs
 are garbage collected.</p>
 </div>
 <div class="paragraph">
 <p>Using this information for the example load gives the following breakdown of file descriptor usage,
 under the assumption that some replicas are lagging and using 10 WAL segments:</p>
 </div>
 <table class="tableblock frame-all grid-all stretch">
 <caption class="title">Table 4. Example Tablet Server File Descriptor Usage</caption>
 <colgroup>
 <col style="width: 50%;">
 <col style="width: 50%;">
 </colgroup>
 <thead>
 <tr>
 <th class="tableblock halign-left valign-top">Type</th>
 <th class="tableblock halign-left valign-top">Amount</th>
 </tr>
 </thead>
 <tbody>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">file cache</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">40% * 32000 fds = 12800 fds</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">1600 cold replicas</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">1600 cold replicas * 3 fds / cold replica = 4800 fds</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">200 hot replicas</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">(2 / segment * 10 segments/hot replica * 200 hot replicas) + (1 / index * 10 indices / hot replica * 200 hot replicas) = 6000 fds</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><p class="tableblock">Total</p></td>
 <td class="tableblock halign-left valign-top"><p class="tableblock">23600 fds</p></td>
 </tr>
 </tbody>
 </table>
 <div class="paragraph">
 <p>So for this example, the tablet server process has about 32000 - 23600 = 8400 fds to spare.</p>
 </div>
 <div class="paragraph">
 <p>There is typically no downside to configuring a higher file descriptor limit if approaching the
 currently configured limit.</p>
 </div>
 </div>
 </div>
 <div class="sect1">
 <h2 id="threads"><a class="link" href="#threads">Threads</a></h2>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Processes are allotted a maximum number of threads by the operating system, and this limit is
 typically difficult or impossible to change. Therefore, this section is more informational than
 advisory.</p>
 </div>
 <div class="paragraph">
 <p>If a Kudu tablet server&#8217;s thread count exceeds the OS limit, it will crash, usually with a message
 in the logs like "pthread_create failed: Resource temporarily unavailable". If the system thread
 count limit is exceeded, other processes on the same node may also crash.</p>
 </div>
 <div class="paragraph">
 <p>Threads and threadpools are used all over Kudu for various purposes, but the number of threads found
 in nearly all of these does not scale with load or data/tablet size; instead, the number of threads
 is either a hardcoded constant, a constant defined by a configuration parameter, or based on a
 static dimension (such as the number of CPU cores).</p>
 </div>
 <div class="paragraph">
 <p>The only exception to this is the WAL append thread, one of which exists for every "hot" replica.
 Note that all replicas may be considered hot at startup, so tablet servers' thread usage will
 generally peak when started and settle down thereafter.</p>
 </div>
 </div>
 </div>
     </div>
     <div class="col-md-3">

   <div id="toc" data-spy="affix" data-offset-top="70">
   <ul>

       <li>

           <a href="index.html">Introducing Kudu</a>
       </li>
       <li>

           <a href="release_notes.html">Kudu Release Notes</a>
       </li>
       <li>

           <a href="quickstart.html">Quickstart Guide</a>
       </li>
       <li>

           <a href="installation.html">Installation Guide</a>
       </li>
       <li>

           <a href="configuration.html">Configuring Kudu</a>
       </li>
       <li>

           <a href="hive_metastore.html">Using the Hive Metastore with Kudu</a>
       </li>
       <li>

           <a href="kudu_impala_integration.html">Using Impala with Kudu</a>
       </li>
       <li>

           <a href="administration.html">Administering Kudu</a>
       </li>
       <li>

           <a href="troubleshooting.html">Troubleshooting Kudu</a>
       </li>
       <li>

           <a href="developing.html">Developing Applications with Kudu</a>
       </li>
       <li>

           <a href="schema_design.html">Kudu Schema Design</a>
       </li>
       <li>
 <span class="active-toc">Kudu Scaling Guide</span>
             <ul class="sectlevel1">
 <li><a href="#_terms">Terms</a></li>
 <li><a href="#_example_workload">Example Workload</a></li>
 <li><a href="#memory">Memory</a>
 <ul class="sectlevel2">
 <li><a href="#_verifying_if_a_memory_limit_is_sufficient">Verifying if a Memory Limit is sufficient</a></li>
 </ul>
 </li>
 <li><a href="#file_descriptors">File Descriptors</a></li>
 <li><a href="#threads">Threads</a></li>
 </ul>
       </li>
       <li>

           <a href="security.html">Kudu Security</a>
       </li>
       <li>

           <a href="transaction_semantics.html">Kudu Transaction Semantics</a>
       </li>
       <li>

           <a href="background_tasks.html">Background Maintenance Tasks</a>
       </li>
       <li>

           <a href="configuration_reference.html">Kudu Configuration Reference</a>
       </li>
       <li>

           <a href="command_line_tools_reference.html">Kudu Command Line Tools Reference</a>
       </li>
       <li>

           <a href="metrics_reference.html">Kudu Metrics Reference</a>
       </li>
       <li>

           <a href="known_issues.html">Known Issues and Limitations</a>
       </li>
       <li>

           <a href="contributing.html">Contributing to Kudu</a>
       </li>
       <li>

           <a href="export_control.html">Export Control Notice</a>
       </li>
   </ul>
   </div>
     </div>
   </div>
 </div>
	---
	title: Apache Kudu Scaling Guide
	layout: default
	active_nav: docs
	last_updated: 'Last updated 2024-10-28 17:38:51 -0700'
	---
	<!--

	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->


	<div class="container">
	<div class="row">
	<div class="col-md-9">

	<h1>Apache Kudu Scaling Guide</h1>
	<div id="preamble">
	<div class="sectionbody">
	<div class="paragraph">
	<p>This document describes in detail how Kudu scales with respect to various system resources,
	including memory, file descriptors, and threads. See the
	<a href="known_issues.html#_scale">scaling limits</a> for the maximum recommended parameters of a Kudu
	cluster. They can be used to estimate roughly the number of servers required for a given quantity
	of data.</p>
	</div>
	<div class="admonitionblock warning">
	<table>
	<tr>
	<td class="icon">
	<i class="fa icon-warning" title="Warning"></i>
	</td>
	<td class="content">
	The recommendations and conclusions here are only approximations. Appropriate numbers
	depend on use case. There is no substitute for measurement and monitoring of resources used during a
	representative workload.
	</td>
	</tr>
	</table>
	</div>
	</div>
	</div>
	<div class="sect1">
	<h2 id="_terms"><a class="link" href="#_terms">Terms</a></h2>
	<div class="sectionbody">
	<div class="paragraph">
	<p>We will use the following terms:</p>
	</div>
	<div class="ulist">
	<ul>
	<li>
	<p><strong>hot replica</strong>: A tablet replica that is continuously receiving writes. For example, in a time
	series use case, tablet replicas for the most recent range partition on a time column would be
	continuously receiving the latest data, and would be hot replicas.</p>
	</li>
	<li>
	<p><strong>cold replica</strong>: A tablet replica that is not hot, i.e. a replica that is not frequently receiving
	writes, for example, once every few minutes. A cold replica may be read from. For example, in a time
	series use case, tablet replicas for previous range partitions on a time column would not receive
	writes at all, or only occasionally receive late updates or additions, but may be constantly read.</p>
	</li>
	<li>
	<p><strong>data on disk</strong>: The total amount of data stored on a tablet server across all disks,
	post-replication, post-compression, and post-encoding.</p>
	</li>
	</ul>
	</div>
	</div>
	</div>
	<div class="sect1">
	<h2 id="_example_workload"><a class="link" href="#_example_workload">Example Workload</a></h2>
	<div class="sectionbody">
	<div class="paragraph">
	<p>The sections below perform sample calculations using the following parameters:</p>
	</div>
	<div class="ulist">
	<ul>
	<li>
	<p>200 hot replicas per tablet server</p>
	</li>
	<li>
	<p>1600 cold replicas per tablet server</p>
	</li>
	<li>
	<p>8TB of data on disk per tablet server (about 4.5GB/replica)</p>
	</li>
	<li>
	<p>512MB block cache</p>
	</li>
	<li>
	<p>40 cores per server</p>
	</li>
	<li>
	<p>limit of 32000 file descriptors per server</p>
	</li>
	<li>
	<p>a read workload with 1 frequently-scanned table with 40 columns</p>
	</li>
	</ul>
	</div>
	<div class="paragraph">
	<p>This workload resembles a time series use case, where the hot replicas correspond to the most recent
	range partition on time.</p>
	</div>
	</div>
	</div>
	<div class="sect1">
	<h2 id="memory"><a class="link" href="#memory">Memory</a></h2>
	<div class="sectionbody">
	<div class="paragraph">
	<p>The flag <code>--memory_limit_hard_bytes</code> determines the maximum amount of memory that a Kudu tablet
	server may use. The amount of memory used by a tablet server scales with data size, write workload,
	and read concurrency. The following table provides numbers that can be used to compute a rough
	estimate of memory usage.</p>
	</div>
	<table class="tableblock frame-all grid-all stretch">
	<caption class="title">Table 1. Tablet Server Memory Usage</caption>
	<colgroup>
	<col style="width: 33.3333%;">
	<col style="width: 33.3333%;">
	<col style="width: 33.3334%;">
	</colgroup>
	<thead>
	<tr>
	<th class="tableblock halign-left valign-top">Type</th>
	<th class="tableblock halign-left valign-top">Multiplier</th>
	<th class="tableblock halign-left valign-top">Description</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Memory required per TB of data on disk</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">1.5GB per 1TB data on disk</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Amount of memory per unit of data on disk required for
	basic operation of the tablet server.</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Hot Replicas' MemRowSets and DeltaMemStores</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">minimum 128MB per hot replica</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Minimum amount of data
	to flush per MemRowSet flush. For most use cases, updates should be rare compared to inserts, so the
	DeltaMemStores should be very small.</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Scans</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">256KB per column per core for read-heavy tables</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Amount of memory used by scanners, and which
	will be constantly needed for tables which are constantly read.</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Block Cache</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Fixed by <code>--block_cache_capacity_mb</code> (default 512MB)</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Amount of memory reserved for use by the
	block cache.</p></td>
	</tr>
	</tbody>
	</table>
	<div class="paragraph">
	<p>Using this information for the example load gives the following breakdown of memory usage:</p>
	</div>
	<table class="tableblock frame-all grid-all stretch">
	<caption class="title">Table 2. Example Tablet Server Memory Usage</caption>
	<colgroup>
	<col style="width: 50%;">
	<col style="width: 50%;">
	</colgroup>
	<thead>
	<tr>
	<th class="tableblock halign-left valign-top">Type</th>
	<th class="tableblock halign-left valign-top">Amount</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">8TB data on disk</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">8TB * 1.5GB / 1TB = 12GB</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">200 hot replicas</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">200 * 128MB = 25.6GB</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">1 40-column, frequently-scanned table</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">40 * 40 * 256KB = 409.6MB</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Block Cache</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock"><code>--block_cache_capacity_mb=512</code> = 512MB</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Expected memory usage</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">38.5GB</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Recommended hard limit</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">52GB</p></td>
	</tr>
	</tbody>
	</table>
	<div class="paragraph">
	<p>Using this as a rough estimate of Kudu’s memory usage, select a memory limit so that the expected
	memory usage of Kudu is around 50-75% of the hard limit.</p>
	</div>
	<div class="sect2">
	<h3 id="_verifying_if_a_memory_limit_is_sufficient"><a class="link" href="#_verifying_if_a_memory_limit_is_sufficient">Verifying if a Memory Limit is sufficient</a></h3>
	<div class="paragraph">
	<p>After configuring an appropriate memory limit with <code>--memory_limit_hard_bytes</code>, run a workload and
	monitor the Kudu tablet server process’s RAM usage. The memory usage should stay around 50-75% of
	the hard limit, with occasional spikes above 75% but below 100%. If the tablet server runs above 75%
	consistently, the memory limit should be increased.</p>
	</div>
	<div class="paragraph">
	<p>Additionally, it’s also useful to monitor the logs for memory rejections, which look like:</p>
	</div>
	<div class="listingblock">
	<div class="content">
	<pre>Service unavailable: Soft memory limit exceeded (at 96.35% of capacity)</pre>
	</div>
	</div>
	<div class="paragraph">
	<p>and watch the memory rejections metrics:</p>
	</div>
	<div class="ulist">
	<ul>
	<li>
	<p><code>leader_memory_pressure_rejections</code></p>
	</li>
	<li>
	<p><code>follower_memory_pressure_rejections</code></p>
	</li>
	<li>
	<p><code>transaction_memory_pressure_rejections</code></p>
	</li>
	</ul>
	</div>
	<div class="paragraph">
	<p>Occasional rejections due to memory pressure are fine and act as backpressure to clients. Clients
	will transparently retry operations. However, no operations should time out.</p>
	</div>
	</div>
	</div>
	</div>
	<div class="sect1">
	<h2 id="file_descriptors"><a class="link" href="#file_descriptors">File Descriptors</a></h2>
	<div class="sectionbody">
	<div class="paragraph">
	<p>Processes are allotted a maximum number of open file descriptors (also referred to as fds). If a
	tablet server attempts to open too many fds, it will typically crash with a message saying something
	like "too many open files". The following table summarizes the sources of file descriptor usage in a
	Kudu tablet server process:</p>
	</div>
	<table class="tableblock frame-all grid-all stretch">
	<caption class="title">Table 3. Tablet Server File Descriptor Usage</caption>
	<colgroup>
	<col style="width: 33.3333%;">
	<col style="width: 33.3333%;">
	<col style="width: 33.3334%;">
	</colgroup>
	<thead>
	<tr>
	<th class="tableblock halign-left valign-top">Type</th>
	<th class="tableblock halign-left valign-top">Multiplier</th>
	<th class="tableblock halign-left valign-top">Description</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">File cache</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Fixed by <code>--block_manager_max_open_files</code> (default 40% of process maximum)</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Maximum allowed open fds reserved for use by
	the file cache.</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Hot replicas</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">2 per WAL segment, 1 per WAL index</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Number of fds used by hot replicas. See below
	for more explanation.</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Cold replicas</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">3 per cold replica</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Number of fds used per cold replica: 2 for the single WAL
	segment and 1 for the single WAL index.</p></td>
	</tr>
	</tbody>
	</table>
	<div class="paragraph">
	<p>Every replica has at least one WAL segment and at least one WAL index, and should have the same
	number of segments and indices; however, the number of segments and indices can be greater for a
	replica if one of its peer replicas is falling behind. WAL segment and index fds are closed as WALs
	are garbage collected.</p>
	</div>
	<div class="paragraph">
	<p>Using this information for the example load gives the following breakdown of file descriptor usage,
	under the assumption that some replicas are lagging and using 10 WAL segments:</p>
	</div>
	<table class="tableblock frame-all grid-all stretch">
	<caption class="title">Table 4. Example Tablet Server File Descriptor Usage</caption>
	<colgroup>
	<col style="width: 50%;">
	<col style="width: 50%;">
	</colgroup>
	<thead>
	<tr>
	<th class="tableblock halign-left valign-top">Type</th>
	<th class="tableblock halign-left valign-top">Amount</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">file cache</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">40% * 32000 fds = 12800 fds</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">1600 cold replicas</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">1600 cold replicas * 3 fds / cold replica = 4800 fds</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">200 hot replicas</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">(2 / segment * 10 segments/hot replica * 200 hot replicas) + (1 / index * 10 indices / hot replica * 200 hot replicas) = 6000 fds</p></td>
	</tr>
	<tr>
	<td class="tableblock halign-left valign-top"><p class="tableblock">Total</p></td>
	<td class="tableblock halign-left valign-top"><p class="tableblock">23600 fds</p></td>
	</tr>
	</tbody>
	</table>
	<div class="paragraph">
	<p>So for this example, the tablet server process has about 32000 - 23600 = 8400 fds to spare.</p>
	</div>
	<div class="paragraph">
	<p>There is typically no downside to configuring a higher file descriptor limit if approaching the
	currently configured limit.</p>
	</div>
	</div>
	</div>
	<div class="sect1">
	<h2 id="threads"><a class="link" href="#threads">Threads</a></h2>
	<div class="sectionbody">
	<div class="paragraph">
	<p>Processes are allotted a maximum number of threads by the operating system, and this limit is
	typically difficult or impossible to change. Therefore, this section is more informational than
	advisory.</p>
	</div>
	<div class="paragraph">
	<p>If a Kudu tablet server’s thread count exceeds the OS limit, it will crash, usually with a message
	in the logs like "pthread_create failed: Resource temporarily unavailable". If the system thread
	count limit is exceeded, other processes on the same node may also crash.</p>
	</div>
	<div class="paragraph">
	<p>Threads and threadpools are used all over Kudu for various purposes, but the number of threads found
	in nearly all of these does not scale with load or data/tablet size; instead, the number of threads
	is either a hardcoded constant, a constant defined by a configuration parameter, or based on a
	static dimension (such as the number of CPU cores).</p>
	</div>
	<div class="paragraph">
	<p>The only exception to this is the WAL append thread, one of which exists for every "hot" replica.
	Note that all replicas may be considered hot at startup, so tablet servers' thread usage will
	generally peak when started and settle down thereafter.</p>
	</div>
	</div>
	</div>
	</div>
	<div class="col-md-3">

	<div id="toc" data-spy="affix" data-offset-top="70">
	<ul>

	<li>

	<a href="index.html">Introducing Kudu</a>
	</li>
	<li>

	<a href="release_notes.html">Kudu Release Notes</a>
	</li>
	<li>

	<a href="quickstart.html">Quickstart Guide</a>
	</li>
	<li>

	<a href="installation.html">Installation Guide</a>
	</li>
	<li>

	<a href="configuration.html">Configuring Kudu</a>
	</li>
	<li>

	<a href="hive_metastore.html">Using the Hive Metastore with Kudu</a>
	</li>
	<li>

	<a href="kudu_impala_integration.html">Using Impala with Kudu</a>
	</li>
	<li>

	<a href="administration.html">Administering Kudu</a>
	</li>
	<li>

	<a href="troubleshooting.html">Troubleshooting Kudu</a>
	</li>
	<li>

	<a href="developing.html">Developing Applications with Kudu</a>
	</li>
	<li>

	<a href="schema_design.html">Kudu Schema Design</a>
	</li>
	<li>
	<span class="active-toc">Kudu Scaling Guide</span>
	<ul class="sectlevel1">
	<li><a href="#_terms">Terms</a></li>
	<li><a href="#_example_workload">Example Workload</a></li>
	<li><a href="#memory">Memory</a>
	<ul class="sectlevel2">
	<li><a href="#_verifying_if_a_memory_limit_is_sufficient">Verifying if a Memory Limit is sufficient</a></li>
	</ul>
	</li>
	<li><a href="#file_descriptors">File Descriptors</a></li>
	<li><a href="#threads">Threads</a></li>
	</ul>
	</li>
	<li>

	<a href="security.html">Kudu Security</a>
	</li>
	<li>

	<a href="transaction_semantics.html">Kudu Transaction Semantics</a>
	</li>
	<li>

	<a href="background_tasks.html">Background Maintenance Tasks</a>
	</li>
	<li>

	<a href="configuration_reference.html">Kudu Configuration Reference</a>
	</li>
	<li>

	<a href="command_line_tools_reference.html">Kudu Command Line Tools Reference</a>
	</li>
	<li>

	<a href="metrics_reference.html">Kudu Metrics Reference</a>
	</li>
	<li>

	<a href="known_issues.html">Known Issues and Limitations</a>
	</li>
	<li>

	<a href="contributing.html">Contributing to Kudu</a>
	</li>
	<li>

	<a href="export_control.html">Export Control Notice</a>
	</li>
	</ul>
	</div>
	</div>
	</div>
	</div>