<!doctype html>
<html lang="en" dir="ltr" class="docs-wrapper docs-doc-page docs-version-current plugin-docs plugin-id-default docs-doc-id-design/indexer">
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v2.4.1">
<title data-rh="true">Indexer Process | Apache® Druid</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://druid.apache.org/img/druid_nav.png"><meta data-rh="true" name="twitter:image" content="https://druid.apache.org/img/druid_nav.png"><meta data-rh="true" property="og:url" content="https://druid.apache.org/docs/latest/design/indexer"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="Indexer Process | Apache® Druid"><meta data-rh="true" name="description" content="&lt;!--"><meta data-rh="true" property="og:description" content="&lt;!--"><link data-rh="true" rel="icon" href="/img/favicon.png"><link data-rh="true" rel="canonical" href="https://druid.apache.org/docs/latest/design/indexer"><link data-rh="true" rel="alternate" href="https://druid.apache.org/docs/latest/design/indexer" hreflang="en"><link data-rh="true" rel="alternate" href="https://druid.apache.org/docs/latest/design/indexer" hreflang="x-default"><link rel="preconnect" href="https://www.google-analytics.com">
<link rel="preconnect" href="https://www.googletagmanager.com">
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-131010415-1"></script>
<script>function gtag(){dataLayer.push(arguments)}window.dataLayer=window.dataLayer||[],gtag("js",new Date),gtag("config","UA-131010415-1",{})</script>
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.7.2/css/all.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js"></script><link rel="stylesheet" href="/assets/css/styles.546f39eb.css">
<link rel="preload" href="/assets/js/runtime~main.4c9a7172.js" as="script">
<link rel="preload" href="/assets/js/main.3a5ab01b.js" as="script">
</head>
<body class="navigation-with-keyboard">
<script>!function(){function t(t){document.documentElement.setAttribute("data-theme",t)}var e=function(){var t=null;try{t=new URLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}return t}()||function(){var t=null;try{t=localStorage.getItem("theme")}catch(t){}return t}();t(null!==e?e:"light")}()</script><div id="__docusaurus">
<div role="region" aria-label="Skip to main content"><a class="skipToContent_fXgn" href="#__docusaurus_skipToContent_fallback">Skip to main content</a></div><nav aria-label="Main" class="navbar navbar--fixed-top navbar--dark"><div class="navbar__inner"><div class="navbar__items"><button aria-label="Toggle navigation bar" aria-expanded="false" class="navbar__toggle clean-btn" type="button"><svg width="30" height="30" viewBox="0 0 30 30" aria-hidden="true"><path stroke="currentColor" stroke-linecap="round" stroke-miterlimit="10" stroke-width="2" d="M4 7h22M4 15h22M4 23h22"></path></svg></button><a class="navbar__brand" href="/"><div class="navbar__logo"><img src="/img/druid_nav.png" alt="Apache® Druid" class="themedImage_ToTc themedImage--light_HNdA"><img src="/img/druid_nav.png" alt="Apache® Druid" class="themedImage_ToTc themedImage--dark_i4oU"></div></a></div><div class="navbar__items navbar__items--right"><a class="navbar__item navbar__link" href="/technology">Technology</a><a class="navbar__item navbar__link" href="/use-cases">Use Cases</a><a class="navbar__item navbar__link" href="/druid-powered">Powered By</a><a aria-current="page" class="navbar__item navbar__link navbar__link--active" href="/docs/latest/design/">Docs</a><a class="navbar__item navbar__link" href="/community/">Community</a><div class="navbar__item dropdown dropdown--hoverable dropdown--right"><a href="#" aria-haspopup="true" aria-expanded="false" role="button" class="navbar__link">Apache®</a><ul class="dropdown__menu"><li><a href="https://www.apache.org/" target="_blank" rel="noopener noreferrer" class="dropdown__link">Foundation<svg width="12" height="12" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li><a href="https://apachecon.com/?ref=druid.apache.org" target="_blank" rel="noopener noreferrer" 
class="dropdown__link">Events<svg width="12" height="12" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li><a href="https://www.apache.org/licenses/" target="_blank" rel="noopener noreferrer" class="dropdown__link">License<svg width="12" height="12" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li><a href="https://www.apache.org/foundation/thanks.html" target="_blank" rel="noopener noreferrer" class="dropdown__link">Thanks<svg width="12" height="12" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li><a href="https://www.apache.org/security/" target="_blank" rel="noopener noreferrer" class="dropdown__link">Security<svg width="12" height="12" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li><li><a href="https://www.apache.org/foundation/sponsorship.html" target="_blank" rel="noopener noreferrer" class="dropdown__link">Sponsorship<svg width="12" height="12" aria-hidden="true" viewBox="0 0 24 24" class="iconExternalLink_nPIU"><path fill="currentColor" d="M21 13v10h-21v-19h12v2h-10v15h17v-8h2zm3-12h-10.988l4.035 4-6.977 7.07 2.828 2.828 6.977-7.07 4.125 4.172v-11z"></path></svg></a></li></ul></div><a class="navbar__item navbar__link" href="/downloads/">Download</a><div class="searchBox_ZlJk"><div class="navbar__search"><span aria-label="expand 
searchbar" role="button" class="search-icon" tabindex="0"></span><input type="search" id="search_input_react" placeholder="Loading..." aria-label="Search" class="navbar__search-input search-bar" disabled=""></div></div></div></div><div role="presentation" class="navbar-sidebar__backdrop"></div></nav><div id="__docusaurus_skipToContent_fallback" class="main-wrapper mainWrapper_z2l0 docsWrapper_BCFX"><button aria-label="Scroll back to top" class="clean-btn theme-back-to-top-button backToTopButton_sjWU" type="button"></button><div class="docPage__5DB"><main class="docMainContainer_gTbr docMainContainerEnhanced_Uz_u"><div class="container padding-top--md padding-bottom--lg"><div class="row"><div class="col docItemCol_VOVn"><div class="docItemContainer_Djhp"><article><div class="tocCollapsible_ETCw theme-doc-toc-mobile tocMobile_ITEo"><button type="button" class="clean-btn tocCollapsibleButton_TO0P">On this page</button></div><div class="theme-doc-markdown markdown"><header><h1>Indexer Process</h1></header><div class="theme-admonition theme-admonition-info alert alert--info admonition_LlT9"><div class="admonitionHeading_tbUL"><span class="admonitionIcon_kALy"><svg viewBox="0 0 14 16"><path fill-rule="evenodd" d="M7 2.3c3.14 0 5.7 2.56 5.7 5.7s-2.56 5.7-5.7 5.7A5.71 5.71 0 0 1 1.3 8c0-3.14 2.56-5.7 5.7-5.7zM7 1C3.14 1 0 4.14 0 8s3.14 7 7 7 7-3.14 7-7-3.14-7-7-7zm1 3H6v5h2V4zm0 6H6v2h2v-2z"></path></svg></span>info</div><div class="admonitionContent_S0QG"><p> The Indexer is an optional and <a href="/docs/latest/development/experimental">experimental</a> feature.
Its memory management system is still under development and will be significantly enhanced in later releases.</p></div></div><p>The Apache Druid Indexer process is an alternative to the MiddleManager + Peon task execution system. Instead of forking a separate JVM process per-task, the Indexer runs tasks as separate threads within a single JVM process.</p><p>The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks.</p><h3 class="anchor anchorWithStickyNavbar_LWe7" id="configuration">Configuration<a href="#configuration" class="hash-link" aria-label="Direct link to Configuration" title="Direct link to Configuration"></a></h3><p>For Apache Druid Indexer Process Configuration, see <a href="/docs/latest/configuration/#indexer">Indexer Configuration</a>.</p><h3 class="anchor anchorWithStickyNavbar_LWe7" id="http-endpoints">HTTP endpoints<a href="#http-endpoints" class="hash-link" aria-label="Direct link to HTTP endpoints" title="Direct link to HTTP endpoints"></a></h3><p>The Indexer process shares the same HTTP endpoints as the <a href="/docs/latest/api-reference/service-status-api#middlemanager">MiddleManager</a>.</p><h3 class="anchor anchorWithStickyNavbar_LWe7" id="running">Running<a href="#running" class="hash-link" aria-label="Direct link to Running" title="Direct link to Running"></a></h3><div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">org.apache.druid.cli.Main server indexer</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg 
viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="task-resource-sharing">Task resource sharing<a href="#task-resource-sharing" class="hash-link" aria-label="Direct link to Task resource sharing" title="Direct link to Task resource sharing"></a></h3><p>The following resources are shared across all tasks running inside an Indexer process.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="query-resources">Query resources<a href="#query-resources" class="hash-link" aria-label="Direct link to Query resources" title="Direct link to Query resources"></a></h4><p>The query processing threads and buffers are shared across all tasks. The Indexer will serve queries from a single endpoint shared by all tasks.</p><p>If <a href="/docs/latest/configuration/#indexer-caching">query caching</a> is enabled, the query cache is also shared across all tasks.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="server-http-threads">Server HTTP threads<a href="#server-http-threads" class="hash-link" aria-label="Direct link to Server HTTP threads" title="Direct link to Server HTTP threads"></a></h4><p>The Indexer maintains two equally sized pools of HTTP threads. </p><p>One pool is exclusively used for task control messages between the Overlord and the Indexer (&quot;chat handler threads&quot;). 
The other pool is used for handling all other HTTP requests.</p><p>The size of each pool is configured by the <code>druid.server.http.numThreads</code> property (e.g., if this is set to 10, there will be 10 chat handler threads and 10 non-chat handler threads).</p><p>In addition to these two pools, two separate threads are allocated for lookup handling. If lookups are not used, these threads remain idle.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="memory-sharing">Memory sharing<a href="#memory-sharing" class="hash-link" aria-label="Direct link to Memory sharing" title="Direct link to Memory sharing"></a></h4><p>The Indexer uses the <code>druid.worker.globalIngestionHeapLimitBytes</code> property to impose a global heap limit across all of the tasks it is running. </p><p>This global limit is divided evenly among the task slots configured by <code>druid.worker.capacity</code>.</p><p>To apply the per-task heap limit, the Indexer overrides <code>maxBytesInMemory</code> in task tuning configs, ignoring both the default value and any user-configured value. <code>maxRowsInMemory</code> is also overridden to an essentially unlimited value: the Indexer does not support row limits.</p><p>By default, <code>druid.worker.globalIngestionHeapLimitBytes</code> is set to 1/6th of the available JVM heap. This default is chosen to align with the default value of <code>maxBytesInMemory</code> in task tuning configs when using the MiddleManager/Peon system, which is also 1/6th of the JVM heap.</p><p>The peak heap usage for row data depends on the interaction between the <code>maxBytesInMemory</code> and <code>maxPendingPersists</code> properties in the task tuning configs. When the amount of row data held in-heap by a task reaches the limit specified by <code>maxBytesInMemory</code>, the task persists the in-heap row data. 
After the persist has started, the task can again ingest up to <code>maxBytesInMemory</code> bytes of row data while the persist is running.</p><p>This means that the peak in-heap usage for row data can be up to approximately <code>maxBytesInMemory</code> * (2 + <code>maxPendingPersists</code>). The default value of <code>maxPendingPersists</code> is 0, which allows one persist to run concurrently with ingestion work, so the default peak is approximately twice <code>maxBytesInMemory</code>.</p><p>The remaining portion of the heap is reserved for query processing, segment persist/merge operations, and miscellaneous heap usage.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="concurrent-segment-persistmerge-limits">Concurrent segment persist/merge limits<a href="#concurrent-segment-persistmerge-limits" class="hash-link" aria-label="Direct link to Concurrent segment persist/merge limits" title="Direct link to Concurrent segment persist/merge limits"></a></h4><p>To help reduce peak memory usage, the Indexer imposes a limit on the number of concurrent segment persist/merge operations across all running tasks.</p><p>By default, the number of concurrent persist/merge operations is limited to (<code>druid.worker.capacity</code> / 2), rounded down. This limit can be configured with the <code>druid.worker.numConcurrentMerges</code> property.</p><h3 class="anchor anchorWithStickyNavbar_LWe7" id="current-limitations">Current limitations<a href="#current-limitations" class="hash-link" aria-label="Direct link to Current limitations" title="Direct link to Current limitations"></a></h3><p>Separate task logs are not currently supported when using the Indexer; all task log messages are instead written to the Indexer process log.</p><p>The Indexer currently imposes an identical memory limit on each task. In later releases, the per-task memory limit will be removed and only the global limit will apply. The limit on concurrent merges will also be removed.</p><p>In later releases, per-task memory usage will be dynamically managed. 
Please see <a href="https://github.com/apache/druid/issues/7900" target="_blank" rel="noopener noreferrer">https://github.com/apache/druid/issues/7900</a> for details on future enhancements to the Indexer.</p></div></article><nav class="pagination-nav docusaurus-mt-lg" aria-label="Docs pages"></nav></div></div><div class="col col--3"><div class="tableOfContents_bqdL thin-scrollbar theme-doc-toc-desktop"><ul class="table-of-contents table-of-contents__left-border"><li><a href="#configuration" class="table-of-contents__link toc-highlight">Configuration</a></li><li><a href="#http-endpoints" class="table-of-contents__link toc-highlight">HTTP endpoints</a></li><li><a href="#running" class="table-of-contents__link toc-highlight">Running</a></li><li><a href="#task-resource-sharing" class="table-of-contents__link toc-highlight">Task resource sharing</a></li><li><a href="#current-limitations" class="table-of-contents__link toc-highlight">Current limitations</a></li></ul></div></div></div></div></main></div></div><footer class="footer"><div class="container container-fluid"><div class="footer__bottom text--center"><div class="margin-bottom--sm"><img src="/img/favicon.png" class="themedImage_ToTc themedImage--light_HNdA footer__logo"><img src="/img/favicon.png" class="themedImage_ToTc themedImage--dark_i4oU footer__logo"></div><div class="footer__copyright">Copyright © 2023 Apache Software Foundation. Except where otherwise noted, licensed under CC BY-SA 4.0. Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</div></div></div></footer></div>
<script src="/assets/js/runtime~main.4c9a7172.js"></script>
<script src="/assets/js/main.3a5ab01b.js"></script>
</body>
</html>