blob: 576e6e7da3ac3c3dfc44a5a4303866f323ba21fd [file] [log] [blame]
<table class="configuration table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 20%">Key</th>
<th class="text-left" style="width: 15%">Default</th>
<th class="text-left" style="width: 10%">Type</th>
<th class="text-left" style="width: 55%">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><h5>cluster.evenly-spread-out-slots</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>Enable the slot spread out allocation strategy. This strategy tries to spread out the slots evenly across all available <code class="highlighter-rouge">TaskExecutors</code>.</td>
</tr>
<tr>
<td><h5>execution.batch.adaptive.auto-parallelism.avg-data-volume-per-task</h5></td>
<td style="word-wrap: break-word;">16 mb</td>
<td>MemorySize</td>
<td>The average size of data volume to expect each task instance to process if <code class="highlighter-rouge">jobmanager.scheduler</code> has been set to <code class="highlighter-rouge">AdaptiveBatch</code>. Note that when data skew occurs or the decided parallelism reaches the <code class="highlighter-rouge">execution.batch.adaptive.auto-parallelism.max-parallelism</code> (due to too much data), the data actually processed by some tasks may far exceed this value.</td>
</tr>
<tr>
<td><h5>execution.batch.adaptive.auto-parallelism.default-source-parallelism</h5></td>
<td style="word-wrap: break-word;">1</td>
<td>Integer</td>
<td>The default parallelism of source vertices if <code class="highlighter-rouge">jobmanager.scheduler</code> has been set to <code class="highlighter-rouge">AdaptiveBatch</code></td>
</tr>
<tr>
<td><h5>execution.batch.adaptive.auto-parallelism.enabled</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>If true, Flink will automatically decide the parallelism of operators in batch jobs.</td>
</tr>
<tr>
<td><h5>execution.batch.adaptive.auto-parallelism.max-parallelism</h5></td>
<td style="word-wrap: break-word;">128</td>
<td>Integer</td>
<td>The upper bound of allowed parallelism to set adaptively if <code class="highlighter-rouge">jobmanager.scheduler</code> has been set to <code class="highlighter-rouge">AdaptiveBatch</code></td>
</tr>
<tr>
<td><h5>execution.batch.adaptive.auto-parallelism.min-parallelism</h5></td>
<td style="word-wrap: break-word;">1</td>
<td>Integer</td>
<td>The lower bound of allowed parallelism to set adaptively if <code class="highlighter-rouge">jobmanager.scheduler</code> has been set to <code class="highlighter-rouge">AdaptiveBatch</code></td>
</tr>
<tr>
<td><h5>execution.batch.speculative.block-slow-node-duration</h5></td>
<td style="word-wrap: break-word;">1 min</td>
<td>Duration</td>
<td>Controls how long an detected slow node should be blocked for.</td>
</tr>
<tr>
<td><h5>execution.batch.speculative.enabled</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>Controls whether to enable speculative execution.</td>
</tr>
<tr>
<td><h5>execution.batch.speculative.max-concurrent-executions</h5></td>
<td style="word-wrap: break-word;">2</td>
<td>Integer</td>
<td>Controls the maximum number of execution attempts of each operator that can execute concurrently, including the original one and speculative ones.</td>
</tr>
<tr>
<td><h5>fine-grained.shuffle-mode.all-blocking</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>Whether to convert all PIPELINE edges to BLOCKING when apply fine-grained resource management in batch jobs.</td>
</tr>
<tr>
<td><h5>jobmanager.adaptive-scheduler.min-parallelism-increase</h5></td>
<td style="word-wrap: break-word;">1</td>
<td>Integer</td>
<td>Configure the minimum increase in parallelism for a job to scale up.</td>
</tr>
<tr>
<td><h5>jobmanager.adaptive-scheduler.resource-stabilization-timeout</h5></td>
<td style="word-wrap: break-word;">10 s</td>
<td>Duration</td>
<td>The resource stabilization timeout defines the time the JobManager will wait if fewer than the desired but sufficient resources are available. The timeout starts once sufficient resources for running the job are available. Once this timeout has passed, the job will start executing with the available resources.<br />If <code class="highlighter-rouge">scheduler-mode</code> is configured to <code class="highlighter-rouge">REACTIVE</code>, this configuration value will default to 0, so that jobs are starting immediately with the available resources.</td>
</tr>
<tr>
<td><h5>jobmanager.adaptive-scheduler.resource-wait-timeout</h5></td>
<td style="word-wrap: break-word;">5 min</td>
<td>Duration</td>
<td>The maximum time the JobManager will wait to acquire all required resources after a job submission or restart. Once elapsed it will try to run the job with a lower parallelism, or fail if the minimum amount of resources could not be acquired.<br />Increasing this value will make the cluster more resilient against temporary resources shortages (e.g., there is more time for a failed TaskManager to be restarted).<br />Setting a negative duration will disable the resource timeout: The JobManager will wait indefinitely for resources to appear.<br />If <code class="highlighter-rouge">scheduler-mode</code> is configured to <code class="highlighter-rouge">REACTIVE</code>, this configuration value will default to a negative value to disable the resource timeout.</td>
</tr>
<tr>
<td><h5>jobmanager.partition.hybrid.partition-data-consume-constraint</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td><p>Enum</p></td>
<td>Controls the constraint that hybrid partition data can be consumed. Note that this option is allowed only when <code class="highlighter-rouge">jobmanager.scheduler</code> has been set to <code class="highlighter-rouge">AdaptiveBatch</code>. Accepted values are:<ul><li>'<code class="highlighter-rouge">ALL_PRODUCERS_FINISHED</code>': hybrid partition data can be consumed only when all producers are finished.</li><li>'<code class="highlighter-rouge">ONLY_FINISHED_PRODUCERS</code>': hybrid partition data can be consumed when its producer is finished.</li><li>'<code class="highlighter-rouge">UNFINISHED_PRODUCERS</code>': hybrid partition data can be consumed even if its producer is un-finished.</li></ul><br /><br />Possible values:<ul><li>"ALL_PRODUCERS_FINISHED"</li><li>"ONLY_FINISHED_PRODUCERS"</li><li>"UNFINISHED_PRODUCERS"</li></ul></td>
</tr>
<tr>
<td><h5>jobmanager.scheduler</h5></td>
<td style="word-wrap: break-word;">Default</td>
<td><p>Enum</p></td>
<td>Determines which scheduler implementation is used to schedule tasks. Accepted values are:<ul><li>'Default': Default scheduler</li><li>'Adaptive': Adaptive scheduler. More details can be found <a href="{{.Site.BaseURL}}{{.Site.LanguagePrefix}}/docs/deployment/elastic_scaling#adaptive-scheduler">here</a>.</li><li>'AdaptiveBatch': Adaptive batch scheduler. More details can be found <a href="{{.Site.BaseURL}}{{.Site.LanguagePrefix}}/docs/deployment/elastic_scaling#adaptive-batch-scheduler">here</a>.</li></ul><br /><br />Possible values:<ul><li>"Default"</li><li>"Adaptive"</li><li>"AdaptiveBatch"</li></ul></td>
</tr>
<tr>
<td><h5>scheduler-mode</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td><p>Enum</p></td>
<td>Determines the mode of the scheduler. Note that <code class="highlighter-rouge">scheduler-mode</code>=<code class="highlighter-rouge">REACTIVE</code> is only supported by standalone application deployments, not by active resource managers (YARN, Kubernetes) or session clusters.<br /><br />Possible values:<ul><li>"REACTIVE"</li></ul></td>
</tr>
<tr>
<td><h5>slot.idle.timeout</h5></td>
<td style="word-wrap: break-word;">50000</td>
<td>Long</td>
<td>The timeout in milliseconds for a idle slot in Slot Pool.</td>
</tr>
<tr>
<td><h5>slot.request.timeout</h5></td>
<td style="word-wrap: break-word;">300000</td>
<td>Long</td>
<td>The timeout in milliseconds for requesting a slot from Slot Pool.</td>
</tr>
<tr>
<td><h5>slotmanager.max-total-resource.cpu</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>Double</td>
<td>Maximum cpu cores the Flink cluster allocates for slots. Resources for JobManager and TaskManager framework are excluded. If not configured, it will be derived from 'slotmanager.number-of-slots.max'.</td>
</tr>
<tr>
<td><h5>slotmanager.max-total-resource.memory</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>MemorySize</td>
<td>Maximum memory size the Flink cluster allocates for slots. Resources for JobManager and TaskManager framework are excluded. If not configured, it will be derived from 'slotmanager.number-of-slots.max'.</td>
</tr>
<tr>
<td><h5>slotmanager.min-total-resource.cpu</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>Double</td>
<td>Minimum cpu cores the Flink cluster allocates for slots. Resources for JobManager and TaskManager framework are excluded. If not configured, it will be derived from 'slotmanager.number-of-slots.min'.</td>
</tr>
<tr>
<td><h5>slotmanager.min-total-resource.memory</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>MemorySize</td>
<td>Minimum memory size the Flink cluster allocates for slots. Resources for JobManager and TaskManager framework are excluded. If not configured, it will be derived from 'slotmanager.number-of-slots.min'.</td>
</tr>
<tr>
<td><h5>slotmanager.number-of-slots.max</h5></td>
<td style="word-wrap: break-word;">infinite</td>
<td>Integer</td>
<td>Defines the maximum number of slots that the Flink cluster allocates. This configuration option is meant for limiting the resource consumption for batch workloads. It is not recommended to configure this option for streaming workloads, which may fail if there are not enough slots. Note that this configuration option does not take effect for standalone clusters, where how many slots are allocated is not controlled by Flink.</td>
</tr>
<tr>
<td><h5>slotmanager.number-of-slots.min</h5></td>
<td style="word-wrap: break-word;">infinite</td>
<td>Integer</td>
<td>Defines the minimum number of slots that the Flink cluster allocates. This configuration option is meant for cluster to initialize certain workers in best efforts when starting. This can be used to speed up a job startup process. Note that this configuration option does not take effect for standalone clusters, where how many slots are allocated is not controlled by Flink.</td>
</tr>
<tr>
<td><h5>slow-task-detector.check-interval</h5></td>
<td style="word-wrap: break-word;">1 s</td>
<td>Duration</td>
<td>The interval to check slow tasks.</td>
</tr>
<tr>
<td><h5>slow-task-detector.execution-time.baseline-lower-bound</h5></td>
<td style="word-wrap: break-word;">1 min</td>
<td>Duration</td>
<td>The lower bound of slow task detection baseline.</td>
</tr>
<tr>
<td><h5>slow-task-detector.execution-time.baseline-multiplier</h5></td>
<td style="word-wrap: break-word;">1.5</td>
<td>Double</td>
<td>The multiplier to calculate the slow tasks detection baseline. Given that the parallelism is N and the ratio is R, define T as the median of the first N*R finished tasks' execution time. The baseline will be T*M, where M is the multiplier of the baseline. Note that the execution time will be weighted with the task's input bytes to ensure the accuracy of the detection if data skew occurs.</td>
</tr>
<tr>
<td><h5>slow-task-detector.execution-time.baseline-ratio</h5></td>
<td style="word-wrap: break-word;">0.75</td>
<td>Double</td>
<td>The finished execution ratio threshold to calculate the slow tasks detection baseline. Given that the parallelism is N and the ratio is R, define T as the median of the first N*R finished tasks' execution time. The baseline will be T*M, where M is the multiplier of the baseline. Note that the execution time will be weighted with the task's input bytes to ensure the accuracy of the detection if data skew occurs.</td>
</tr>
</tbody>
</table>