
 <table class="table table-bordered">
     <thead>
         <tr>
             <th class="text-left" style="width: 20%">Key</th>
             <th class="text-left" style="width: 15%">Default</th>
             <th class="text-left" style="width: 10%">Type</th>
             <th class="text-left" style="width: 55%">Description</th>
         </tr>
     </thead>
     <tbody>
         <tr>
             <td><h5>table.exec.async-lookup.buffer-capacity</h5><br> <span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">100</td>
             <td>Integer</td>
             <td>The maximum number of async I/O operations that the async lookup join can trigger.</td>
         </tr>
         <tr>
             <td><h5>table.exec.async-lookup.timeout</h5><br> <span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">3 min</td>
             <td>Duration</td>
             <td>The timeout for an asynchronous operation to complete.</td>
         </tr>
         <tr>
             <td><h5>table.exec.disabled-operators</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">(none)</td>
             <td>String</td>
             <td>Mainly for testing. A comma-separated list of operator names; each name represents a kind of operator to disable.
 Operators that can be disabled include "NestedLoopJoin", "ShuffleHashJoin", "BroadcastHashJoin", "SortMergeJoin", "HashAgg", "SortAgg".
 By default, no operator is disabled.</td>
         </tr>
         <tr>
             <td><h5>table.exec.mini-batch.allow-latency</h5><br> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">0 ms</td>
             <td>Duration</td>
             <td>The maximum latency that MiniBatch can use to buffer input records. MiniBatch is an optimization that buffers input records to reduce state access. MiniBatch is triggered at the allowed latency interval and when the maximum number of buffered records is reached. NOTE: If table.exec.mini-batch.enabled is set to true, this value must be greater than zero.</td>
         </tr>
         <tr>
             <td><h5>table.exec.mini-batch.enabled</h5><br> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">false</td>
             <td>Boolean</td>
             <td>Specifies whether to enable the MiniBatch optimization. MiniBatch is an optimization that buffers input records to reduce state access. It is disabled by default; set this option to true to enable it. NOTE: If mini-batch is enabled, 'table.exec.mini-batch.allow-latency' and 'table.exec.mini-batch.size' must be set.</td>
         </tr>
         <tr>
             <td><h5>table.exec.mini-batch.size</h5><br> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">-1</td>
             <td>Long</td>
             <td>The maximum number of input records that can be buffered for MiniBatch. MiniBatch is an optimization that buffers input records to reduce state access. MiniBatch is triggered at the allowed latency interval and when the maximum number of buffered records is reached. NOTE: MiniBatch currently only works for non-windowed aggregations. If table.exec.mini-batch.enabled is set to true, this value must be positive.</td>
         </tr>
         <tr>
             <td><h5>table.exec.resource.default-parallelism</h5><br> <span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">-1</td>
             <td>Integer</td>
             <td>Sets the default parallelism for all operators (such as aggregate, join, filter) to run with parallel instances. This config takes priority over, and overrides, the parallelism of StreamExecutionEnvironment. A value of -1 indicates that no default parallelism is set, in which case the parallelism of StreamExecutionEnvironment is used.</td>
         </tr>
         <tr>
             <td><h5>table.exec.shuffle-mode</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">"ALL_EDGES_BLOCKING"</td>
             <td>String</td>
             <td>Sets the exec shuffle mode.<br />Accepted values are:<ul><li><span markdown="span">`ALL_EDGES_BLOCKING`</span>: All edges will use blocking shuffle.</li><li><span markdown="span">`FORWARD_EDGES_PIPELINED`</span>: Forward edges will use pipelined shuffle, others blocking.</li><li><span markdown="span">`POINTWISE_EDGES_PIPELINED`</span>: Pointwise edges will use pipelined shuffle, others blocking. Pointwise edges include forward and rescale edges.</li><li><span markdown="span">`ALL_EDGES_PIPELINED`</span>: All edges will use pipelined shuffle.</li><li><span markdown="span">`batch`</span>: the same as <span markdown="span">`ALL_EDGES_BLOCKING`</span>. Deprecated.</li><li><span markdown="span">`pipelined`</span>: the same as <span markdown="span">`ALL_EDGES_PIPELINED`</span>. Deprecated.</li></ul>Note: Blocking shuffle means data will be fully produced before being sent to consumer tasks. Pipelined shuffle means data will be sent to consumer tasks as soon as it is produced.</td>
         </tr>
         <tr>
             <td><h5>table.exec.sink.not-null-enforcer</h5><br> <span class="label label-primary">Batch</span> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">ERROR</td>
             <td><p>Enum</p>Possible values: [ERROR, DROP]</td>
             <td>The NOT NULL column constraint on a table enforces that null values can't be inserted into the table. Flink supports 'error' (default) and 'drop' enforcement behavior. By default, Flink will check values and throw a runtime exception when null values are written into NOT NULL columns. Users can change the behavior to 'drop' to silently drop such records without throwing an exception.</td>
         </tr>
         <tr>
             <td><h5>table.exec.sort.async-merge-enabled</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">true</td>
             <td>Boolean</td>
             <td>Whether to asynchronously merge sorted spill files.</td>
         </tr>
         <tr>
             <td><h5>table.exec.sort.default-limit</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">-1</td>
             <td>Integer</td>
             <td>Default limit applied when the user does not set a limit after ORDER BY. -1 indicates that this configuration is ignored.</td>
         </tr>
         <tr>
             <td><h5>table.exec.sort.max-num-file-handles</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">128</td>
             <td>Integer</td>
             <td>The maximal fan-in for external merge sort. It limits the number of file handles per operator. If it is too small, it may cause intermediate merging. If it is too large, too many files will be open at the same time, which consumes memory and leads to random reads.</td>
         </tr>
         <tr>
             <td><h5>table.exec.source.cdc-events-duplicate</h5><br> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">false</td>
             <td>Boolean</td>
             <td>Indicates whether the CDC (Change Data Capture) sources in the job will produce duplicate change events that require the framework to deduplicate them to get a consistent result. A CDC source is a source that produces full change events, including INSERT/UPDATE_BEFORE/UPDATE_AFTER/DELETE, for example a Kafka source with the Debezium format. The value of this configuration is false by default.<br /><br />However, duplicate change events are common, because CDC tools (e.g. Debezium) usually work with at-least-once delivery when failover happens. Thus, in abnormal situations Debezium may deliver duplicate change events to Kafka, and Flink will receive the duplicate events. This may cause Flink queries to produce wrong results or unexpected exceptions.<br /><br />Therefore, it is recommended to turn on this configuration if your CDC tool provides at-least-once delivery. Enabling this configuration requires defining a PRIMARY KEY on the CDC sources. The primary key will be used to deduplicate change events and generate a normalized changelog stream at the cost of an additional stateful operator.</td>
         </tr>
         <tr>
             <td><h5>table.exec.source.idle-timeout</h5><br> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">0 ms</td>
             <td>Duration</td>
             <td>When a source does not receive any elements for the timeout duration, it will be marked as temporarily idle. This allows downstream tasks to advance their watermarks without waiting for watermarks from this source while it is idle. The default value is 0, which means that source idleness detection is not enabled.</td>
         </tr>
         <tr>
             <td><h5>table.exec.spill-compression.block-size</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">"64 kb"</td>
             <td>String</td>
             <td>The memory size used for compression when spilling data. The larger the memory, the higher the compression ratio, but the more memory the job will consume.</td>
         </tr>
         <tr>
             <td><h5>table.exec.spill-compression.enabled</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">true</td>
             <td>Boolean</td>
             <td>Whether to compress spilled data. Currently, compressing spilled data is supported only for sort, hash-agg, and hash-join operators.</td>
         </tr>
         <tr>
             <td><h5>table.exec.state.ttl</h5><br> <span class="label label-primary">Streaming</span></td>
             <td style="word-wrap: break-word;">0 ms</td>
             <td>Duration</td>
             <td>Specifies a minimum time interval for how long idle state (i.e. state which was not updated) will be retained. State will never be cleared until it has been idle for at least the minimum time, and will be cleared at some time after that. NOTE: Cleaning up state requires additional overhead for bookkeeping. The default value is 0, which means that state will never be cleaned up.</td>
         </tr>
         <tr>
             <td><h5>table.exec.window-agg.buffer-size-limit</h5><br> <span class="label label-primary">Batch</span></td>
             <td style="word-wrap: break-word;">100000</td>
             <td>Integer</td>
             <td>Sets the limit on the window element buffer size used in the group window agg operator.</td>
         </tr>
     </tbody>
 </table>
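 <p>The three <code>table.exec.mini-batch.*</code> options above are coupled, so a small model may help make the constraints concrete. The following is a minimal sketch in plain Python, not Flink code: <code>MiniBatchSettings</code> and <code>MiniBatchBuffer</code> are hypothetical names introduced only for illustration. It assumes that a batch is flushed as soon as either the record limit or the allowed latency is reached, and it checks the latency only when a record arrives, which is a simplification of a timer-driven trigger.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MiniBatchSettings:
    """Hypothetical holder for the three mini-batch options (not a Flink class)."""
    # Defaults mirror the table: disabled, 0 ms latency, size -1.
    enabled: bool = False
    allow_latency_ms: int = 0
    size: int = -1

    def validate(self) -> None:
        """Apply the NOTEs from the table: both limits must be positive when enabled."""
        if not self.enabled:
            return  # nothing to check while mini-batch is disabled
        if self.allow_latency_ms <= 0:
            raise ValueError("table.exec.mini-batch.allow-latency must be greater than zero")
        if self.size <= 0:
            raise ValueError("table.exec.mini-batch.size must be positive")


@dataclass
class MiniBatchBuffer:
    """Toy buffer: flushes when the size limit or the allowed latency is reached."""
    settings: MiniBatchSettings
    _records: List[str] = field(default_factory=list, init=False)
    _first_arrival_ms: Optional[int] = field(default=None, init=False)

    def add(self, record: str, now_ms: int) -> Optional[List[str]]:
        """Buffer one record; return the flushed batch, or None if still buffering."""
        if self._first_arrival_ms is None:
            self._first_arrival_ms = now_ms
        self._records.append(record)
        size_hit = len(self._records) >= self.settings.size
        latency_hit = now_ms - self._first_arrival_ms >= self.settings.allow_latency_ms
        if size_hit or latency_hit:
            batch, self._records, self._first_arrival_ms = self._records, [], None
            return batch
        return None
```

 <p>With <code>MiniBatchSettings(enabled=True, allow_latency_ms=5000, size=3)</code>, the third buffered record triggers a flush on size, while a record arriving 5000 ms or more after the first buffered one triggers a flush on latency. Calling <code>validate()</code> on <code>MiniBatchSettings(enabled=True)</code> raises, because the defaults of 0 ms and -1 are not valid once mini-batch is enabled.</p>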