<table class="table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 20%">Key</th>
<th class="text-left" style="width: 15%">Default</th>
<th class="text-left" style="width: 10%">Type</th>
<th class="text-left" style="width: 55%">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><h5>pipeline.auto-generate-uids</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>When auto-generated UIDs are disabled, users are forced to manually specify UIDs on DataStream applications.<br /><br />It is highly recommended that users specify UIDs before deploying to production since they are used to match state in savepoints to operators in a job. Because auto-generated IDs are likely to change when a job is modified, specifying custom IDs allows an application to evolve over time without discarding state.</td>
</tr>
<tr>
<td><h5>pipeline.auto-type-registration</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>Controls whether Flink automatically registers all types in user programs with Kryo.</td>
</tr>
<tr>
<td><h5>pipeline.auto-watermark-interval</h5></td>
<td style="word-wrap: break-word;">0 ms</td>
<td>Duration</td>
<td>The interval of the automatic watermark emission. Watermarks are used throughout the streaming system to keep track of the progress of time. They are used, for example, for time-based windowing.</td>
</tr>
<tr>
<td><h5>pipeline.cached-files</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>List&lt;String&gt;</td>
<td>Files to be registered at the distributed cache under the given name. The files will be accessible from any user-defined function in the (distributed) runtime under a local path. Files may be local files (which will be distributed via BlobServer), or files in a distributed file system. The runtime will copy the files temporarily to a local cache, if needed.<br /><br />Example:<br /><span markdown="span">`name:file1,path:file:///tmp/file1;name:file2,path:hdfs:///tmp/file2`</span></td>
</tr>
<tr>
<td><h5>pipeline.classpaths</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>List&lt;String&gt;</td>
<td>A semicolon-separated list of the classpaths to package with the job jars to be sent to the cluster. These have to be valid URLs.</td>
</tr>
<tr>
<td><h5>pipeline.closure-cleaner-level</h5></td>
<td style="word-wrap: break-word;">RECURSIVE</td>
<td><p>Enum</p>Possible values: [NONE, TOP_LEVEL, RECURSIVE]</td>
<td>Configures the mode in which the closure cleaner works<ul><li><span markdown="span">`NONE`</span> - disables the closure cleaner completely</li><li><span markdown="span">`TOP_LEVEL`</span> - cleans only the top-level class without recursing into fields</li><li><span markdown="span">`RECURSIVE`</span> - cleans all the fields recursively</li></ul></td>
</tr>
<tr>
<td><h5>pipeline.default-kryo-serializers</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>List&lt;String&gt;</td>
<td>Semicolon-separated list of pairs of class names and Kryo serializer class names to be used as Kryo default serializers<br /><br />Example:<br /><span markdown="span">`class:org.example.ExampleClass,serializer:org.example.ExampleSerializer1; class:org.example.ExampleClass2,serializer:org.example.ExampleSerializer2`</span></td>
</tr>
<tr>
<td><h5>pipeline.force-avro</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>Forces Flink to use the Apache Avro serializer for POJOs.<br /><br />Important: Make sure to include the <span markdown="span">`flink-avro`</span> module.</td>
</tr>
<tr>
<td><h5>pipeline.force-kryo</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>If enabled, forces the TypeExtractor to use the Kryo serializer for POJOs even though they could be analyzed as a POJO. In some cases this might be preferable, for example, when using interfaces with subclasses that cannot be analyzed as a POJO.</td>
</tr>
<tr>
<td><h5>pipeline.generic-types</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>If the use of generic types is disabled, Flink will throw an <span markdown="span">`UnsupportedOperationException`</span> whenever it encounters a data type that would go through Kryo for serialization.<br /><br />Disabling generic types can be helpful to eagerly find and eliminate the use of types that would go through Kryo serialization during runtime. Rather than checking types individually, using this option will throw exceptions eagerly in the places where generic types are used.<br /><br />We recommend using this option only during development and pre-production phases, not during actual production use. The application program and/or the input data may be such that new, previously unseen types occur at some point. In that case, setting this option would cause the program to fail.</td>
</tr>
<tr>
<td><h5>pipeline.global-job-parameters</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>Map</td>
<td>Register a custom, serializable user configuration object. The configuration can be accessed in operators.</td>
</tr>
<tr>
<td><h5>pipeline.jars</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>List&lt;String&gt;</td>
<td>A semicolon-separated list of the jars to package with the job jars to be sent to the cluster. These have to be valid paths.</td>
</tr>
<tr>
<td><h5>pipeline.max-parallelism</h5></td>
<td style="word-wrap: break-word;">-1</td>
<td>Integer</td>
<td>The program-wide maximum parallelism used for operators which haven't specified a maximum parallelism. The maximum parallelism specifies the upper limit for dynamic scaling and the number of key groups used for partitioned state.</td>
</tr>
<tr>
<td><h5>pipeline.name</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
<td>The job name used for printing and logging.</td>
</tr>
<tr>
<td><h5>pipeline.object-reuse</h5></td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>When enabled, objects that Flink internally uses for deserialization and for passing data to user-code functions will be reused. Keep in mind that this can lead to bugs when a user-code function is not aware of this behaviour.</td>
</tr>
<tr>
<td><h5>pipeline.operator-chaining</h5></td>
<td style="word-wrap: break-word;">true</td>
<td>Boolean</td>
<td>Operator chaining allows non-shuffle operations to be co-located in the same thread, fully avoiding serialization and deserialization.</td>
</tr>
<tr>
<td><h5>pipeline.registered-kryo-types</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>List&lt;String&gt;</td>
<td>Semicolon-separated list of types to be registered with the serialization stack. If the type is eventually serialized as a POJO, then the type is registered with the POJO serializer. If the type ends up being serialized with Kryo, then it will be registered at Kryo to make sure that only tags are written.</td>
</tr>
<tr>
<td><h5>pipeline.registered-pojo-types</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>List&lt;String&gt;</td>
<td>Semicolon-separated list of types to be registered with the serialization stack. If the type is eventually serialized as a POJO, then the type is registered with the POJO serializer. If the type ends up being serialized with Kryo, then it will be registered at Kryo to make sure that only tags are written.</td>
</tr>
</tbody>
</table>
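<p>Options from this table can be set in the Flink configuration file (e.g. <span markdown="span">`flink-conf.yaml`</span>). A minimal sketch with purely illustrative values (the job name, interval, parallelism, and cached-file path below are hypothetical examples, not recommendations):</p>

```yaml
# Illustrative values only; adjust to your job's requirements.
pipeline.name: my-streaming-job              # hypothetical job name
pipeline.auto-watermark-interval: 200 ms     # emit watermarks every 200 ms
pipeline.max-parallelism: 128                # upper limit for rescaling / key groups
pipeline.object-reuse: false                 # keep default unless functions are reuse-safe
pipeline.cached-files: name:dict,path:hdfs:///tmp/dictionary  # hypothetical cache entry
```

<p>Note that list-valued options such as <span markdown="span">`pipeline.cached-files`</span> and <span markdown="span">`pipeline.classpaths`</span> take semicolon-separated entries, as described in their rows above.</p>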