docs/layouts/shortcodes/generated/kafka_sync_database.html - paimon - Git at Google

 {{/*
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 */}}
 {{ $ref := ref . "maintenance/configurations.md" }}
 <table class="configuration table table-bordered">
     <thead>
     <tr>
         <th class="text-left" style="width: 15%">Configuration</th>
         <th class="text-left" style="width: 85%">Description</th>
     </tr>
     </thead>
     <tbody>
     <tr>
         <td><h5>--warehouse</h5></td>
         <td>The path to Paimon warehouse.</td>
     </tr>
     <tr>
         <td><h5>--database</h5></td>
         <td>The database name in Paimon catalog.</td>
     </tr>
     <tr>
         <td><h5>--ignore_incompatible</h5></td>
         <td>It is default false, in this case, if MySQL table name exists in Paimon and their schema is incompatible,an exception will be thrown. You can specify it to true explicitly to ignore the incompatible tables and exception.</td>
     </tr>
     <tr>
         <td><h5>--table_mapping</h5></td>
         <td>The table name mapping between source database and Paimon. For example, if you want to synchronize a source table named "test" to a Paimon table named "paimon_test", you can specify "--table_mapping test=paimon_test". Multiple mappings could be specified with multiple "--table_mapping" options. "--table_mapping" has higher priority than "--table_prefix" and "--table_suffix".</td>
     </tr>
     <tr>
         <td><h5>--table_prefix</h5></td>
         <td>The prefix of all Paimon tables to be synchronized except those specified by "--table_mapping" or "--table_prefix_db". For example, if you want all synchronized tables to have "ods_" as prefix, you can specify "--table_prefix ods_".</td>
     </tr>
     <tr>
         <td><h5>--table_suffix</h5></td>
         <td>The suffix of all Paimon tables to be synchronized except those specified by "--table_mapping" or "--table_suffix_db". The usage is same as "--table_prefix".</td>
     </tr>
     <tr>
         <td><h5>--table_prefix_db</h5></td>
         <td>The prefix of the Paimon tables to be synchronized from the specified db. For example, if you want to prefix the tables from db1 with "ods_db1_", you can specify "--table_prefix_db db1=ods_db1_". Multiple mappings could be specified multiple "--table_prefix_db" options. "--table_prefix_db" has higher priority than "--table_prefix".</td>
     </tr>
     <tr>
         <td><h5>--table_suffix_db</h5></td>
         <td>The suffix of the Paimon tables to be synchronized from the specified db. The usage is same as "--table_prefix_db".</td>
     </tr>
     <tr>
         <td><h5>--including_tables</h5></td>
         <td>It is used to specify which source tables are to be synchronized. You must use '|' to separate multiple tables.Because '|' is a special character, a comma is required, for example: 'a|b|c'.Regular expression is supported, for example, specifying "--including_tables test|paimon.*" means to synchronize table 'test' and all tables start with 'paimon'.</td>
     </tr>
     <tr>
         <td><h5>--excluding_tables</h5></td>
         <td>It is used to specify which source tables are not to be synchronized. The usage is same as "--including_tables". "--excluding_tables" has higher priority than "--including_tables" if you specified both.</td>
     </tr>
     <tr>
         <td><h5>--including_dbs</h5></td>
         <td>It is used to specify the databases within which the tables are to be synchronized. The usage is same as "--including_tables".</td>
     </tr>
     <tr>
         <td><h5>--excluding_dbs</h5></td>
         <td>It is used to specify the databases within which the tables are not to be synchronized. The usage is same as "--excluding_tables". "--excluding_dbs" has higher priority than "--including_dbs" if you specified both.</td>
     </tr>
     <tr>
         <td><h5>--type_mapping</h5></td>
         <td>It is used to specify how to map MySQL data type to Paimon type.<br />
             Supported options:
             <ul>
                 <li>"tinyint1-not-bool": maps MySQL TINYINT(1) to TINYINT instead of BOOLEAN.</li>
                 <li>"to-nullable": ignores all NOT NULL constraints (except for primary keys).
                     This is used to solve the problem that Flink cannot accept the MySQL 'ALTER TABLE ADD COLUMN column type NOT NULL DEFAULT x' operation.
                 </li>
                 <li>"to-string": maps all MySQL types to STRING.</li>
                 <li>"char-to-string": maps MySQL CHAR(length)/VARCHAR(length) types to STRING.</li>
                 <li>"longtext-to-bytes": maps MySQL LONGTEXT types to BYTES.</li>
                 <li>"bigint-unsigned-to-bigint": maps MySQL BIGINT UNSIGNED, BIGINT UNSIGNED ZEROFILL, SERIAL to BIGINT. You should ensure overflow won't occur when using this option.</li>
                 <li>"decimal-no-change": Ignore decimal type change.</li>
             </ul>
         </td>
     </tr>
     <tr>
         <td><h5>--computed_column</h5></td>
         <td>The definitions of computed columns. The argument field is from Kafka topic's table field name. See <a href="../overview/#computed-functions">here</a> for a complete list of configurations. NOTICE: It returns null if the referenced column does not exist in the source table.</td>
     </tr>
     <tr>
         <td><h5>--eager_init</h5></td>
         <td>It is default false. If true, all relevant tables commiter will be initialized eagerly, which means those tables could be forced to create snapshot.</td>
     </tr>
     <tr>
     <tr>
         <td><h5>--partition_keys</h5></td>
         <td>The partition keys for Paimon table. If there are multiple partition keys, connect them with comma, for example "dt,hh,mm".
             If the keys are not in source table, the sink table won't set partition keys.</td>
     </tr>
     <tr>
         <td><h5>--multiple_table_partition_keys</h5></td>
         <td>The partition keys for each different Paimon table. If there are multiple partition keys, connect them with comma, for example
             <li>--multiple_table_partition_keys  tableName1=col1,col2.col3</li>
             <li>--multiple_table_partition_keys  tableName2=col4,col5.col6</li>
             <li>--multiple_table_partition_keys  tableName3=col7,col8.col9</li>
             If the keys are not in source table, the sink table won't set partition keys.</td>
     </tr>
     <tr>
         <td><h5>--primary_keys</h5></td>
         <td>The primary keys for Paimon table. If there are multiple primary keys, connect them with comma, for example "buyer_id,seller_id".
             If the keys are not provided, but the source has primary keys, the sink table will use source's primary keys.
             Otherwise, the sink table won't set primary keys.
             If the keys are not provided, but the source has primary keys, and you don't want to use source's primary keys,
             use --sync_primary_keys_from_source_schema.</td>
     </tr>
     <tr>
         <td><h5>--sync_primary_keys_from_source_schema</h5></td>
         <td>This is used to specify if primary keys from source should be used in paimon schema if primary keys using --primary_keys are not specified. The default is true.</td>
     </tr>
     <tr>
     <tr>
         <td><h5>--kafka_conf</h5></td>
         <td>The configuration for Flink Kafka sources. Each configuration should be specified in the format `key=value`. `properties.bootstrap.servers`, `topic/topic-pattern`, `properties.group.id`,  and `value.format` are required configurations, others are optional.See its <a href="https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/table/kafka/#connector-options">document</a> for a complete list of configurations.</td>
     </tr>
     <tr>
         <td><h5>--catalog_conf</h5></td>
         <td>The configuration for Paimon catalog. Each configuration should be specified in the format "key=value". See <a href="{{ $ref }}#catalogoptions">here</a> for a complete list of catalog configurations.</td>
     </tr>
     <tr>
         <td><h5>--table_conf</h5></td>
         <td>The configuration for Paimon table sink. Each configuration should be specified in the format "key=value". See <a href="{{ $ref }}">here</a> for a complete list of table configurations.</td>
     </tr>
     </tbody>
 </table>
	{{/*
	Licensed to the Apache Software Foundation (ASF) under one
	or more contributor license agreements. See the NOTICE file
	distributed with this work for additional information
	regarding copyright ownership. The ASF licenses this file
	to you under the Apache License, Version 2.0 (the
	"License"); you may not use this file except in compliance
	with the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing,
	software distributed under the License is distributed on an
	"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	KIND, either express or implied. See the License for the
	specific language governing permissions and limitations
	under the License.
	*/}}
	{{ $ref := ref . "maintenance/configurations.md" }}
	<table class="configuration table table-bordered">
	<thead>
	<tr>
	<th class="text-left" style="width: 15%">Configuration</th>
	<th class="text-left" style="width: 85%">Description</th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td><h5>--warehouse</h5></td>
	<td>The path to Paimon warehouse.</td>
	</tr>
	<tr>
	<td><h5>--database</h5></td>
	<td>The database name in Paimon catalog.</td>
	</tr>
	<tr>
	<td><h5>--ignore_incompatible</h5></td>
	<td>It is default false, in this case, if MySQL table name exists in Paimon and their schema is incompatible,an exception will be thrown. You can specify it to true explicitly to ignore the incompatible tables and exception.</td>
	</tr>
	<tr>
	<td><h5>--table_mapping</h5></td>
	<td>The table name mapping between source database and Paimon. For example, if you want to synchronize a source table named "test" to a Paimon table named "paimon_test", you can specify "--table_mapping test=paimon_test". Multiple mappings could be specified with multiple "--table_mapping" options. "--table_mapping" has higher priority than "--table_prefix" and "--table_suffix".</td>
	</tr>
	<tr>
	<td><h5>--table_prefix</h5></td>
	<td>The prefix of all Paimon tables to be synchronized except those specified by "--table_mapping" or "--table_prefix_db". For example, if you want all synchronized tables to have "ods_" as prefix, you can specify "--table_prefix ods_".</td>
	</tr>
	<tr>
	<td><h5>--table_suffix</h5></td>
	<td>The suffix of all Paimon tables to be synchronized except those specified by "--table_mapping" or "--table_suffix_db". The usage is same as "--table_prefix".</td>
	</tr>
	<tr>
	<td><h5>--table_prefix_db</h5></td>
	<td>The prefix of the Paimon tables to be synchronized from the specified db. For example, if you want to prefix the tables from db1 with "ods_db1_", you can specify "--table_prefix_db db1=ods_db1_". Multiple mappings could be specified multiple "--table_prefix_db" options. "--table_prefix_db" has higher priority than "--table_prefix".</td>
	</tr>
	<tr>
	<td><h5>--table_suffix_db</h5></td>
	<td>The suffix of the Paimon tables to be synchronized from the specified db. The usage is same as "--table_prefix_db".</td>
	</tr>
	<tr>
	<td><h5>--including_tables</h5></td>
	<td>It is used to specify which source tables are to be synchronized. You must use '\|' to separate multiple tables.Because '\|' is a special character, a comma is required, for example: 'a\|b\|c'.Regular expression is supported, for example, specifying "--including_tables test\|paimon.*" means to synchronize table 'test' and all tables start with 'paimon'.</td>
	</tr>
	<tr>
	<td><h5>--excluding_tables</h5></td>
	<td>It is used to specify which source tables are not to be synchronized. The usage is same as "--including_tables". "--excluding_tables" has higher priority than "--including_tables" if you specified both.</td>
	</tr>
	<tr>
	<td><h5>--including_dbs</h5></td>
	<td>It is used to specify the databases within which the tables are to be synchronized. The usage is same as "--including_tables".</td>
	</tr>
	<tr>
	<td><h5>--excluding_dbs</h5></td>
	<td>It is used to specify the databases within which the tables are not to be synchronized. The usage is same as "--excluding_tables". "--excluding_dbs" has higher priority than "--including_dbs" if you specified both.</td>
	</tr>
	<tr>
	<td><h5>--type_mapping</h5></td>
	<td>It is used to specify how to map MySQL data type to Paimon type.<br />
	Supported options:
	<ul>
	<li>"tinyint1-not-bool": maps MySQL TINYINT(1) to TINYINT instead of BOOLEAN.</li>
	<li>"to-nullable": ignores all NOT NULL constraints (except for primary keys).
	This is used to solve the problem that Flink cannot accept the MySQL 'ALTER TABLE ADD COLUMN column type NOT NULL DEFAULT x' operation.
	</li>
	<li>"to-string": maps all MySQL types to STRING.</li>
	<li>"char-to-string": maps MySQL CHAR(length)/VARCHAR(length) types to STRING.</li>
	<li>"longtext-to-bytes": maps MySQL LONGTEXT types to BYTES.</li>
	<li>"bigint-unsigned-to-bigint": maps MySQL BIGINT UNSIGNED, BIGINT UNSIGNED ZEROFILL, SERIAL to BIGINT. You should ensure overflow won't occur when using this option.</li>
	<li>"decimal-no-change": Ignore decimal type change.</li>
	</ul>
	</td>
	</tr>
	<tr>
	<td><h5>--computed_column</h5></td>
	<td>The definitions of computed columns. The argument field is from Kafka topic's table field name. See <a href="../overview/#computed-functions">here</a> for a complete list of configurations. NOTICE: It returns null if the referenced column does not exist in the source table.</td>
	</tr>
	<tr>
	<td><h5>--eager_init</h5></td>
	<td>It is default false. If true, all relevant tables commiter will be initialized eagerly, which means those tables could be forced to create snapshot.</td>
	</tr>
	<tr>
	<tr>
	<td><h5>--partition_keys</h5></td>
	<td>The partition keys for Paimon table. If there are multiple partition keys, connect them with comma, for example "dt,hh,mm".
	If the keys are not in source table, the sink table won't set partition keys.</td>
	</tr>
	<tr>
	<td><h5>--multiple_table_partition_keys</h5></td>
	<td>The partition keys for each different Paimon table. If there are multiple partition keys, connect them with comma, for example
	<li>--multiple_table_partition_keys tableName1=col1,col2.col3</li>
	<li>--multiple_table_partition_keys tableName2=col4,col5.col6</li>
	<li>--multiple_table_partition_keys tableName3=col7,col8.col9</li>
	If the keys are not in source table, the sink table won't set partition keys.</td>
	</tr>
	<tr>
	<td><h5>--primary_keys</h5></td>
	<td>The primary keys for Paimon table. If there are multiple primary keys, connect them with comma, for example "buyer_id,seller_id".
	If the keys are not provided, but the source has primary keys, the sink table will use source's primary keys.
	Otherwise, the sink table won't set primary keys.
	If the keys are not provided, but the source has primary keys, and you don't want to use source's primary keys,
	use --sync_primary_keys_from_source_schema.</td>
	</tr>
	<tr>
	<td><h5>--sync_primary_keys_from_source_schema</h5></td>
	<td>This is used to specify if primary keys from source should be used in paimon schema if primary keys using --primary_keys are not specified. The default is true.</td>
	</tr>
	<tr>
	<tr>
	<td><h5>--kafka_conf</h5></td>
	<td>The configuration for Flink Kafka sources. Each configuration should be specified in the format `key=value`. `properties.bootstrap.servers`, `topic/topic-pattern`, `properties.group.id`, and `value.format` are required configurations, others are optional.See its <a href="https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/table/kafka/#connector-options">document</a> for a complete list of configurations.</td>
	</tr>
	<tr>
	<td><h5>--catalog_conf</h5></td>
	<td>The configuration for Paimon catalog. Each configuration should be specified in the format "key=value". See <a href="{{ $ref }}#catalogoptions">here</a> for a complete list of catalog configurations.</td>
	</tr>
	<tr>
	<td><h5>--table_conf</h5></td>
	<td>The configuration for Paimon table sink. Each configuration should be specified in the format "key=value". See <a href="{{ $ref }}">here</a> for a complete list of table configurations.</td>
	</tr>
	</tbody>
	</table>