docs/_docs/sql-reference/operational-commands.adoc - ignite-3 - Git at Google

 // Licensed to the Apache Software Foundation (ASF) under one or more
 // contributor license agreements.  See the NOTICE file distributed with
 // this work for additional information regarding copyright ownership.
 // The ASF licenses this file to You under the Apache License, Version 2.0
 // (the "License"); you may not use this file except in compliance with
 // the License.  You may obtain a copy of the License at
 //
 // http://www.apache.org/licenses/LICENSE-2.0
 //
 // Unless required by applicable law or agreed to in writing, software
 // distributed under the License is distributed on an "AS IS" BASIS,
 // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 // See the License for the specific language governing permissions and
 // limitations under the License.
 = Operational Commands

 == SELECT

 Retrieves data from a table or multiple tables.

 [.diagram-container]
 Diagram(
 Terminal('SELECT'),
 Optional(NonTerminal('hint_comment')),
 Optional(Sequence(
 Choice(1,
 Terminal('ALL'),
 Terminal('DISTINCT')
 ))),
 Sequence(
 OneOrMore(Sequence(Terminal('project_item', {href:'./grammar-reference/#project_item'})
 ),
 Terminal(',')
 ),
 ),
 Terminal('FROM'),
 NonTerminal('table_expression', {href:'./grammar-reference/#table_expression'}),
 Optional(
 Sequence(
 NonTerminal('WHERE'),
 Terminal('where_boolean_expression')
 )
 ),
 End({type:'complex'}))

 [.diagram-container]
 Diagram(
 Start({type:'complex'}),
 Optional(
 Sequence(
 NonTerminal('GROUP BY'),
 Optional(Choice(1,
 Terminal('ALL'),
 Terminal('DISTINCT')
 )),
 OneOrMore(Sequence(Terminal('group_item', {href:'./grammar-reference/#group_item'})
 ),
 Terminal(',')
 ),
 )
 ),
 Optional(
 Sequence(
 NonTerminal('HAVING'),
 Terminal('having_boolean_expression')
 )))

 === Parameters

 - `hint_comment` - an sql link:sql-tuning#optimizer-hints[optimizer hint].
 - `where_boolean_expression` - an SQL expression that is run against table records and returns a boolean value. Only the records for which `TRUE` was returned will be returned. If not specified, all matching records are returned.
 - `having_boolean_expression` - an SQL expression that is run against groups and returns a boolean value. Can use link:sql-reference/operators-and-functions#aggregate-functions[aggregate functions]. Only the groups for which `TRUE` was returned will be returned. If not specified, all matching groups are returned.


 === JOINs

 Apache Ignite supports colocated and non-colocated distributed SQL joins. Furthermore, if the data resides in different tables, Apache Ignite allows for cross-table joins as well.


 == COPY INTO

 === Description

 Imports data from an external source into a table or exports data to a file from the table. The table the data is imported into must exist.

 [.diagram-container]
 Diagram(
 Terminal('COPY'),
 Terminal('FROM'),
 NonTerminal('source'),
 Terminal('INTO'),
 NonTerminal('target'),
 Terminal('FORMAT'),
 NonTerminal('formatType'),
 Optional(
 Sequence(
 Terminal('WITH'),
 OneOrMore (NonTerminal('parameter', {href:'./grammar-reference/#parameter'}), Terminal(','))
 )
 )
 )

 === Parameters

 * `source` - the full path to the file to import the data from (if `target` is a table). Then name of the table and the list of columns to export the data from (if `target` is a file). If the source is a table, you can further narrow down the selection by using a query.
 * `target` - the table and table columns when importing data to the table. The path to the file when exporting data.
 * `formatType` - the format of the file to work with:
 ** `CSV`
 ** `PARQUET`
 ** `ICEBERG`
 * `WITH` - accepts additional parameters. All properties are case-sensitive:
 ** `delimiter` -  for CSV files only; default delimiter is `,`. Delimiter syntax is `'char'`. Any alphanumeric character can be a delimiter.
 ** `pattern` - the file pattern used when importing partitioned Parquet tables in the regular expression format. The regular expression must be enclosed in `'` signs. For example, `'.*'` imports all files. Partitioned column will not be imported.
 ** `header` - for CSV files only; if set to `true`, specifies that the created file should contain a header line with column names. Column names of the table are used to create the header line.
 ** `quoteChar` - for CSV files only; the character to use for quoted elements. Default quote character is `"`.
 ** `escapeChar` - for CSV files only; the character to use for escaping a separator or quote. Default escape character is `\`.
 ** `ignoreLeadingWhiteSpace` - for CSV files only; if `true`, ignore leading whitespace, keep them otherwise. Default value is `true`.
 ** `ignoreQuotations` - for CSV files only; if `true`, quoted elements are ignored, otherwise they are kept. Default value is `false`.
 ** `strictQuotes` - for CSV files only; if `true`, characters outside the quoted elements are ignored, otherwise they are kept. Default value is `false`.
 ** `batchSize` - the number of entries loaded in each batch. Default value is `1024`.
 ** `null` - what value empty values are mapped to on import, and `null` values are mapped to on export. By default, empty values are mapped to `null` on import, and `null` values are mapped to an empty string (`""`) on export.
 ** `S3`:
 *** `s3.client-region` - when using S3 storage, the region to store the data in.
 *** `s3.access-key-id` - when using S3 storage, the AWS access key.
 *** `s3.secret-access-key` - when using S3 storage,  the AWS secret key.
 *** Iceberg-specific:
 **** The link:https://iceberg.apache.org/docs/latest/configuration/#catalog-properties[catalog properties] are supported.
 **** The `table-identifier` property describes Apache Iceberg TableIdentifier names. The names can be dot-separated. For example, `db_name.table_name` or `table`.
 **** The `warehouse` path can be defined explicitly as a `'warehouse'='path'` property, or implicitly as a source or target `COPY FROM source INTO target`. If both ways are defined, the explicit property is used.

 === Examples

 Imports data from columns `name` and `age` of a CSV file with header into Table1 columns `name` and `age`:

 [source,sql]
 ----
 /* Import data from CSV with column headers */
 COPY FROM '/path/to/dir/data.csv'
 INTO Table1 (name, age)
 FORMAT CSV
 ----

 Imports data from the first two columns of a CSV file without header into Table1 columns `name` and `age`:

 [source,sql]
 ----
 /* Import data from CSV without column headers  */
 COPY FROM '/path/to/dir/data.csv'
 INTO Table1 (name, age)
 ----

 Imports data from columns `name` and `age` of a CSV file that uses the `~` symbol as quotation character:

 [source,sql]
 ----
 /* Import data from CSV with custom quotation character */
 COPY FROM '/path/to/dir/data.csv'
 INTO Table1 (name, age)
 FORMAT CSV
 WITH 'quoteChar'='~'
 ----

 Imports data from columns `name`, `age` and `empty` of a CSV file and maps all empty values to `no data`:

 [source,sql]
 ----
 /* Import data from CSV with custom quotation character */
 COPY FROM '/path/to/dir/data.csv'
 INTO Table1 (name, age, empty)
 FORMAT CSV
 WITH 'null'='no data'
 ----

 Imports data from CSV table in batches of 2048 entries:

 [source,sql]
 ----
 /* Import data from CSV without column headers  */
 COPY FROM '/path/to/dir/data.csv'
 INTO Table1 (name, age)
 WITH 'batchSize'='2048'
 ----

 Exports data from Table1 to a CSV file:

 [source,sql]
 ----
 /* Export data to CSV */
 COPY FROM (SELECT name, age FROM Table1)
 INTO  '/path/to/dir/data.csv'
 FORMAT CSV
 ----

 Imports CSV file from AWS S3 into Table1:

 [source,sql]
 ----
 /* Import CSV file from s3 */
 COPY FROM 's3://mybucket/data.csv'
 INTO Table1 (name, age)
 FORMAT CSV
 WITH 'delimiter'= '|', 's3.access-key-id' = 'keyid', 's3.secret-access-key' = 'secretkey'
 ----


 A simple example of exporting data to Iceberg. For working with local file system you can use HadoopCatalog that does not need to connect to a Hive MetaStore.

 [source,sql]
 ----
 COPY FROM Table1 (id,name,height)
 INTO '/tmp/person.i/'
 FORMAT ICEBERG
 WITH 'table-identifier'='person', 'catalog-impl'='org.apache.iceberg.hadoop.HadoopCatalog'
 ----

 Exports data into Iceberg on AWS S3:

 [source,sql]
 ----
 COPY FROM person (id,name,height)
 INTO 's3://iceberg-warehouse/glue-catalog'
 FORMAT ICEBERG
 WITH
     'table-identifier'='iceberg_db_1.person',
     'io-impl'='org.apache.iceberg.aws.s3.S3FileIO',
     'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',
     's3.client-region'='eu-central-1',
     's3.access-key-id'='YOUR_KEY',
     's3.secret-access-key'='YOUR_SECRET'
 ----

 WARNING: Glue catalog requires the table identifier pattern `db_name.table_name - [a-z0-9_]` (all letters must be in low case with underscores and no spaces). It can be disabled by setting the `glue.skip-name-validation` property to `true` to skip validation. When database name and table name validation are skipped, there is no guarantee that downstream systems would all support the names.

 Imports data from partitioned Parquet database:

 [source,sql]
 ----
 COPY FROM '/tmp/partitioned_table_dir'
 INTO city (id, name, population)
 FORMAT PARQUET
 WITH 'pattern' = '.*'
 ----

 Where the Parquet table looks like this:

 ----
 partitioned_table_dir/
 ├─ CountryCode=USA/
 │  ├─ 000000_0.parquet
 ├─ CountryCode=FR/
 │  ├─ 000000_0.parquet
 ----

 == KILL QUERY

 Cancels a running query. When a query is canceled with the `KILL` command, all parts of the query running on all other nodes are canceled too.

 [.diagram-container]
 Diagram(
 Terminal('KILL'),
 Terminal('QUERY'),
 NonTerminal('query_id'),
 Optional(Terminal('NO WAIT'))
 )

 === Parameters

 * `query_id` - query identifier that can be retrieved via the `SQL_QUERIES` link:administrators-guide/metrics/system-views[system view].
 * `NO WAIT` - if specified, the command will return control immediately, without waiting for the query to be cancelled. You can monitor query status through the `SQL_QUERIES` link:administrators-guide/metrics/system-views[system view] to make sure it was cancelled.

 == KILL TRANSACTION

 Cancels an active transaction.

 [.diagram-container]
 Diagram(
 Terminal('KILL'),
 Terminal('TRANSACTION'),
 NonTerminal('transaction_id'),
 Optional(Terminal('NO WAIT'))
 )

 === Parameters

 * `transaction_id` - transaction identifier that can be retrieved via the `TRANSACTIONS` link:administrators-guide/metrics/system-views[system view].
 * `NO WAIT` - if specified, the command will return control immediately, without waiting for the transaction to be cancelled. You can monitor transaction status through the `TRANSACTIONS` link:administrators-guide/metrics/system-views[system view] to make sure it was cancelled.

 == KILL COMPUTE

 Cancels a running compute job. When a job is canceled with the `KILL` command, all parts of the job running on all other nodes are canceled too.

 [.diagram-container]
 Diagram(
 Terminal('KILL'),
 Terminal('COMPUTE'),
 NonTerminal('job_id'),
 Optional(Terminal('NO WAIT'))
 )

 === Parameters

 * `job_id` - job identifier that can be retrieved via the `COMPUTE_JOBS` link:administrators-guide/metrics/system-views[system view].
 * `NO WAIT` - if specified, the command will return control immediately, without waiting for the job to be cancelled. You can monitor job status through the `COMPUTE_JOBS` link:administrators-guide/metrics/system-views[system view] to make sure it was cancelled.
	// Licensed to the Apache Software Foundation (ASF) under one or more
	// contributor license agreements. See the NOTICE file distributed with
	// this work for additional information regarding copyright ownership.
	// The ASF licenses this file to You under the Apache License, Version 2.0
	// (the "License"); you may not use this file except in compliance with
	// the License. You may obtain a copy of the License at
	//
	// http://www.apache.org/licenses/LICENSE-2.0
	//
	// Unless required by applicable law or agreed to in writing, software
	// distributed under the License is distributed on an "AS IS" BASIS,
	// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	// See the License for the specific language governing permissions and
	// limitations under the License.
	= Operational Commands

	== SELECT

	Retrieves data from a table or multiple tables.

	[.diagram-container]
	Diagram(
	Terminal('SELECT'),
	Optional(NonTerminal('hint_comment')),
	Optional(Sequence(
	Choice(1,
	Terminal('ALL'),
	Terminal('DISTINCT')
	))),
	Sequence(
	OneOrMore(Sequence(Terminal('project_item', {href:'./grammar-reference/#project_item'})
	),
	Terminal(',')
	),
	),
	Terminal('FROM'),
	NonTerminal('table_expression', {href:'./grammar-reference/#table_expression'}),
	Optional(
	Sequence(
	NonTerminal('WHERE'),
	Terminal('where_boolean_expression')
	)
	),
	End({type:'complex'}))

	[.diagram-container]
	Diagram(
	Start({type:'complex'}),
	Optional(
	Sequence(
	NonTerminal('GROUP BY'),
	Optional(Choice(1,
	Terminal('ALL'),
	Terminal('DISTINCT')
	)),
	OneOrMore(Sequence(Terminal('group_item', {href:'./grammar-reference/#group_item'})
	),
	Terminal(',')
	),
	)
	),
	Optional(
	Sequence(
	NonTerminal('HAVING'),
	Terminal('having_boolean_expression')
	)))

	=== Parameters

	- `hint_comment` - an sql link:sql-tuning#optimizer-hints[optimizer hint].
	- `where_boolean_expression` - an SQL expression that is run against table records and returns a boolean value. Only the records for which `TRUE` was returned will be returned. If not specified, all matching records are returned.
	- `having_boolean_expression` - an SQL expression that is run against groups and returns a boolean value. Can use link:sql-reference/operators-and-functions#aggregate-functions[aggregate functions]. Only the groups for which `TRUE` was returned will be returned. If not specified, all matching groups are returned.


	=== JOINs

	Apache Ignite supports colocated and non-colocated distributed SQL joins. Furthermore, if the data resides in different tables, Apache Ignite allows for cross-table joins as well.


	== COPY INTO

	=== Description

	Imports data from an external source into a table or exports data to a file from the table. The table the data is imported into must exist.

	[.diagram-container]
	Diagram(
	Terminal('COPY'),
	Terminal('FROM'),
	NonTerminal('source'),
	Terminal('INTO'),
	NonTerminal('target'),
	Terminal('FORMAT'),
	NonTerminal('formatType'),
	Optional(
	Sequence(
	Terminal('WITH'),
	OneOrMore (NonTerminal('parameter', {href:'./grammar-reference/#parameter'}), Terminal(','))
	)
	)
	)

	=== Parameters

	* `source` - the full path to the file to import the data from (if `target` is a table). Then name of the table and the list of columns to export the data from (if `target` is a file). If the source is a table, you can further narrow down the selection by using a query.
	* `target` - the table and table columns when importing data to the table. The path to the file when exporting data.
	* `formatType` - the format of the file to work with:
	** `CSV`
	** `PARQUET`
	** `ICEBERG`
	* `WITH` - accepts additional parameters. All properties are case-sensitive:
	** `delimiter` - for CSV files only; default delimiter is `,`. Delimiter syntax is `'char'`. Any alphanumeric character can be a delimiter.
	** `pattern` - the file pattern used when importing partitioned Parquet tables in the regular expression format. The regular expression must be enclosed in `'` signs. For example, `'.*'` imports all files. Partitioned column will not be imported.
	** `header` - for CSV files only; if set to `true`, specifies that the created file should contain a header line with column names. Column names of the table are used to create the header line.
	** `quoteChar` - for CSV files only; the character to use for quoted elements. Default quote character is `"`.
	** `escapeChar` - for CSV files only; the character to use for escaping a separator or quote. Default escape character is `\`.
	** `ignoreLeadingWhiteSpace` - for CSV files only; if `true`, ignore leading whitespace, keep them otherwise. Default value is `true`.
	** `ignoreQuotations` - for CSV files only; if `true`, quoted elements are ignored, otherwise they are kept. Default value is `false`.
	** `strictQuotes` - for CSV files only; if `true`, characters outside the quoted elements are ignored, otherwise they are kept. Default value is `false`.
	** `batchSize` - the number of entries loaded in each batch. Default value is `1024`.
	** `null` - what value empty values are mapped to on import, and `null` values are mapped to on export. By default, empty values are mapped to `null` on import, and `null` values are mapped to an empty string (`""`) on export.
	** `S3`:
	*** `s3.client-region` - when using S3 storage, the region to store the data in.
	*** `s3.access-key-id` - when using S3 storage, the AWS access key.
	*** `s3.secret-access-key` - when using S3 storage, the AWS secret key.
	*** Iceberg-specific:
	**** The link:https://iceberg.apache.org/docs/latest/configuration/#catalog-properties[catalog properties] are supported.
	**** The `table-identifier` property describes Apache Iceberg TableIdentifier names. The names can be dot-separated. For example, `db_name.table_name` or `table`.
	**** The `warehouse` path can be defined explicitly as a `'warehouse'='path'` property, or implicitly as a source or target `COPY FROM source INTO target`. If both ways are defined, the explicit property is used.

	=== Examples

	Imports data from columns `name` and `age` of a CSV file with header into Table1 columns `name` and `age`:

	[source,sql]
	----
	/* Import data from CSV with column headers */
	COPY FROM '/path/to/dir/data.csv'
	INTO Table1 (name, age)
	FORMAT CSV
	----

	Imports data from the first two columns of a CSV file without header into Table1 columns `name` and `age`:

	[source,sql]
	----
	/* Import data from CSV without column headers */
	COPY FROM '/path/to/dir/data.csv'
	INTO Table1 (name, age)
	----

	Imports data from columns `name` and `age` of a CSV file that uses the `~` symbol as quotation character:

	[source,sql]
	----
	/* Import data from CSV with custom quotation character */
	COPY FROM '/path/to/dir/data.csv'
	INTO Table1 (name, age)
	FORMAT CSV
	WITH 'quoteChar'='~'
	----

	Imports data from columns `name`, `age` and `empty` of a CSV file and maps all empty values to `no data`:

	[source,sql]
	----
	/* Import data from CSV with custom quotation character */
	COPY FROM '/path/to/dir/data.csv'
	INTO Table1 (name, age, empty)
	FORMAT CSV
	WITH 'null'='no data'
	----

	Imports data from CSV table in batches of 2048 entries:

	[source,sql]
	----
	/* Import data from CSV without column headers */
	COPY FROM '/path/to/dir/data.csv'
	INTO Table1 (name, age)
	WITH 'batchSize'='2048'
	----

	Exports data from Table1 to a CSV file:

	[source,sql]
	----
	/* Export data to CSV */
	COPY FROM (SELECT name, age FROM Table1)
	INTO '/path/to/dir/data.csv'
	FORMAT CSV
	----

	Imports CSV file from AWS S3 into Table1:

	[source,sql]
	----
	/* Import CSV file from s3 */
	COPY FROM 's3://mybucket/data.csv'
	INTO Table1 (name, age)
	FORMAT CSV
	WITH 'delimiter'= '\|', 's3.access-key-id' = 'keyid', 's3.secret-access-key' = 'secretkey'
	----


	A simple example of exporting data to Iceberg. For working with local file system you can use HadoopCatalog that does not need to connect to a Hive MetaStore.

	[source,sql]
	----
	COPY FROM Table1 (id,name,height)
	INTO '/tmp/person.i/'
	FORMAT ICEBERG
	WITH 'table-identifier'='person', 'catalog-impl'='org.apache.iceberg.hadoop.HadoopCatalog'
	----

	Exports data into Iceberg on AWS S3:

	[source,sql]
	----
	COPY FROM person (id,name,height)
	INTO 's3://iceberg-warehouse/glue-catalog'
	FORMAT ICEBERG
	WITH
	'table-identifier'='iceberg_db_1.person',
	'io-impl'='org.apache.iceberg.aws.s3.S3FileIO',
	'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',
	's3.client-region'='eu-central-1',
	's3.access-key-id'='YOUR_KEY',
	's3.secret-access-key'='YOUR_SECRET'
	----

	WARNING: Glue catalog requires the table identifier pattern `db_name.table_name - [a-z0-9_]` (all letters must be in low case with underscores and no spaces). It can be disabled by setting the `glue.skip-name-validation` property to `true` to skip validation. When database name and table name validation are skipped, there is no guarantee that downstream systems would all support the names.

	Imports data from partitioned Parquet database:

	[source,sql]
	----
	COPY FROM '/tmp/partitioned_table_dir'
	INTO city (id, name, population)
	FORMAT PARQUET
	WITH 'pattern' = '.*'
	----

	Where the Parquet table looks like this:

	----
	partitioned_table_dir/
	├─ CountryCode=USA/
	│ ├─ 000000_0.parquet
	├─ CountryCode=FR/
	│ ├─ 000000_0.parquet
	----

	== KILL QUERY

	Cancels a running query. When a query is canceled with the `KILL` command, all parts of the query running on all other nodes are canceled too.

	[.diagram-container]
	Diagram(
	Terminal('KILL'),
	Terminal('QUERY'),
	NonTerminal('query_id'),
	Optional(Terminal('NO WAIT'))
	)

	=== Parameters

	* `query_id` - query identifier that can be retrieved via the `SQL_QUERIES` link:administrators-guide/metrics/system-views[system view].
	* `NO WAIT` - if specified, the command will return control immediately, without waiting for the query to be cancelled. You can monitor query status through the `SQL_QUERIES` link:administrators-guide/metrics/system-views[system view] to make sure it was cancelled.

	== KILL TRANSACTION

	Cancels an active transaction.

	[.diagram-container]
	Diagram(
	Terminal('KILL'),
	Terminal('TRANSACTION'),
	NonTerminal('transaction_id'),
	Optional(Terminal('NO WAIT'))
	)

	=== Parameters

	* `transaction_id` - transaction identifier that can be retrieved via the `TRANSACTIONS` link:administrators-guide/metrics/system-views[system view].
	* `NO WAIT` - if specified, the command will return control immediately, without waiting for the transaction to be cancelled. You can monitor transaction status through the `TRANSACTIONS` link:administrators-guide/metrics/system-views[system view] to make sure it was cancelled.

	== KILL COMPUTE

	Cancels a running compute job. When a job is canceled with the `KILL` command, all parts of the job running on all other nodes are canceled too.

	[.diagram-container]
	Diagram(
	Terminal('KILL'),
	Terminal('COMPUTE'),
	NonTerminal('job_id'),
	Optional(Terminal('NO WAIT'))
	)

	=== Parameters

	* `job_id` - job identifier that can be retrieved via the `COMPUTE_JOBS` link:administrators-guide/metrics/system-views[system view].
	* `NO WAIT` - if specified, the command will return control immediately, without waiting for the job to be cancelled. You can monitor job status through the `COMPUTE_JOBS` link:administrators-guide/metrics/system-views[system view] to make sure it was cancelled.