| <!--- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| # DML |
| |
| DML stands for "Data Manipulation Language" and relates to inserting |
| and modifying data in tables. |
| |
| ## COPY |
| |
| Copies the contents of a table or query to file(s). Supported file |
| formats are `parquet`, `csv`, `json`, and `arrow`. |
| |
| <pre> |
| COPY { <i><b>table_name</i></b> | <i><b>query</i></b> } |
| TO '<i><b>file_name</i></b>' |
| [ STORED AS <i><b>format</i></b> ] |
| [ PARTITIONED BY <i><b>column_name</i></b> [, ...] ] |
| [ OPTIONS( <i><b>option</i></b> [, ... ] ) ] |
| </pre> |
| |
| `STORED AS` specifies the file format the `COPY` command will write. If this |
| clause is not specified, it will be inferred from the file extension if possible. |
| |
| `PARTITIONED BY` specifies the columns to use for partitioning the output files into |
| separate hive-style directories. By default, columns used in `PARTITIONED BY` will be removed |
| from the output format. If you want to keep the columns, you should provide the option |
| `execution.keep_partition_by_columns true`. `execution.keep_partition_by_columns` flag can also |
| be enabled through `ExecutionOptions` within `SessionConfig`. |
| |
| The output format is determined by the first match of the following rules: |
| |
| 1. Value of `STORED AS` |
| 2. Filename extension (e.g. `foo.parquet` implies `PARQUET` format) |
| |
| For a detailed list of valid OPTIONS, see [Format Options](format_options.md). |
| |
| ### Examples |
| |
| Copy the contents of `source_table` to `file_name.json` in JSON format: |
| |
| ```sql |
| > COPY source_table TO 'file_name.json'; |
| +-------+ |
| | count | |
| +-------+ |
| | 2 | |
| +-------+ |
| ``` |
| |
| Copy the contents of `source_table` to one or more Parquet formatted |
| files in the `dir_name` directory: |
| |
| ```sql |
| > COPY source_table TO 'dir_name' STORED AS PARQUET; |
| +-------+ |
| | count | |
| +-------+ |
| | 2 | |
| +-------+ |
| ``` |
| |
| Copy the contents of `source_table` to multiple directories |
| of hive-style partitioned parquet files: |
| |
| ```sql |
| > COPY source_table TO 'dir_name' STORED AS parquet, PARTITIONED BY (column1, column2); |
| +-------+ |
| | count | |
| +-------+ |
| | 2 | |
| +-------+ |
| ``` |
| |
| If the data contains values of `x` and `y` in column1 and only `a` in |
| column2, output files will appear in the following directory structure: |
| |
| ```text |
| dir_name/ |
| column1=x/ |
| column2=a/ |
| <file>.parquet |
| <file>.parquet |
| ... |
| column1=y/ |
| column2=a/ |
| <file>.parquet |
| <file>.parquet |
| ... |
| ``` |
| |
| Run the query `SELECT * from source ORDER BY time` and write the |
| results (maintaining the order) to a parquet file named |
| `output.parquet` with a maximum parquet row group size of 10MB: |
| |
| ```sql |
| > COPY (SELECT * from source ORDER BY time) TO 'output.parquet' OPTIONS (MAX_ROW_GROUP_SIZE 10000000); |
| +-------+ |
| | count | |
| +-------+ |
| | 2 | |
| +-------+ |
| ``` |
| |
| ## INSERT |
| |
| ### Examples |
| |
| Insert values into a table. |
| |
| <pre> |
| INSERT INTO <i><b>table_name</i></b> { VALUES ( <i><b>expression</i></b> [, ...] ) [, ...] | <i><b>query</i></b> } |
| </pre> |
| |
| ```sql |
| > INSERT INTO target_table VALUES (1, 'Foo'), (2, 'Bar'); |
| +-------+ |
| | count | |
| +-------+ |
| | 2 | |
| +-------+ |
| ``` |