---
{
    "title": "Select Into Outfile",
    "language": "en"
}
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
This document introduces how to use the `SELECT INTO OUTFILE` command to export query results.

For a detailed introduction to the `SELECT INTO OUTFILE` command, refer to: [SELECT INTO OUTFILE](../../sql-manual/sql-statements/Data-Manipulation-Statements/OUTFILE.md).

## Overview

The `SELECT INTO OUTFILE` command exports the result data of the `SELECT` statement to a target storage system, such as object storage, HDFS, or the local file system, in a specified file format.

`SELECT INTO OUTFILE` is a synchronous command: when it returns, the export has finished. On success, it returns the number, size, and paths of the exported files; on failure, it returns error information.

For guidance on choosing between `SELECT INTO OUTFILE` and `EXPORT`, see the [Export Overview](./export-overview.md).

### Supported Export Formats

`SELECT INTO OUTFILE` currently supports the following export formats:

- Parquet
- ORC
- CSV
- CSV with column names (`csv_with_names`)
- CSV with column names and types (`csv_with_names_and_types`)

Compressed formats are not supported.
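
For instance, a minimal sketch (the table name and local path are illustrative placeholders) that selects the `csv_with_names` format to include a header row:

```sql
-- Export with column names as the first row of each CSV file.
-- "tbl1" and the output path are placeholders for illustration.
SELECT * FROM tbl1
INTO OUTFILE "file:///home/work/path/result_"
FORMAT AS csv_with_names
PROPERTIES
(
    "column_separator" = ","
);
```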

### Example

```sql
mysql> SELECT * FROM tbl1 LIMIT 10 INTO OUTFILE "file:///home/work/path/result_";
+------------+-----------+----------+--------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                                |
+------------+-----------+----------+--------------------------------------------------------------------+
|          1 |         2 |        8 | file:///192.168.1.10/home/work/path/result_{fragment_instance_id}_ |
+------------+-----------+----------+--------------------------------------------------------------------+
```

Explanation of the returned results:

- **FileNumber**: The number of generated files.
- **TotalRows**: The number of rows in the result set.
- **FileSize**: The total size of the exported files in bytes.
- **URL**: The prefix of the exported file paths. Multiple files are numbered sequentially with the suffixes `_0`, `_1`, etc.

## Export File Column Type Mapping

`SELECT INTO OUTFILE` supports exporting to the Parquet and ORC file formats. Parquet and ORC have their own data types, and Doris automatically maps its data types to the corresponding Parquet and ORC data types. For the specific mapping relationships, refer to the "Export File Column Type Mapping" section of the [Export Overview](./export-overview.md) document.

## Examples

### Export to HDFS

Export query results to the `hdfs://path/to/` directory, specifying the export format as PARQUET:

```sql
SELECT c1, c2, c3 FROM tbl
INTO OUTFILE "hdfs://${host}:${fileSystem_port}/path/to/result_"
FORMAT AS PARQUET
PROPERTIES
(
    "fs.defaultFS" = "hdfs://ip:port",
    "hadoop.username" = "hadoop"
);
```

If HDFS is configured for high availability, provide the HA information. Note that the nameservice name (here `HDFS8000871`) must be used consistently across all HA-related properties, for example:

```sql
SELECT c1, c2, c3 FROM tbl
INTO OUTFILE "hdfs://HDFS8000871/path/to/result_"
FORMAT AS PARQUET
PROPERTIES
(
    "fs.defaultFS" = "hdfs://HDFS8000871",
    "hadoop.username" = "hadoop",
    "dfs.nameservices" = "HDFS8000871",
    "dfs.ha.namenodes.HDFS8000871" = "nn1,nn2",
    "dfs.namenode.rpc-address.HDFS8000871.nn1" = "ip:port",
    "dfs.namenode.rpc-address.HDFS8000871.nn2" = "ip:port",
    "dfs.client.failover.proxy.provider.HDFS8000871" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
);
```

If the Hadoop cluster is configured for high availability and Kerberos authentication is enabled, you can refer to the following SQL statement:

```sql
SELECT * FROM tbl
INTO OUTFILE "hdfs://path/to/result_"
FORMAT AS PARQUET
PROPERTIES
(
    "fs.defaultFS" = "hdfs://hacluster/",
    "hadoop.username" = "hadoop",
    "dfs.nameservices" = "hacluster",
    "dfs.ha.namenodes.hacluster" = "n1,n2",
    "dfs.namenode.rpc-address.hacluster.n1" = "192.168.0.1:8020",
    "dfs.namenode.rpc-address.hacluster.n2" = "192.168.0.2:8020",
    "dfs.client.failover.proxy.provider.hacluster" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "dfs.namenode.kerberos.principal" = "hadoop/_HOST@REALM.COM",
    "hadoop.security.authentication" = "kerberos",
    "hadoop.kerberos.principal" = "doris_test@REALM.COM",
    "hadoop.kerberos.keytab" = "/path/to/doris_test.keytab"
);
```

### Export to S3

Export query results to the `s3://path/to/` directory in S3 storage, specifying the export format as ORC. The `ak`, `sk`, and other required connection information must be provided:

```sql
SELECT * FROM tbl
INTO OUTFILE "s3://path/to/result_"
FORMAT AS ORC
PROPERTIES(
    "s3.endpoint" = "https://xxx",
    "s3.region" = "ap-beijing",
    "s3.access_key" = "your-ak",
    "s3.secret_key" = "your-sk"
);
```

### Export to Local File System

> Note:
> To export to the local file system, add `enable_outfile_to_local=true` in `fe.conf` and restart FE.

Export query results to the BE's `file:///path/to/` directory, specifying the export format as CSV, with a comma as the column separator:

```sql
SELECT k1 FROM tbl1 UNION SELECT k2 FROM tbl1
INTO OUTFILE "file:///path/to/result_"
FORMAT AS CSV
PROPERTIES(
    "column_separator" = ","
);
```

> Note:
> Exporting to local files is not suitable for public cloud users; it is intended only for private deployments, where users are assumed to have full control over the cluster nodes. Doris does not check the validity of the export path provided by the user: if the Doris process user does not have write permission for the path, or the path does not exist, an error is reported. For security reasons, if a file with the same name already exists at the path, the export fails. Doris also does not manage exported local files or check disk space; users must manage these files themselves, including cleaning them up.

## Best Practices

### Generate Export Success Indicator File

The `SELECT INTO OUTFILE` command is synchronous, so the connection may be interrupted while the SQL is executing, leaving it uncertain whether the export completed successfully and whether the exported data is complete. You can use the `success_file_name` parameter to generate an indicator file upon successful export.

Similar to Hive, users can determine whether the export completed successfully and whether the files in the export directory are complete by checking for the presence of the file specified by the `success_file_name` parameter.

For example, export the results of a `SELECT` statement to Tencent Cloud COS at `s3://my_bucket/path/my_file_`, specifying the export format as CSV and setting the success indicator file name to `SUCCESS`:

```sql
SELECT k1, k2, v1 FROM tbl1 LIMIT 100000
INTO OUTFILE "s3://my_bucket/path/my_file_"
FORMAT AS CSV
PROPERTIES
(
    "s3.endpoint" = "${endpoint}",
    "s3.region" = "ap-beijing",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk",
    "column_separator" = ",",
    "line_delimiter" = "\n",
    "success_file_name" = "SUCCESS"
);
```

Upon completion, an additional file named `SUCCESS` will be generated.

### Concurrent Export

By default, the result set of the `SELECT` query is aggregated to a single BE node, which exports the data in a single thread. However, in some cases (e.g., queries without an `ORDER BY` clause), concurrent export can be enabled so that multiple BE nodes export data simultaneously, improving export performance.

Here’s an example demonstrating how to enable concurrent export:

1. Enable the concurrent export session variable:

```sql
mysql> SET enable_parallel_outfile = true;
```

2. Execute the export command:

```sql
mysql> SELECT * FROM demo.tbl
    -> INTO OUTFILE "file:///path/to/ftw/export/exp_"
    -> FORMAT AS PARQUET;
+------------+-----------+----------+------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                              |
+------------+-----------+----------+------------------------------------------------------------------+
|          1 |    104494 |  7998308 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7d3_ |
|          1 |    104984 |  8052491 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7d5_ |
|          1 |    104345 |  7981406 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7d1_ |
|          1 |    104034 |  7977301 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7d4_ |
|          1 |    104238 |  7979757 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7d2_ |
|          1 |    159450 | 11870222 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7d0_ |
|          1 |    209691 | 16082100 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7ce_ |
|          1 |    208769 | 16004096 | file:///127.0.0.1/path/to/exp_1f850179e684476b-9bf001a6bf96d7cf_ |
+------------+-----------+----------+------------------------------------------------------------------+
```

With concurrent export successfully enabled, the result may consist of multiple rows, indicating that multiple threads exported data concurrently.

Adding an `ORDER BY` clause to the query prevents concurrent export, as the top-level sorting node necessitates single-threaded export:

```sql
mysql> SELECT * FROM demo.tbl ORDER BY id
    -> INTO OUTFILE "file:///path/to/ftw/export/exp_"
    -> FORMAT AS PARQUET;
+------------+-----------+----------+-------------------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                                           |
+------------+-----------+----------+-------------------------------------------------------------------------------+
|          1 |   1100005 | 80664607 | file:///127.0.0.1/mnt/disk2/ftw/export/exp_20c5461055774128-826256c0cfb3d8fc_ |
+------------+-----------+----------+-------------------------------------------------------------------------------+
```

Here, the result is a single row, indicating that no concurrent export was triggered.

Refer to the appendix for more details on the principles of concurrent export.

### Clear Export Directory Before Exporting

Use the `delete_existing_files` property to clear the export directory before writing new files:

```sql
SELECT * FROM tbl1
INTO OUTFILE "s3://my_bucket/export/my_file_"
FORMAT AS CSV
PROPERTIES
(
    "s3.endpoint" = "${endpoint}",
    "s3.region" = "region",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk",
    "column_separator" = ",",
    "line_delimiter" = "\n",
    "delete_existing_files" = "true"
);
```

If `"delete_existing_files" = "true"` is set, the export job first deletes all files and directories under `s3://my_bucket/export/`, then exports data to that directory.

> Note:
> To use the `delete_existing_files` parameter, you must also add `enable_delete_existing_files = true` to `fe.conf` and restart FE. This parameter is potentially dangerous and should only be used in a testing environment.

### Set Export File Size

Use the `max_file_size` property to limit the size of a single exported file:

```sql
SELECT * FROM tbl
INTO OUTFILE "s3://path/to/result_"
FORMAT AS ORC
PROPERTIES(
    "s3.endpoint" = "https://xxx",
    "s3.region" = "ap-beijing",
    "s3.access_key" = "your-ak",
    "s3.secret_key" = "your-sk",
    "max_file_size" = "2048MB"
);
```

Specifying `"max_file_size" = "2048MB"` limits the size of a single file to at most 2GB. If the result set is larger than that, multiple files are generated.

## Considerations

- **Export Data Volume and Efficiency**: The `SELECT INTO OUTFILE` command executes a SQL query. Without concurrent export, a single BE node exports the query results in a single thread, so the total export time includes both the query execution time and the time needed to write out the result set. Enabling concurrent export can reduce the export time.
- **Export Timeout**: The export command shares the same timeout as the query. If a large data volume causes the export to time out, you can extend the timeout by setting the session variable `query_timeout` (in seconds), e.g. `SET query_timeout = 3600;`.
- **Export File Management**: Doris does not manage exported files, whether they were exported successfully or are left over from failed exports; users must handle these files themselves. In addition, `SELECT INTO OUTFILE` does not check for the existence of files or file paths. Whether `SELECT INTO OUTFILE` automatically creates paths or overwrites existing files depends entirely on the semantics of the remote storage system.
- **Empty Result Sets**: Exporting an empty result set still generates an empty file.
- **File Splitting**: File splitting ensures that a single row of data is stored completely in one file, so the file size may not exactly equal `max_file_size`.
- **Non-visible Character Functions**: For functions whose output contains non-visible characters (e.g., the BITMAP and HLL types), the output is `\N` in CSV and NULL in Parquet/ORC. Currently, some geographic functions such as `ST_Point` output VARCHAR values containing encoded binary characters, which appear garbled; wrap them with `ST_AsText`, as shown in the sketch after this list.
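
A minimal sketch (the table `geo_tbl` and its columns are hypothetical placeholders) of wrapping a geometry value with `ST_AsText` so the exported CSV contains readable WKT text:

```sql
-- Export geometry as readable WKT text rather than encoded binary characters.
-- "geo_tbl", "id", and "location" are placeholders for illustration.
SELECT id, ST_AsText(location) AS location_wkt
FROM geo_tbl
INTO OUTFILE "file:///path/to/geo_result_"
FORMAT AS CSV
PROPERTIES(
    "column_separator" = ","
);
```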

## Appendix

### Concurrent Export Principles

- Principle Overview

  Doris is a high-performance, real-time analytical database based on the MPP (Massively Parallel Processing) architecture. MPP divides large datasets into small chunks and processes them in parallel across multiple nodes.
  Concurrent export in `SELECT INTO OUTFILE` leverages this parallel processing capability, allowing multiple BE nodes to export parts of the result set simultaneously.

- How to Determine Concurrent Export Eligibility

  - Ensure the session variable is enabled: `SET enable_parallel_outfile = true;`
  - Check the execution plan with `EXPLAIN`:

```sql
mysql> EXPLAIN SELECT ... INTO OUTFILE "s3://xxx" ...;
+-----------------------------------------------------------------------------+
| Explain String                                                              |
+-----------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                             |
|   OUTPUT EXPRS:<slot 2> | <slot 3> | <slot 4> | <slot 5>                    |
|   PARTITION: UNPARTITIONED                                                  |
|                                                                             |
|   RESULT SINK                                                               |
|                                                                             |
|   1:EXCHANGE                                                                |
|                                                                             |
| PLAN FRAGMENT 1                                                             |
|   OUTPUT EXPRS:`k1` + `k2`                                                  |
|   PARTITION: HASH_PARTITIONED: `default_cluster:test`.`multi_tablet`.`k1`   |
|                                                                             |
|   RESULT FILE SINK                                                          |
|   FILE PATH: s3://ml-bd-repo/bpit_test/outfile_1951_                        |
|   STORAGE TYPE: S3                                                          |
|                                                                             |
|   0:OlapScanNode                                                            |
|      TABLE: multi_tablet                                                    |
+-----------------------------------------------------------------------------+
```

The `EXPLAIN` command returns the query plan. If `RESULT FILE SINK` appears in `PLAN FRAGMENT 1`, the query can be exported concurrently; if it appears in `PLAN FRAGMENT 0`, concurrent export is not possible.

- Export Concurrency

  When the conditions for concurrent export are met, the export task's concurrency is determined by: `BE nodes * parallel_fragment_exec_instance_num`.
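
For example, a quick way to inspect and tune the per-BE parallelism for the current session (a sketch; the actual concurrency also depends on how many BE nodes the cluster has, and variable names can differ across Doris versions):

```sql
-- Check the current per-BE instance parallelism for this session.
SHOW VARIABLES LIKE 'parallel_fragment_exec_instance_num';

-- With, say, 3 BE nodes and an instance number of 2, a qualifying query
-- can be exported by up to 3 * 2 = 6 instances concurrently.
SET parallel_fragment_exec_instance_num = 2;
```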