| --- |
| { |
| "title": "CREATE-RESOURCE", |
| "language": "en" |
| } |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| ## CREATE-RESOURCE |
| |
| ### Name |
| |
| CREATE RESOURCE |
| |
| ### Description |
| |
This statement is used to create a resource. Only the root or admin user can create resources. Currently supported resource types are Spark, ODBC, S3, JDBC, HDFS, HMS, and ES, as shown in the examples below.
In the future, other external resources may be added to Doris, such as Spark/GPU for queries, HDFS/S3 for external storage, and MapReduce for ETL.
| |
Syntax:
| |
| ```sql |
| CREATE [EXTERNAL] RESOURCE "resource_name" |
| PROPERTIES ("key"="value", ...); |
| ``` |
| |
Notes:
| |
- The resource type must be specified in PROPERTIES via `"type" = "[spark|odbc_catalog|s3|jdbc|hdfs|hms|es]"`.
- The remaining PROPERTIES vary by resource type; see the examples below for details.
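
After a resource is created, it can be checked with the companion `SHOW RESOURCES` statement. A minimal sketch, where `resource_name` is a placeholder:

```sql
-- List a resource by name to confirm it exists and to inspect its properties;
-- "resource_name" is a placeholder for the name used at creation time.
SHOW RESOURCES WHERE NAME = "resource_name";
```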
| |
| ### Example |
| |
| 1. Create a Spark resource named spark0 in yarn cluster mode. |
| |
| ```sql |
| CREATE EXTERNAL RESOURCE "spark0" |
| PROPERTIES |
| ( |
| "type" = "spark", |
| "spark.master" = "yarn", |
| "spark.submit.deployMode" = "cluster", |
| "spark.jars" = "xxx.jar,yyy.jar", |
| "spark.files" = "/tmp/aaa,/tmp/bbb", |
| "spark.executor.memory" = "1g", |
| "spark.yarn.queue" = "queue0", |
| "spark.hadoop.yarn.resourcemanager.address" = "127.0.0.1:9999", |
| "spark.hadoop.fs.defaultFS" = "hdfs://127.0.0.1:10000", |
| "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris", |
| "broker" = "broker0", |
| "broker.username" = "user0", |
| "broker.password" = "password0" |
| ); |
| ``` |
| |
| Spark related parameters are as follows: |
| - spark.master: Required, currently supports yarn, spark://host:port. |
| - spark.submit.deployMode: The deployment mode of the Spark program, required, supports both cluster and client. |
| - spark.hadoop.yarn.resourcemanager.address: Required when master is yarn. |
| - spark.hadoop.fs.defaultFS: Required when master is yarn. |
| - Other parameters are optional, refer to [here](http://spark.apache.org/docs/latest/configuration.html) |
| |
| |
| |
`working_dir` and `broker` need to be specified when Spark is used for ETL. They are described as follows:
| |
| - working_dir: The directory used by the ETL. Required when spark is used as an ETL resource. For example: hdfs://host:port/tmp/doris. |
- broker: The broker name. Required when Spark is used as an ETL resource. The broker needs to be registered in advance with the `ALTER SYSTEM ADD BROKER` command (see the sketch after this list).
- broker.property_key: The authentication information that the broker needs when reading the intermediate files generated by ETL.
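
A minimal sketch of registering such a broker beforehand, assuming a broker process is already deployed; the host and port below are placeholders:

```sql
-- Register a broker named broker0 so it can be referenced by the "broker"
-- property of the Spark resource; "broker_host:8000" is a placeholder address.
ALTER SYSTEM ADD BROKER broker0 "broker_host:8000";
```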
| |
| 2. Create an ODBC resource |
| |
| ```sql |
| CREATE EXTERNAL RESOURCE `oracle_odbc` |
| PROPERTIES ( |
| "type" = "odbc_catalog", |
| "host" = "192.168.0.1", |
| "port" = "8086", |
| "user" = "test", |
| "password" = "test", |
| "database" = "test", |
| "odbc_type" = "oracle", |
| "driver" = "Oracle 19 ODBC driver" |
| ); |
| ``` |
| |
| The relevant parameters of ODBC are as follows: |
- host: IP address of the external database
- driver: The driver name used by the ODBC external table, which must be the same as the driver name in be/conf/odbcinst.ini.
- odbc_type: the type of the external database, currently supports oracle, mysql, postgresql
- user: username of the external database
- password: password of the corresponding user
- charset: connection charset
- Custom parameters supported by a particular ODBC driver can also be set; see the documentation of the corresponding ODBC driver
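
The resource can then be referenced when creating an ODBC external table. A minimal sketch, in which the table name and columns are hypothetical:

```sql
CREATE EXTERNAL TABLE `example_oracle_table` (
    `id` INT,
    `name` VARCHAR(64)
) ENGINE = ODBC
PROPERTIES (
    "odbc_catalog_resource" = "oracle_odbc", -- the resource created above
    "database" = "test",
    "table" = "example_table"                -- hypothetical table name
);
```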
| |
3. Create an S3 resource
| |
| ```sql |
| CREATE RESOURCE "remote_s3" |
| PROPERTIES |
| ( |
| "type" = "s3", |
| "s3.endpoint" = "bj.s3.com", |
| "s3.region" = "bj", |
| "s3.access_key" = "bbb", |
| "s3.secret_key" = "aaaa", |
    -- the following are optional
| "s3.connection.maximum" = "50", |
| "s3.connection.request.timeout" = "3000", |
| "s3.connection.timeout" = "1000" |
| ); |
| ``` |
| |
If the S3 resource is used for [cold-hot separation](../../../../../docs/advanced/cold_hot_separation.md), the following additional fields are required.
| ```sql |
| CREATE RESOURCE "remote_s3" |
| PROPERTIES |
| ( |
| "type" = "s3", |
| "s3.endpoint" = "bj.s3.com", |
| "s3.region" = "bj", |
| "s3.access_key" = "bbb", |
| "s3.secret_key" = "aaaa", |
| -- required by cooldown |
| "s3.root.path" = "/path/to/root", |
| "s3.bucket" = "test-bucket" |
| ); |
| ``` |
| |
| S3 related parameters are as follows: |
| - Required parameters |
| - `s3.endpoint`: s3 endpoint |
    - `s3.region`: s3 region
| - `s3.root.path`: s3 root directory |
| - `s3.access_key`: s3 access key |
| - `s3.secret_key`: s3 secret key |
    - `s3.bucket`: s3 bucket
- Optional parameters
| - `s3.connection.maximum`: the maximum number of s3 connections, the default is 50 |
| - `s3.connection.request.timeout`: s3 request timeout, in milliseconds, the default is 3000 |
| - `s3.connection.timeout`: s3 connection timeout, in milliseconds, the default is 1000 |
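
For cold-hot separation, the resource created above is typically referenced by a storage policy. A minimal sketch, assuming the Doris version in use supports `CREATE STORAGE POLICY`; the policy name and TTL value are placeholders:

```sql
-- Data older than the TTL is cooled down to the S3 resource "remote_s3";
-- "example_policy" and the 1-day TTL are placeholder values.
CREATE STORAGE POLICY example_policy
PROPERTIES (
    "storage_resource" = "remote_s3",
    "cooldown_ttl" = "1d"
);
```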
| |
4. Create a JDBC resource
| |
| ```sql |
| CREATE RESOURCE mysql_resource PROPERTIES ( |
| "type"="jdbc", |
| "user"="root", |
| "password"="123456", |
| "jdbc_url" = "jdbc:mysql://127.0.0.1:3316/doris_test?useSSL=false", |
| "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar", |
| "driver_class" = "com.mysql.cj.jdbc.Driver" |
| ); |
| ``` |
| |
| JDBC related parameters are as follows: |
- user: The username used to connect to the database
- password: The password used to connect to the database
| - jdbc_url: The identifier used to connect to the specified database |
| - driver_url: The url of JDBC driver package |
| - driver_class: The class of JDBC driver |
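
The JDBC resource can then be referenced when creating a JDBC external table. A minimal sketch, in which the table name and columns are hypothetical:

```sql
CREATE EXTERNAL TABLE `example_mysql_table` (
    `id` INT,
    `name` VARCHAR(128)
) ENGINE = JDBC
PROPERTIES (
    "resource" = "mysql_resource",   -- the resource created above
    "table" = "example_table",       -- hypothetical table name
    "table_type" = "mysql"
);
```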
| |
5. Create an HDFS resource
| |
| ```sql |
| CREATE RESOURCE hdfs_resource PROPERTIES ( |
| "type"="hdfs", |
| "hadoop.username"="user", |
| "root_path"="your_path", |
| "dfs.nameservices" = "my_ha", |
| "dfs.ha.namenodes.my_ha" = "my_namenode1, my_namenode2", |
| "dfs.namenode.rpc-address.my_ha.my_namenode1" = "nn1_host:rpc_port", |
| "dfs.namenode.rpc-address.my_ha.my_namenode2" = "nn2_host:rpc_port", |
| "dfs.client.failover.proxy.provider.my_ha" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" |
| ); |
| ``` |
| |
| HDFS related parameters are as follows: |
| - fs.defaultFS: namenode address and port |
| - hadoop.username: hdfs username |
- dfs.nameservices: if Hadoop HA is enabled, set the nameservice name; see hdfs-site.xml
- dfs.ha.namenodes.[nameservice ID]: unique identifiers for each NameNode in the nameservice; see hdfs-site.xml
- dfs.namenode.rpc-address.[nameservice ID].[name node ID]: the fully-qualified RPC address for each NameNode to listen on; see hdfs-site.xml
- dfs.client.failover.proxy.provider.[nameservice ID]: the Java class that HDFS clients use to contact the Active NameNode, usually org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
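
Depending on the Doris version, an HDFS resource can also be referenced by a storage policy, in the same way as the S3 resource above. A hypothetical sketch; the policy name and TTL are placeholders:

```sql
-- Cool down data older than the TTL to the HDFS resource "hdfs_resource";
-- "example_hdfs_policy" and the TTL are placeholder values.
CREATE STORAGE POLICY example_hdfs_policy
PROPERTIES (
    "storage_resource" = "hdfs_resource",
    "cooldown_ttl" = "1d"
);
```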
| |
6. Create an HMS resource
| |
An HMS resource is used to create an [HMS catalog](../../../../lakehouse/multi-catalog/multi-catalog.md).
| ```sql |
| CREATE RESOURCE hms_resource PROPERTIES ( |
| 'type'='hms', |
| 'hive.metastore.uris' = 'thrift://127.0.0.1:7004', |
| 'dfs.nameservices'='HANN', |
| 'dfs.ha.namenodes.HANN'='nn1,nn2', |
| 'dfs.namenode.rpc-address.HANN.nn1'='nn1_host:rpc_port', |
| 'dfs.namenode.rpc-address.HANN.nn2'='nn2_host:rpc_port', |
| 'dfs.client.failover.proxy.provider.HANN'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider' |
| ); |
| ``` |
| |
| HMS related parameters are as follows: |
- hive.metastore.uris: Hive metastore server address
Optional:
- dfs.*: If the Hive data is on HDFS, the HDFS resource parameters described above should be added, or hive-site.xml should be copied into fe/conf.
- s3.*: If the Hive data is on S3, the S3 resource parameters should be added. If [Aliyun Data Lake Formation](https://www.aliyun.com/product/bigdata/dlf) is used, copy hive-site.xml into fe/conf.
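
The multi-catalog documentation linked above covers catalog creation in detail. A minimal sketch of referencing the resource, assuming the `CREATE CATALOG ... WITH RESOURCE` form is available in the Doris version in use; the catalog name is a placeholder:

```sql
-- Create a Hive catalog backed by the HMS resource defined above;
-- "hive_catalog" is a placeholder catalog name.
CREATE CATALOG hive_catalog WITH RESOURCE hms_resource;
```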
| |
7. Create an ES resource
| |
| ```sql |
| CREATE RESOURCE es_resource PROPERTIES ( |
| "type"="es", |
| "hosts"="http://127.0.0.1:29200", |
| "nodes_discovery"="false", |
| "enable_keyword_sniff"="true" |
| ); |
| ``` |
| |
| ES related parameters are as follows: |
- hosts: ES connection address; one or more nodes may be given, and a load-balancer address is also accepted
- user: username for ES
- password: password for the user
- enable_docvalue_scan: whether to read the values of queried fields from ES/Lucene columnar storage (doc values); the default is true
- enable_keyword_sniff: whether to sniff text fields in ES and query them by their keyword sub-field; the default is true. If false, queries match against the analyzed (tokenized) content
- nodes_discovery: whether to enable ES node discovery; the default is true. In an isolated network, set this parameter to false so that only the specified nodes are connected
- http_ssl_enabled: whether the ES cluster enables HTTPS access; the current FE/BE implementation trusts all certificates
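
As with the HMS resource, an ES resource may be referenced when creating an ES catalog, again assuming the `CREATE CATALOG ... WITH RESOURCE` form is available in the Doris version in use. A minimal sketch with a placeholder catalog name:

```sql
-- Create an ES catalog backed by the ES resource defined above;
-- "es_catalog" is a placeholder catalog name.
CREATE CATALOG es_catalog WITH RESOURCE es_resource;
```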
| |
| ### Keywords |
| |
| CREATE, RESOURCE |
| |
| ### Best Practice |