| --- |
| { |
| "title": "Google Cloud Storage | Data Source", |
| "language": "en", |
| "description": "Doris provides two ways to load files from Google Cloud Storage:", |
| "sidebar_label": "Google Cloud Storage" |
| } |
| --- |
| |
| # Google Cloud Storage |
| |
| Doris provides two ways to load files from Google Cloud Storage: |
| - Use S3 Load to load Google Cloud Storage files into Doris, which is an asynchronous load method. |
| - Use TVF to load Google Cloud Storage files into Doris, which is a synchronous load method. |
| |
| ## load with S3 Load |
| |
| Use S3 Load to import files on object storage. For detailed steps, please refer to the [Broker Load Manual](../import-way/broker-load-manual) |
| |
| ### Step 1: Prepare the data |
| |
| Create a CSV file s3load_example.csv The file is stored on Google Cloud Storage and its content is as follows: |
| |
| ``` |
| 1,Emily,25 |
| 2,Benjamin,35 |
| 3,Olivia,28 |
| 4,Alexander,60 |
| 5,Ava,17 |
| 6,William,69 |
| 7,Sophia,32 |
| 8,James,64 |
| 9,Emma,37 |
| 10,Liam,64 |
| ``` |
| |
| ### Step 2: Create a table in Doris |
| |
| ```sql |
| CREATE TABLE test_s3load( |
| user_id BIGINT NOT NULL COMMENT "user id", |
| name VARCHAR(20) COMMENT "name", |
| age INT COMMENT "age" |
| ) |
| DUPLICATE KEY(user_id) |
| DISTRIBUTED BY HASH(user_id) BUCKETS 10; |
| ``` |
| |
| ### Step 3: Load data using S3 Load |
| |
| ```sql |
| LOAD LABEL s3_load_2022_04_01 |
| ( |
| DATA INFILE("s3://your_bucket_name/s3load_example.csv") |
| INTO TABLE test_s3load |
| COLUMNS TERMINATED BY "," |
| FORMAT AS "CSV" |
| (user_id, name, age) |
| ) |
| WITH S3 |
| ( |
| "provider" = "GCP", |
| "s3.endpoint" = "storage.us-west2.rep.googleapis.com", |
| "s3.region" = "US-WEST2", |
| "s3.access_key" = "<your-ak>", |
| "s3.secret_key" = "<your-sk>" |
| ) |
| PROPERTIES |
| ( |
| "timeout" = "3600" |
| ); |
| ``` |
| |
| ### Step 4: Check the imported data |
| |
| ```sql |
| SELECT * FROM test_s3load; |
| ``` |
| |
| Results: |
| |
| ``` |
| mysql> select * from test_s3load; |
| +---------+-----------+------+ |
| | user_id | name | age | |
| +---------+-----------+------+ |
| | 5 | Ava | 17 | |
| | 10 | Liam | 64 | |
| | 7 | Sophia | 32 | |
| | 9 | Emma | 37 | |
| | 1 | Emily | 25 | |
| | 4 | Alexander | 60 | |
| | 2 | Benjamin | 35 | |
| | 3 | Olivia | 28 | |
| | 6 | William | 69 | |
| | 8 | James | 64 | |
| +---------+-----------+------+ |
| 10 rows in set (0.04 sec) |
| ``` |
| |
| ## Load with TVF |
| |
| ### Step 1: Prepare the data |
| |
| Create a CSV file s3load_example.csv The file is stored on Google Cloud Storage and its content is as follows: |
| |
| ``` |
| 1,Emily,25 |
| 2,Benjamin,35 |
| 3,Olivia,28 |
| 4,Alexander,60 |
| 5,Ava,17 |
| 6,William,69 |
| 7,Sophia,32 |
| 8,James,64 |
| 9,Emma,37 |
| 10,Liam,64 |
| ``` |
| |
| ### Step 2: Create a table in Doris |
| |
| ```sql |
| CREATE TABLE test_s3load( |
| user_id BIGINT NOT NULL COMMENT "user id", |
| name VARCHAR(20) COMMENT "name", |
| age INT COMMENT "age" |
| ) |
| DUPLICATE KEY(user_id) |
| DISTRIBUTED BY HASH(user_id) BUCKETS 10; |
| ``` |
| |
| ### Step 3: Load data using TVF |
| |
| ```sql |
| INSERT INTO test_s3load |
| SELECT * FROM S3 |
| ( |
| "uri" = "s3://your_bucket_name/s3load_example.csv", |
| "format" = "csv", |
| "provider" = "GCP", |
| "s3.endpoint" = "storage.us-west2.rep.googleapis.com", |
| "s3.region" = "US-WEST2", |
| "s3.access_key" = "<your-ak>", |
| "s3.secret_key" = "<your-sk>", |
| "column_separator" = ",", |
| "csv_schema" = "user_id:int;name:string;age:int" |
| ); |
| ``` |
| |
| ### Step 4: Check the imported data |
| |
| ```sql |
| SELECT * FROM test_s3load; |
| ``` |
| |
| Results: |
| |
| ``` |
| mysql> select * from test_s3load; |
| +---------+-----------+------+ |
| | user_id | name | age | |
| +---------+-----------+------+ |
| | 5 | Ava | 17 | |
| | 10 | Liam | 64 | |
| | 7 | Sophia | 32 | |
| | 9 | Emma | 37 | |
| | 1 | Emily | 25 | |
| | 4 | Alexander | 60 | |
| | 2 | Benjamin | 35 | |
| | 3 | Olivia | 28 | |
| | 6 | William | 69 | |
| | 8 | James | 64 | |
| +---------+-----------+------+ |
| 10 rows in set (0.04 sec) |
| ``` |
| |