Doris provides two ways to load files from HDFS:
Use HDFS Load to import files on HDFS. For detailed steps, please refer to the Broker Load Manual
Create a CSV file hdfsload_example.csv The file is stored on HDFS and its content is as follows:
1,Emily,25 2,Benjamin,35 3,Olivia,28 4,Alexander,60 5,Ava,17 6,William,69 7,Sophia,32 8,James,64 9,Emma,37 10,Liam,64
CREATE TABLE test_hdfsload( user_id BIGINT NOT NULL COMMENT "user id", name VARCHAR(20) COMMENT "name", age INT COMMENT "age" ) DUPLICATE KEY(user_id) DISTRIBUTED BY HASH(user_id) BUCKETS 10;
LOAD LABEL hdfs_load_2022_04_01 ( DATA INFILE("hdfs://127.0.0.1:8020/tmp/hdfsload_example.csv") INTO TABLE test_hdfsload COLUMNS TERMINATED BY "," FORMAT AS "CSV" (user_id, name, age) ) with HDFS ( "fs.defaultFS" = "hdfs://127.0.0.1:8020", "hadoop.username" = "user" ) PROPERTIES ( "timeout" = "3600" );
SELECT * FROM test_hdfsload;
Results:
mysql> select * from test_hdfsload; +---------+-----------+------+ | user_id | name | age | +---------+-----------+------+ | 5 | Ava | 17 | | 10 | Liam | 64 | | 7 | Sophia | 32 | | 9 | Emma | 37 | | 1 | Emily | 25 | | 4 | Alexander | 60 | | 2 | Benjamin | 35 | | 3 | Olivia | 28 | | 6 | William | 69 | | 8 | James | 64 | +---------+-----------+------+ 10 rows in set (0.04 sec)
Create a CSV file hdfsload_example.csv The file is stored on HDFS and its content is as follows:
1,Emily,25 2,Benjamin,35 3,Olivia,28 4,Alexander,60 5,Ava,17 6,William,69 7,Sophia,32 8,James,64 9,Emma,37 10,Liam,64
CREATE TABLE test_hdfsload( user_id BIGINT NOT NULL COMMENT "user id", name VARCHAR(20) COMMENT "name", age INT COMMENT "age" ) DUPLICATE KEY(user_id) DISTRIBUTED BY HASH(user_id) BUCKETS 10;
INSERT INTO test_hdfsload SELECT * FROM hdfs ( "uri" = "hdfs://127.0.0.1:8020/tmp/hdfsload_example.csv", "fs.defaultFS" = "hdfs://127.0.0.1:8020", "hadoop.username" = "doris", "format" = "csv", "csv_schema" = "user_id:int;name:string;age:int" );
SELECT * FROM test_hdfsload;
Results:
mysql> select * from test_hdfsload; +---------+-----------+------+ | user_id | name | age | +---------+-----------+------+ | 5 | Ava | 17 | | 10 | Liam | 64 | | 7 | Sophia | 32 | | 9 | Emma | 37 | | 1 | Emily | 25 | | 4 | Alexander | 60 | | 2 | Benjamin | 35 | | 3 | Olivia | 28 | | 6 | William | 69 | | 8 | James | 64 | +---------+-----------+------+ 10 rows in set (0.04 sec)