• [LOAD DATA](#LOAD DATA)
  • [SHOW SEGMENTS](#SHOW SEGMENTS)
  • [DELETE SEGMENT BY ID](#DELETE SEGMENT BY ID)
  • [DELETE SEGMENT BY DATE](#DELETE SEGMENT BY DATE)

LOAD DATA

This command loads the user data in raw format to the Carbon specific data format store, this way Carbon provides good performance while querying the data.Please visit Data Management for more details on LOAD

Syntax

LOAD DATA [LOCAL] INPATH 'folder_path' INTO TABLE [db_name.]table_name 
            OPTIONS(property_name=property_value, ...)

Parameter Description

ParameterDescriptionOptional
folder_pathPath of raw csv data folder or file.NO
db_nameDatabase name, if it is not specified then it uses current database.YES
table_nameThe name of the table in provided database.NO
OPTIONSExtra options provided to LoadYES

Usage Guideline

Following are the options that can be used in load data:

  • DELIMITER: Delimiters can be provided in the load command.

    OPTIONS('DELIMITER'=',')
    
  • QUOTECHAR: Quote Characters can be provided in the load command.

    OPTIONS('QUOTECHAR'='"')
    
  • COMMENTCHAR: Comment Characters can be provided in the load command if user want to comment lines.

    OPTIONS('COMMENTCHAR'='#')
    
  • FILEHEADER: Headers can be provided in the LOAD DATA command if headers are missing in the source files.

    OPTIONS('FILEHEADER'='column1,column2') 
    
  • MULTILINE: CSV with new line character in quotes.

    OPTIONS('MULTILINE'='true') 
    
  • ESCAPECHAR: Escape char can be provided if user want strict validation of escape character on CSV.

    OPTIONS('ESCAPECHAR'='\') 
    
  • COMPLEX_DELIMITER_LEVEL_1: Split the complex type data column in a row (eg., a$b$c --> Array = {a,b,c}).

    OPTIONS('COMPLEX_DELIMITER_LEVEL_1'='$') 
    
  • COMPLEX_DELIMITER_LEVEL_2: Split the complex type nested data column in a row. Applies level_1 delimiter & applies level_2 based on complex data type (eg., a:b$c:d --> Array> = {{a,b},{c,d}}).

    OPTIONS('COMPLEX_DELIMITER_LEVEL_2'=':') 
    
  • ALL_DICTIONARY_PATH: All dictionary files path.

    OPTIONS('ALL_DICTIONARY_PATH'='/opt/alldictionary/data.dictionary')
    
  • COLUMNDICT: dictionary file path for single column.

    OPTIONS('COLUMNDICT'='column1:dictionaryFilePath1, column2:dictionaryFilePath2')
    

    Note: ALL_DICTIONARY_PATH and COLUMNDICT can't be used together.

Example:

LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
                       options('DELIMITER'=',', 'QUOTECHAR'='"', 'COMMENTCHAR'='#',
                               'FILEHEADER'='empno,empname,
                                designation,doj,workgroupcategory,
                                workgroupcategoryname,deptno,deptname,projectcode,
                                projectjoindate,projectenddate,attendance,utilization,salary',
                               'MULTILINE'='true', 'ESCAPECHAR'='\', 
                               'COMPLEX_DELIMITER_LEVEL_1'='$', 
                               'COMPLEX_DELIMITER_LEVEL_2'=':',
                               'ALL_DICTIONARY_PATH'='/opt/alldictionary/data.dictionary'
                               )

SHOW SEGMENTS

This command is to show the segments of carbon table to the user.

SHOW SEGMENTS FOR TABLE [db_name.]table_name LIMIT number_of_segments;

Parameter Description

ParameterDescriptionOptional
db_nameDatabase name, if it is not specified then it uses current database.YES
table_nameThe name of the table in provided database.NO
number_of_segmentslimit the output to this number.YES

Example:

SHOW SEGMENTS FOR TABLE CarbonDatabase.CarbonTable LIMIT 2;

DELETE SEGMENT BY ID

This command is to delete segment by using the segment ID.

DELETE SEGMENT segment_id1,segment_id2 FROM TABLE [db_name.]table_name;

Parameter Description

ParameterDescriptionOptional
segment_idSegment Id of the load.NO
db_nameDatabase name, if it is not specified then it uses current database.YES
table_nameThe name of the table in provided database.NO

Example:

DELETE SEGMENT 0 FROM TABLE CarbonDatabase.CarbonTable;
DELETE SEGMENT 0.1,5,8 FROM TABLE CarbonDatabase.CarbonTable;

Note: Here 0.1 is compacted segment sequence id.


DELETE SEGMENT BY DATE

This command will allow to deletes the Carbon segment(s) from the store based on the date provided by the user in the DML command. The segment created before the particular date will be removed from the specific stores.

DELETE SEGMENTS FROM TABLE [db_name.]table_name WHERE STARTTIME BEFORE [DATE_VALUE];

Parameter Description

ParameterDescriptionOptional
DATE_VALUEValid segement load start time value. All the segments before this specified date will be deleted.NO
db_nameDatabase name, if it is not specified then it uses current database.YES
table_nameThe name of the table in provided database.NO

Example:

DELETE SEGMENTS FROM TABLE CarbonDatabase.CarbonTable WHERE STARTTIME BEFORE '2017-06-01 12:05:06';