blob: c1e9bdbb2109243df87f59741489d46842f6a2d9 [file] [log] [blame]
************************
Data Definition Language
************************
========================
CREATE DATABASE
========================
*Synopsis*
.. code-block:: sql
CREATE DATABASE [IF NOT EXISTS] <database_name>
*Description*
Database is the namespace in Tajo. A database can contain multiple tables which have unique name in it.
``IF NOT EXISTS`` allows ``CREATE DATABASE`` statement to avoid an error which occurs when the database exists.
========================
DROP DATABASE
========================
*Synopsis*
.. code-block:: sql
DROP DATABASE [IF EXISTS] <database_name>
``IF EXISTS`` allows ``DROP DATABASE`` statement to avoid an error which occurs when the database does not exist.
========================
CREATE TABLE
========================
*Synopsis*
.. code-block:: sql
CREATE TABLE [IF NOT EXISTS] <table_name> [(column_list)] [TABLESPACE tablespace_name]
[using <storage_type> [with (<key> = <value>, ...)]] [AS <select_statement>]
CREATE EXTERNAL TABLE [IF NOT EXISTS] <table_name> (column_list)
using <storage_type> [with (<key> = <value>, ...)] LOCATION '<path>'
*Description*
In Tajo, there are two types of tables, `managed table` and `external table`.
Managed tables are placed on some predefined tablespaces. The ``TABLESPACE`` clause is to specify a tablespace for this table. For external tables, Tajo allows an arbitrary table location with the ``LOCATION`` clause.
For more information about tables and tablespace, please refer to :doc:`/table_management/table_overview` and :doc:`/table_management/tablespaces`.
``column_list`` is a sequence of the column name and its type like ``<column_name> <data_type>, ...``. Additionally, the `asterisk (*)` is allowed for external tables when their data format is `JSON`. You can find more details at :doc:`/table_management/json`.
``IF NOT EXISTS`` allows ``CREATE [EXTERNAL] TABLE`` statement to avoid an error which occurs when the table does not exist.
------------------------
Compression
------------------------
If you want to add an external table that contains compressed data, you should give 'compression.code' parameter to CREATE TABLE statement.
.. code-block:: sql
create EXTERNAL table lineitem (
L_ORDERKEY bigint,
L_PARTKEY bigint,
...
L_COMMENT text)
USING TEXT WITH ('text.delimiter'='|','compression.codec'='org.apache.hadoop.io.compress.SnappyCodec')
LOCATION 'hdfs://localhost:9010/tajo/warehouse/lineitem_100_snappy';
`compression.codec` parameter can have one of the following compression codecs:
* org.apache.hadoop.io.compress.BZip2Codec
* org.apache.hadoop.io.compress.DeflateCodec
* org.apache.hadoop.io.compress.GzipCodec
* org.apache.hadoop.io.compress.SnappyCodec
========================
DROP TABLE
========================
*Synopsis*
.. code-block:: sql
DROP TABLE [IF EXISTS] <table_name> [PURGE]
*Description*
``IF EXISTS`` allows ``DROP DATABASE`` statement to avoid an error which occurs when the database does not exist. ``DROP TABLE`` statement removes a table from Tajo catalog, but it does not remove the contents. If ``PURGE`` option is given, ``DROP TABLE`` statement will eliminate the entry in the catalog as well as the contents.
========================
CREATE INDEX
========================
*Synopsis*
.. code-block:: sql
CREATE INDEX [ name ] ON table_name [ USING method ]
( { column_name | ( expression ) } [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ WHERE predicate ]
*Description*
Tajo supports index for fast data retrieval. Currently, index is supported for only plain ``TEXT`` formats stored on ``HDFS``.
For more information, please refer to :doc:`/index_overview`.
------------------------
Index method
------------------------
Currently, Tajo supports only one type of index.
Index methods:
* TWO_LEVEL_BIN_TREE: This method is used by default in Tajo. For more information about its structure, please refer to :doc:`/index/types`.
========================
DROP INDEX
========================
*Synopsis*
.. code-block:: sql
DROP INDEX name