| --- |
| { |
| "title": "Guidelines for Basic Use", |
| "language": "en" |
| } |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| |
| # User Guide |
| |
| Doris uses MySQL protocol for communication. Users can connect to Doris clusters through MySQL Client or MySQL JDBC. MySQL Client 5.1 or newer versions are recommended because they support user names of more than 16 characters. This topic walks you through how to use Doris with the example of MySQL Client. |
| |
| ## Create Users |
| |
| ### Root User Login and Change Password |
| |
| Doris has its built-in root, and the default password is empty. |
| |
| >Note: |
| > |
| >Doris provides a default root. |
| > |
| >The root user has all the privileges about the clusters by default. Users who have both Grant_priv and Node_priv can grant these privileges to other users. Node changing privileges include adding, deleting, and offlining FE, BE, and BROKER nodes. |
| > |
| >For more instructions on privileges, please refer to [Privilege Management](../admin-manual/privilege-ldap/user-privilege.md) |
| |
| After starting the Doris program, root or admin users can connect to Doris clusters. You can use the following command to log in to Doris. After login, you will enter the corresponding MySQL command line interface. |
| |
| ```sql |
| [root@doris ~]# mysql -h FE_HOST -P9030 -uroot |
| Welcome to the MySQL monitor. Commands end with ; or \g. |
| Your MySQL connection id is 41 |
| Server version: 5.1.0 Doris version 1.0.0-preview2-b48ee2734 |
| |
| Copyright (c) 2000, 2022, Oracle and/or its affiliates. |
| |
| Oracle is a registered trademark of Oracle Corporation and/or its |
| affiliates. Other names may be trademarks of their respective |
| owners. |
| |
| Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. |
| |
| mysql> |
| ``` |
| |
| >1. ` FE_HOST` is the IP address of any FE node. ` 9030 ` is the query_port configuration in fe.conf. |
| |
| After login, you can change the root password by the following command: |
| |
| ```sql |
| mysql> SET PASSWORD FOR 'root' = PASSWORD('your_password'); |
| Query OK, 0 rows affected (0.00 sec) |
| ``` |
| |
| > `your_password` is a new password for the `root` user, which can be set at will. A strong password is recommended for security. The new password is required in the next login. |
| |
| ### Create New Users |
| |
| You can create a regular user named `test` with the following command: |
| |
| ```sql |
| mysql> CREATE USER 'test' IDENTIFIED BY 'test_passwd'; |
| Query OK, 0 rows affected (0.00 sec) |
| ``` |
| |
| Follow-up logins can be performed with the following connection commands. |
| |
| ```sql |
| [root@doris ~]# mysql -h FE_HOST -P9030 -utest -ptest_passwd |
| ``` |
| |
| > By default, the newly created regular users do not have any privileges. Privileges can be granted to these users. |
| |
| ## Create Data Table and Import Data |
| |
| ### Create a database |
| |
| Initially, root or admin users can create a database by the following command: |
| |
| ```sql |
| CREATE DATABASE example_db; |
| ``` |
| |
| > You can use the `HELP` command to check the syntax of all commands. For example, `HELP CREATE DATABASE;`. Or you can refer to the [SHOW CREATE DATABASE](../sql-manual/sql-reference/Show-Statements/SHOW-CREATE-DATABASE.md) command manual. |
| > |
| > If you don't know the full name of the command, you can use "HELP + a field of the command" for fuzzy query. For example, if you type in `HELP CREATE`, you can find commands including `CREATE DATABASE`, `CREATE TABLE`, and `CREATE USER`. |
| > |
| > ``` |
| > mysql> HELP CREATE; |
| > Many help items for your request exist. |
| > To make a more specific request, please type 'help <item>', |
| > where <item> is one of the following |
| > topics: |
| > CREATE CLUSTER |
| > CREATE DATABASE |
| > CREATE ENCRYPTKEY |
| > CREATE FILE |
| > CREATE FUNCTION |
| > CREATE INDEX |
| > CREATE MATERIALIZED VIEW |
| > CREATE REPOSITORY |
| > CREATE RESOURCE |
| > CREATE ROLE |
| > CREATE SYNC JOB |
| > CREATE TABLE |
| > CREATE USER |
| > CREATE VIEW |
| > ROUTINE LOAD |
| > SHOW CREATE FUNCTION |
| > SHOW CREATE ROUTINE LOAD |
| > ``` |
| |
| After the database is created, you can view the information about the database via the [SHOW DATABASES](../sql-manual/sql-reference/Show-Statements/SHOW-DATABASES.md) command. |
| |
| ```sql |
| MySQL> SHOW DATABASES; |
| +--------------------+ |
| | Database | |
| +--------------------+ |
| | example_db | |
| | information_schema | |
| +--------------------+ |
| 2 rows in set (0.00 sec) |
| ``` |
| |
| > `information_schema` exists for compatibility with MySQL protocol, so the information might not be 100% accurate in practice. Therefore, for information about the specific databases, please query the corresponding databases directly. |
| |
| ### Authorize an Account |
| |
| After `example_db` is created, root/admin users can grant read/write privileges of `example_db` to regular users, such as `test`, using the `GRANT` command. After authorization, user `test` can perform operations on `example_db`. |
| |
| ```sql |
| mysql> GRANT ALL ON example_db TO test; |
| Query OK, 0 rows affected (0.01 sec) |
| ``` |
| |
| ### Create a Table |
| |
| You can create a table using the [CREATE TABLE](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md) command. For detailed parameters, you can send a `HELP CREATE TABLE;` command. |
| |
| Firstly, you need to switch to the target database using the [USE](../sql-manual/sql-reference/Utility-Statements/USE.md) command: |
| |
| ```sql |
| mysql> USE example_db; |
| Database changed |
| ``` |
| |
| Doris supports two table creation methods: [compound partitioning and single partitioning](./data-partition.md). The following takes the Aggregate Model as an example to demonstrate how to create tables with these two methods, respectively. |
| |
| #### Single Partitioning |
| |
| Create a logical table named `table1`. The bucketing column is the `siteid` column, and the number of buckets is 10. |
| |
| The table schema is as follows: |
| |
| * `siteid`: INT (4 bytes); default value: 10 |
| * `citycode`: SMALLINT (2 bytes) |
| * `username`: VARCHAR, with a maximum length of 32; default value: empty string |
| * `pv`: BIGINT (8 bytes); default value: 0; This is a metric column, and Doris will aggregate the metric columns internally. The `pv` column is aggregated by SUM. |
| |
| The corresponding CREATE TABLE statement is as follows: |
| ```sql |
| CREATE TABLE table1 |
| ( |
| siteid INT DEFAULT '10', |
| citycode SMALLINT, |
| username VARCHAR(32) DEFAULT '', |
| pv BIGINT SUM DEFAULT '0' |
| ) |
| AGGREGATE KEY(siteid, citycode, username) |
| DISTRIBUTED BY HASH(siteid) BUCKETS 10 |
| PROPERTIES("replication_num" = "1"); |
| ``` |
| |
| #### Compound Partitioning |
| |
| Create a logical table named `table2`. |
| |
| The table schema is as follows: |
| |
| * `event_day`: DATE; no default value |
| * `siteid`: INT (4 bytes); default value: 10 |
| * `citycode`: SMALLINT (2 bytes) |
| * `username`: VARCHAR, with a maximum length of 32; default value: empty string |
| * `pv`: BIGINT (8 bytes); default value: 0; This is a metric column, and Doris will aggregate the metric columns internally. The `pv` column is aggregated by SUM. |
| |
| Use the `event_day` column as the partitioning column and create 3 partitions: p201706, p201707, and p201708. |
| |
| * p201706: Range [Minimum, 2017-07-01) |
| * p201707: Range [2017-07-01, 2017-08-01) |
| * p201708: Range [2017-08-01, 2017-09-01) |
| |
| > Note that the intervals are left-closed and right-open. |
| |
| HASH bucket each partition based on `siteid`. The number of buckets per partition is 10. |
| |
| The corresponding CREATE TABLE statement is as follows: |
| ```sql |
| CREATE TABLE table2 |
| ( |
| event_day DATE, |
| siteid INT DEFAULT '10', |
| citycode SMALLINT, |
| username VARCHAR(32) DEFAULT '', |
| pv BIGINT SUM DEFAULT '0' |
| ) |
| AGGREGATE KEY(event_day, siteid, citycode, username) |
| PARTITION BY RANGE(event_day) |
| ( |
| PARTITION p201706 VALUES LESS THAN ('2017-07-01'), |
| PARTITION p201707 VALUES LESS THAN ('2017-08-01'), |
| PARTITION p201708 VALUES LESS THAN ('2017-09-01') |
| ) |
| DISTRIBUTED BY HASH(siteid) BUCKETS 10 |
| PROPERTIES("replication_num" = "1"); |
| ``` |
| |
| After the table is created, you can view the information of the table in `example_db`: |
| |
| ```sql |
| MySQL> SHOW TABLES; |
| +----------------------+ |
| | Tables_in_example_db | |
| +----------------------+ |
| | table1 | |
| | table2 | |
| +----------------------+ |
| 2 rows in set (0.01 sec) |
| |
| MySQL> DESC table1; |
| +----------+-------------+------+-------+---------+-------+ |
| | Field | Type | Null | Key | Default | Extra | |
| +----------+-------------+------+-------+---------+-------+ |
| | siteid | int(11) | Yes | true | 10 | | |
| | citycode | smallint(6) | Yes | true | N/A | | |
| | username | varchar(32) | Yes | true | | | |
| | pv | bigint(20) | Yes | false | 0 | SUM | |
| +----------+-------------+------+-------+---------+-------+ |
| 4 rows in set (0.00 sec) |
| |
| MySQL> DESC table2; |
| +-----------+-------------+------+-------+---------+-------+ |
| | Field | Type | Null | Key | Default | Extra | |
| +-----------+-------------+------+-------+---------+-------+ |
| | event_day | date | Yes | true | N/A | | |
| | siteid | int(11) | Yes | true | 10 | | |
| | citycode | smallint(6) | Yes | true | N/A | | |
| | username | varchar(32) | Yes | true | | | |
| | pv | bigint(20) | Yes | false | 0 | SUM | |
| +-----------+-------------+------+-------+---------+-------+ |
| 5 rows in set (0.00 sec) |
| ``` |
| |
| > Note: |
| > |
| > 1. As `replication_num` is set to `1`, the above tables are created with only one copy. We recommend that you adopt the default three-copy settings to ensure high availability. |
| > 2. You can dynamically add or delete partitions of compoundly partitioned tables. See `HELP ALTER TABLE`. |
| > 3. You can import data into the specified Partition. See `HELP LOAD;`. |
| > 4. You can dynamically change the table schema. See `HELP ALTER TABLE;`. |
| > 5. You can add Rollups to Tables to improve query performance. See the Rollup-related section in "Advanced Usage". |
| > 6. The default value in the Null column is true, which may affect query performance. |
| |
| ### Load data |
| |
| Doris supports a variety of data loading methods. You can refer to [Data Loading](../data-operate/import/load-manual.md) for more details. The following uses Stream Load and Broker Load as examples. |
| |
| #### Stream Load |
| |
| The Stream Load method transfers data to Doris via HTTP protocol. It can import local data directly without relying on any other systems or components. For the detailed syntax, please see `HELP STREAM LOAD;`. |
| |
| Example 1: Use "table1_20170707" as the Label, import the local file `table1_data` into `table1`. |
| |
| ```shell |
| curl --location-trusted -u test:test_passwd -H "label:table1_20170707" -H "column_separator:," -T table1_data http://FE_HOST:8030/api/example_db/table1/_stream_load |
| ``` |
| |
| > 1. FE_HOST is the IP address of any FE node and 8030 is the http_port in fe.conf. |
| > 2. You can use the IP address of any BE and the webserver_port in be.conf for import. For example: `BE_HOST:8040`. |
| |
| The local file `table1_data` uses `,` as the separator between data. The details are as follows: |
| |
| ```text |
| 1,1,Jim,2 |
| 2,1,grace,2 |
| 3,2,tom,2 |
| 4,3,bush,3 |
| 5,3,helen,3 |
| ``` |
| |
| Example 2: Use "table2_20170707" as the Label, import the local file `table2_data` into `table2`. |
| |
| ```shell |
| curl --location-trusted -u test:test -H "label:table2_20170707" -H "column_separator:|" -T table2_data http://127.0.0.1:8030/api/example_db/table2/_stream_load |
| ``` |
| |
| The local file `table2_data` uses `|` as the separator between data. The details are as follows: |
| |
| ``` |
| 2017-07-03|1|1|jim|2 |
| 2017-07-05|2|1|grace|2 |
| 2017-07-12|3|2|tom|2 |
| 2017-07-15|4|3|bush|3 |
| 2017-07-12|5|3|helen|3 |
| ``` |
| |
| > Note: |
| > |
| > 1. The recommended file size for Stream Load is less than 10GB. Excessive file size will result in higher retry cost. |
| > 2. Each batch of import data should have a Label. Label serves as the unique identifier of the load task, and guarantees that the same batch of data will only be successfully loaded into a database once. For more details, please see [Data Loading and Atomicity](https://doris.apache.org/docs/dev/data-operate/import/import-scenes/load-atomicity/). |
| > 3. Stream Load is a synchronous command. The return of the command indicates that the data has been loaded successfully; otherwise the data has not been loaded. |
| |
| #### Broker Load |
| |
| The Broker Load method imports externally stored data via deployed Broker processes. For more details, please see `HELP BROKER LOAD;` |
| |
| Example: Use "table1_20170708" as the Label, import files on HDFS into `table1` . |
| |
| ```sql |
| LOAD LABEL table1_20170708 |
| ( |
| DATA INFILE("hdfs://your.namenode.host:port/dir/table1_data") |
| INTO TABLE table1 |
| ) |
| WITH BROKER hdfs |
| ( |
| "username"="hdfs_user", |
| "password"="hdfs_password" |
| ) |
| PROPERTIES |
| ( |
| "timeout"="3600", |
| "max_filter_ratio"="0.1" |
| ); |
| ``` |
| |
| The Broker Load is an asynchronous command. Successful execution of it only indicates successful submission of the task. You can check if the import task has been completed by `SHOW LOAD;` . For example: |
| |
| ```sql |
| SHOW LOAD WHERE LABEL = "table1_20170708"; |
| ``` |
| |
| In the return result, if you find `FINISHED` in the `State` field, that means the import is successful. |
| |
| For more instructions on `SHOW LOAD`, see` HELP SHOW LOAD; `. |
| |
| Asynchronous import tasks can be cancelled before it is completed: |
| |
| ```sql |
| CANCEL LOAD WHERE LABEL = "table1_20170708"; |
| ``` |
| |
| ## Query the Data |
| |
| ### Simple Query |
| |
| Query example:: |
| |
| ```sql |
| MySQL> SELECT * FROM table1 LIMIT 3; |
| +--------+----------+----------+------+ |
| | siteid | citycode | username | pv | |
| +--------+----------+----------+------+ |
| | 2 | 1 | 'grace' | 2 | |
| | 5 | 3 | 'helen' | 3 | |
| | 3 | 2 | 'tom' | 2 | |
| +--------+----------+----------+------+ |
| 3 rows in set (0.01 sec) |
| |
| MySQL> SELECT * FROM table1 ORDER BY citycode; |
| +--------+----------+----------+------+ |
| | siteid | citycode | username | pv | |
| +--------+----------+----------+------+ |
| | 2 | 1 | 'grace' | 2 | |
| | 1 | 1 | 'jim' | 2 | |
| | 3 | 2 | 'tom' | 2 | |
| | 4 | 3 | 'bush' | 3 | |
| | 5 | 3 | 'helen' | 3 | |
| +--------+----------+----------+------+ |
| 5 rows in set (0.01 sec) |
| ``` |
| |
| ### SELECT * EXCEPT |
| |
| The `SELECT * EXCEPT` statement is used to exclude one or more columns from the result. The output will not include any of the specified columns. |
| |
| ```sql |
| MySQL> SELECT * except (username, citycode) FROM table1; |
| +--------+------+ |
| | siteid | pv | |
| +--------+------+ |
| | 2 | 2 | |
| | 5 | 3 | |
| | 3 | 2 | |
| +--------+------+ |
| 3 rows in set (0.01 sec) |
| ``` |
| |
| **Note**: `SELECT * EXCEPT` does not exclude columns that do not have a name. |
| |
| ### Join Query |
| |
| Query example:: |
| |
| ```sql |
| MySQL> SELECT SUM(table1.pv) FROM table1 JOIN table2 WHERE table1.siteid = table2.siteid; |
| +--------------------+ |
| | sum(`table1`.`pv`) | |
| +--------------------+ |
| | 12 | |
| +--------------------+ |
| 1 row in set (0.20 sec) |
| ``` |
| |
| ### Subquery |
| |
| Query example:: |
| |
| ```sql |
| MySQL> SELECT SUM(pv) FROM table2 WHERE siteid IN (SELECT siteid FROM table1 WHERE siteid > 2); |
| +-----------+ |
| | sum(`pv`) | |
| +-----------+ |
| | 8 | |
| +-----------+ |
| 1 row in set (0.13 sec) |
| ``` |
| |
| ## Change Table Schema |
| |
| Use the [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN.md) command to modify the table Schema, including the following changes. |
| |
| - Adding columns |
| - Deleting columns |
| - Modify column types |
| - Changing the order of columns |
| |
| The following table structure changes are illustrated by using the following example. |
| |
| The Schema for the original table1 is as follows: |
| |
| ```text |
| +----------+-------------+------+-------+---------+-------+ |
| | Field | Type | Null | Key | Default | Extra | |
| +----------+-------------+------+-------+---------+-------+ |
| | siteid | int(11) | No | true | 10 | | |
| | citycode | smallint(6) | No | true | N/A | | |
| | username | varchar(32) | No | true | | | |
| | pv | bigint(20) | No | false | 0 | SUM | |
| +----------+-------------+------+-------+---------+-------+ |
| ``` |
| |
| We add a new column uv, type BIGINT, aggregation type SUM, default value 0: |
| |
| ```sql |
| ALTER TABLE table1 ADD COLUMN uv BIGINT SUM DEFAULT '0' after pv; |
| ``` |
| |
| After successful submission, you can check the progress of the job with the following command: |
| |
| ```sql |
| SHOW ALTER TABLE COLUMN; |
| ``` |
| |
| When the job status is ``FINISHED``, the job is complete. The new Schema has taken effect. |
| |
| After ALTER TABLE is completed, you can view the latest Schema via ``DESC TABLE``. |
| |
| ``` |
| mysql> DESC table1; |
| +----------+-------------+------+-------+---------+-------+ |
| | Field | Type | Null | Key | Default | Extra | |
| +----------+-------------+------+-------+---------+-------+ |
| | siteid | int(11) | No | true | 10 | | |
| | citycode | smallint(6) | No | true | N/A | | |
| | username | varchar(32) | No | true | | | |
| | pv | bigint(20) | No | false | 0 | SUM | |
| | uv | bigint(20) | No | false | 0 | SUM | |
| +----------+-------------+------+-------+---------+-------+ |
| 5 rows in set (0.00 sec) |
| ``` |
| |
| You can cancel the currently executing job with the following command: |
| |
| ```sql |
| CANCEL ALTER TABLE COLUMN FROM table1; |
| ``` |
| |
| For more help, see ``HELP ALTER TABLE``. |
| |
| ## Rollup |
| |
| Rollup can be seen as a materialized index structure for a Table, **materialized** in the sense that its data is physically independent in storage, and **indexed** in the sense that Rollup can reorder columns to increase the hit rate of prefix indexes as well as reduce Key columns to increase the aggregation level of data. |
| |
| You can perform various changes to Rollup using [ALTER TABLE ROLLUP](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-ROLLUP.md). |
| |
| The following is an exemplified illustration. |
| |
| The original schema of `table1` is as follows: |
| |
| ``` |
| +----------+-------------+------+-------+---------+-------+ |
| | Field | Type | Null | Key | Default | Extra | |
| +----------+-------------+------+-------+---------+-------+ |
| | siteid | int(11) | No | true | 10 | | |
| | citycode | smallint(6) | No | true | N/A | | |
| | username | varchar(32) | No | true | | | |
| | pv | bigint(20) | No | false | 0 | SUM | |
| | uv | bigint(20) | No | false | 0 | SUM | |
| +----------+-------------+------+-------+---------+-------+ |
| ``` |
| |
| For `table1`, `siteid`, `citycode`, and `username` constitute a set of Key, based on which the pv fields are aggregated; if you have a frequent need to view the total city pv, you can create a Rollup consisting of only `citycode` and `pv`: |
| |
| ```sql |
| ALTER TABLE table1 ADD ROLLUP rollup_city(citycode, pv); |
| ``` |
| |
| After successful submission, you can check the progress of the task with the following command: |
| |
| ```sql |
| SHOW ALTER TABLE ROLLUP; |
| ``` |
| |
| If the task status is ``FINISHED``, the job is completed. |
| |
| After the Rollup is created, you can use ``DESC table1 ALL`` to check the information of the Rollup. |
| |
| ``` |
| mysql> desc table1 all; |
| +-------------+----------+-------------+------+-------+--------+-------+ |
| | IndexName | Field | Type | Null | Key | Default | Extra | |
| +-------------+----------+-------------+------+-------+---------+-------+ |
| | table1 | siteid | int(11) | No | true | 10 | | |
| | | citycode | smallint(6) | No | true | N/A | | |
| | | username | varchar(32) | No | true | | | |
| | | pv | bigint(20) | No | false | 0 | SUM | |
| | | uv | bigint(20) | No | false | 0 | SUM | |
| | | | | | | | | |
| | rollup_city | citycode | smallint(6) | No | true | N/A | | |
| | | pv | bigint(20) | No | false | 0 | SUM | |
| +-------------+----------+-------------+------+-------+---------+-------+ |
| 8 rows in set (0.01 sec) |
| ``` |
| |
| You can cancel the currently ongoing task using the following command: |
| |
| ```sql |
| CANCEL ALTER TABLE ROLLUP FROM table1; |
| ``` |
| |
| With created Rollups, you do not need to specify the Rollup in queries, but only specify the original table for the query. The program will automatically determine if Rollup should be used. You can check whether Rollup is hit or not using the ``EXPLAIN your_sql;`` command. |
| |
| For more help, see `HELP ALTER TABLE`. |
| |
| |
| |
| ## Materialized Views |
| |
| Materialized views are a space-for-time data analysis acceleration technique. Doris supports building materialized views on top of base tables. For example, a partial column-based aggregated view can be built on top of a table with a granular data model, allowing for fast querying of both granular and aggregated data. |
| |
| Doris can automatically ensure data consistency between materialized views and base tables, and automatically match the appropriate materialized view at query time, greatly reducing the cost of data maintenance for users and providing a consistent and transparent query acceleration experience. |
| |
| For more information about materialized views, see [Materialized Views](../query-acceleration/materialized-view.md) |
| |
| ## Data Table Queries |
| |
| ### Memory Limit |
| |
| To prevent excessive memory usage of one single query, Doris imposes memory limit on queries. By default, one query task should consume no more than 2GB of memory on one single BE node. |
| |
| If you find a `Memory limit exceeded` error, that means the program is trying to allocate more memory than the memory limit. You can solve this by optimizing your SQL statements. |
| |
| You can change the 2GB memory limit by modifying the memory parameter settings. |
| |
| Show the memory limit for one query: |
| |
| ```sql |
| mysql> SHOW VARIABLES LIKE "%mem_limit%"; |
| +---------------+------------+ |
| | Variable_name | Value | |
| +---------------+------------+ |
| | exec_mem_limit| 2147483648 | |
| +---------------+------------+ |
| 1 row in set (0.00 sec) |
| ``` |
| |
| `exec_mem_limit` is measured in byte. You can change the value of `exec_mem_limit` using the `SET` command. For example, you can change it to 8GB as follows: |
| |
| ```sql |
| mysql> SET exec_mem_limit = 8589934592; |
| Query OK, 0 rows affected (0.00 sec) |
| mysql> SHOW VARIABLES LIKE "%mem_limit%"; |
| +---------------+------------+ |
| | Variable_name | Value | |
| +---------------+------------+ |
| | exec_mem_limit| 8589934592 | |
| +---------------+------------+ |
| 1 row in set (0.00 sec) |
| ``` |
| |
| >- The above change is executed at the session level and is only valid for the currently connected session. The default memory limit will restore after reconnection. |
| >- If you need to change the global variable, you can set: `SET GLOBAL exec_mem_limit = 8589934592;`. After this, you disconnect and log back in, and then the new parameter will take effect permanently. |
| |
| ### Query Timeout |
| |
| The default query timeout is set to 300 seconds. If a query is not completed within 300 seconds, it will be cancelled by the Doris system. You change this parameter and customize the timeout for your application to achieve a blocking method similar to wait(timeout). |
| |
| View the current timeout settings: |
| |
| ```sql |
| mysql> SHOW VARIABLES LIKE "%query_timeout%"; |
| +---------------+-------+ |
| | Variable_name | Value | |
| +---------------+-------+ |
| | QUERY_TIMEOUT | 300 | |
| +---------------+-------+ |
| 1 row in set (0.00 sec) |
| ``` |
| |
| Change query timeout to 1 minute: |
| |
| ```sql |
| mysql> SET query_timeout = 60; |
| Query OK, 0 rows affected (0.00 sec) |
| ``` |
| |
| >- The current timeout check interval is 5 seconds, so if you set the query timeout to less than 5 seconds, it might not be executed accurately. |
| >- The above changes are also performed at the session level. You can change the global variable by `SET GLOBAL`. |
| |
| ### Broadcast/Shuffle Join |
| |
| The default way to implement Join is to filter the sub table conditionally, broadcast it to each node of the overall table to form a memory Hash table, and then stream read the data of the overall table for Hash Join, but if the filtered data of the sub table cannot be put into memory, the Join will not be completed, and then usually a memory overrun error will occur. |
| |
| In this case, it is recommended to explicitly specify Shuffle Join, also known as Partitioned Join, where both the sub table and overall table are Hashed according to the Key of the Join and then perform a distributed Join, with the memory consumption being spread across all compute nodes in the cluster. |
| |
| Doris will automatically attempt a Broadcast Join and switch to a Shuffle Join if the sub table is estimated to be too large; note that if a Broadcast Join is explicitly specified at this point, it will enforce Broadcast Join. |
| |
| Use Broadcast Join (default): |
| |
| ```sql |
| mysql> select sum(table1.pv) from table1 join table2 where table1.siteid = 2; |
| +--------------------+ |
| | sum(`table1`.`pv`) | |
| +--------------------+ |
| | 10 | |
| +--------------------+ |
| 1 row in set (0.20 sec) |
| ``` |
| |
| Use Broadcast Join (explicitly specified): |
| |
| ```sql |
| mysql> select sum(table1.pv) from table1 join [broadcast] table2 where table1.siteid = 2; |
| +--------------------+ |
| | sum(`table1`.`pv`) | |
| +--------------------+ |
| | 10 | |
| +--------------------+ |
| 1 row in set (0.20 sec) |
| ``` |
| |
| Use Shuffle Join: |
| |
| ```sql |
| mysql> select sum(table1.pv) from table1 join [shuffle] table2 where table1.siteid = 2; |
| +--------------------+ |
| | sum(`table1`.`pv`) | |
| +--------------------+ |
| | 10 | |
| +--------------------+ |
| 1 row in set (0.15 sec) |
| ``` |
| |
| ### Query Retry and High Availability |
| |
| When deploying multiple FE nodes, you can deploy a load balancing layer on top of multiple FEs to achieve high availability of Doris. |
| |
| Please refer to [Load Balancing](../admin-manual/cluster-management/load-balancing.md) for details on installation, deployment, and usage. |
| |
| ## Update and Delete Data |
| |
| Doris supports two methods to delete imported data. One is to use the `DELETE FROM` statement and specify the target data by the `WHERE` condition. This method is widely applicable and suitable for less frequent scheduled deletion tasks. |
| |
| The other method is only used in the Unique Models with a unique primary key. It imports the the primary key rows that are to be deleted, and the final physical deletion of the data is performed internally by Doris using the deletion mark. This method is suitable for real-time deletion of data. |
| |
| For specific instructions on deleting and updating data, see [Data Update](../data-operate/update-delete/update.md). |