Database indexes are used to accelerate queries. To speed up different query scenarios, Doris supports various rich indexes.
From the perspective of accelerating queries and their principles, Doris indexes are categorized into two main types: point query indexes and skip indexes.
Among the above indexes, the prefix index and ZoneMap index are built-in indexes automatically maintained by Doris, requiring no user management. Inverted indexes, BloomFilter indexes, and NGram BloomFilter indexes need to be manually created and managed by the user based on the scenario.
| Type | Index | Advantages | Limitations |
|---|---|---|---|
| Point Query | Prefix Index | Built-in index, best performance. Only one prefix index per table. | Only one prefix index per table. |
| Point Query | Inverted Index | Supports tokenization and keyword matching. Building index on any column. Multi-condition combination and accelerating more functions. | Large index storage space, similar to raw data. |
| Skip | ZoneMap Index | Built-in index, small index storage space. Only one prefix index per table. | Only one prefix index per table. |
| Skip | BloomFilter Index | More precise than ZoneMap, medium index space. | Supports few query types. Only supports equal (not others: not equal, range, LIKE, MATCH). |
| Skip | NGram BloomFilter | Supports LIKE acceleration, medium index space. | Supports few query types. Only supports LIKE acceleration. |
| Operator or Function | Prefix Index | Inverted Index | ZoneMap Index | BloomFilter Index | NGram BloomFilter Index |
|---|---|---|---|---|---|
| = | YES | YES | YES | YES | NO |
| != | YES | YES | NO | NO | NO |
| IN | YES | YES | YES | YES | NO |
| NOT IN | YES | YES | NO | NO | NO |
| >, >=, <, <=, BETWEEN | YES | YES | YES | NO | NO |
| IS NULL | YES | YES | YES | NO | NO |
| IS NOT NULL | YES | YES | NO | NO | NO |
| LIKE | NO | NO | NO | NO | YES |
| MATCH, MATCH_* | NO | YES | NO | NO | NO |
| array_contains | NO | YES | NO | NO | NO |
| array_overlaps | NO | YES | NO | NO | NO |
| is_ip_address_in_range | NO | YES | NO | NO | NO |
The design and optimization of database table indexes are closely related to data distribution and queries, requiring testing and optimization based on the actual scenario. Although there is no “silver bullet,” Doris continuously strives to reduce the difficulty of using indexes. Users can follow these simple guidelines for index selection and testing.