blob: d60a732cd66bc0529c8f658dc0048b6ce41bd445 [file]
// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
= Calcite-based SQL Engine
Starting the 2.13 version, Apache Ignite includes a new SQL engine based on the Apache Calcite framework.
Apache Calcite is a dynamic data management framework, which mainly serves for mediating between applications, one or more data storage locations, and data processing engines. For more information on Apache Calcite, please see the link:https://calcite.apache.org/docs[product documentation, window=_blank].
The current H2-based SQL engine has a number of fundamental limitations of query execution in a distributed environment. To address these limitations, a new SQL engine was implemented. The new engine uses tools provided by Apache Calcite for parsing and planning queries. It also has a new query execution flow.
== Calcite Module Libraries
To use a Calcite-based engine, please make sure that the Calcite module libraries are in a classpath.
=== Standalone Mode
When starting a standalone node, move `optional/ignite-calcite` folder to the `libs` folder before running `ignite.{sh|bat}` script. In this case, the content of the module folder is added to the classpath.
=== Maven Configuration
If you are using Maven to manage dependencies of your project, you can add Calcite module dependency as follows: Replace `${ignite.version}` with the actual Apache Ignite version you are interested in:
[tabs]
--
tab:XML[]
[source,xml]
----
<dependency>
<groupId>org.apache.ignite</groupId>
<artifactId>ignite-calcite</artifactId>
<version>${ignite.version}</version>
</dependency>
----
--
== Configuring Query Engines
To enable engine, add the explicit `CalciteQueryEngineConfiguration` instance to the `SqlConfiguration.QueryEnginesConfiguration` property.
Below is a configuration example of two configured query engines (H2-based and Calcite-based engines) where the Calcite-based engine is chosen as a default one:
[tabs]
--
tab:XML[]
[source,xml]
----
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="sqlConfiguration">
<bean class="org.apache.ignite.configuration.SqlConfiguration">
<property name="queryEnginesConfiguration">
<list>
<bean class="org.apache.ignite.indexing.IndexingQueryEngineConfiguration">
<property name="default" value="false"/>
</bean>
<bean class="org.apache.ignite.calcite.CalciteQueryEngineConfiguration">
<property name="default" value="true"/>
</bean>
</list>
</property>
</bean>
</property>
...
</bean>
----
tab:Java[]
[source,java]
----
IgniteConfiguration cfg = new IgniteConfiguration().setSqlConfiguration(
new SqlConfiguration().setQueryEnginesConfiguration(
new IndexingQueryEngineConfiguration(),
new CalciteQueryEngineConfiguration().setDefault(true)
)
);
----
--
== Routing Queries to Query Engine
Normally, all queries are routed to the query engine that is configured by default. If more than one engine is configured through `queryEnginesConfiguration`, it's possible to use another engine instead of the one configured default for individual queries or for the whole connection.
=== JDBC
To choose a query engine for the JDBC connection, use the `queryEngine` connection parameter:
[tabs]
--
tab:JDBC Connection URL[]
[source,text]
----
jdbc:ignite:thin://127.0.0.1:10800?queryEngine=calcite
----
--
=== ODBC
To configure the query engine for the ODBC connection, use the `QUERY_ENGINE` property:
[tabs]
--
tab:ODBC Connection Properties[]
[source,text]
----
[IGNITE_CALCITE]
DRIVER={Apache Ignite};
SERVER=127.0.0.1;
PORT=10800;
SCHEMA=PUBLIC;
QUERY_ENGINE=CALCITE
----
--
== Memory Quotas [[memory-quotas]]
The Calcite-based SQL engine can track and limit heap memory accounted by Calcite query execution operators.
This is useful for protecting a server node from a single large query or from many concurrent memory-heavy queries.
Two types of quotas can be configured in `CalciteQueryEngineConfiguration`:
* `globalMemoryQuota` - a per-node heap memory quota for all Calcite SQL queries running on the node.
* `queryMemoryQuota` - a per-node heap memory quota for each Calcite SQL query running on the node.
Both quotas are disabled by default (`0`).
The quota values are specified in bytes.
If a quota is exceeded, the query fails with an exception.
The global quota error message contains `Global memory quota for SQL queries exceeded`, and the per-query quota error message contains `Query quota exceeded`.
The quota is applied to query execution structures that keep rows in heap memory, for example sorting, hash joins, hash aggregates, set operations, collection operations, spools, and result materialization.
It is not a process memory limit and it does not account for Ignite data regions, direct memory, JVM native memory, or the operating system page cache.
The quotas are per-node.
For a distributed query, total memory consumption across the cluster can be higher because the limit is applied independently on every participating node.
Size these quotas together with `-Xmx` and the expected SQL concurrency.
[tabs]
--
tab:XML[]
[source,xml]
----
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="sqlConfiguration">
<bean class="org.apache.ignite.configuration.SqlConfiguration">
<property name="queryEnginesConfiguration">
<list>
<bean class="org.apache.ignite.calcite.CalciteQueryEngineConfiguration">
<property name="default" value="true"/>
<property name="globalMemoryQuota" value="#{4L * 1024 * 1024 * 1024}"/>
<property name="queryMemoryQuota" value="#{512L * 1024 * 1024}"/>
</bean>
</list>
</property>
</bean>
</property>
</bean>
----
tab:Java[]
[source,java]
----
IgniteConfiguration cfg = new IgniteConfiguration().setSqlConfiguration(
new SqlConfiguration().setQueryEnginesConfiguration(
new CalciteQueryEngineConfiguration()
.setDefault(true)
.setGlobalMemoryQuota(4L * 1024 * 1024 * 1024)
.setQueryMemoryQuota(512L * 1024 * 1024)
)
);
----
--
== SQL Reference
=== DDL
Data definition language (DDL) statements are compliant with the old H2-based engine. You can find the DDL syntax description link:sql-reference/ddl[here, window=_blank].
=== DML
The new SQL engine mostly inherits data manipulation language (DML) statements syntax from the Apache Calcite framework. See the Apache Calcite SQL grammar description link:https://calcite.apache.org/docs/reference.html[here, window=_blank].
In most cases, statement syntax is compliant with the old SQL engine. But there are still some differences between DML dialects in H2-based engine and Calcite-based engine. For example, note the `MERGE` statement syntax has changed.
=== Transactions
The Calcite-based SQL engine supports SQL savepoint commands for explicit transactions. See link:sql-reference/transactions[Transactions, window=_blank] for syntax and usage details.
JDBC connections can also use the standard JDBC savepoint API. See link:SQL/JDBC/jdbc-driver#transaction-savepoints[JDBC Transaction Savepoints, window=_blank] for details.
=== Supported Functions
The Calcite-based SQL engine currently supports the following user-facing functions and operators.
==== Aggregate functions
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`COUNT`
|`COUNT([ALL \| DISTINCT] value)` or `COUNT(*)`
|Returns the number of input rows or non-null input values.
|`SUM`
|`SUM([ALL \| DISTINCT] numeric)`
|Returns the sum of numeric input values.
|`AVG`
|`AVG([ALL \| DISTINCT] numeric)`
|Returns the average value of numeric input values.
|`MIN`
|`MIN([ALL \| DISTINCT] value)`
|Returns the minimum input value.
|`MAX`
|`MAX([ALL \| DISTINCT] value)`
|Returns the maximum input value.
|`ANY_VALUE`
|`ANY_VALUE([ALL \| DISTINCT] value)`
|Returns one arbitrary input value.
|`LISTAGG`
|`LISTAGG(value[, separator]) WITHIN GROUP (ORDER BY sortExpr)`
|Concatenates values from a group in the requested order.
|`GROUP_CONCAT`
|`GROUP_CONCAT(value[, separator] [ORDER BY sortExpr])`
|Concatenates values from a group, optionally using a separator and ordering.
|`STRING_AGG`
|`STRING_AGG(value, separator [ORDER BY sortExpr])`
|Concatenates string values from a group using a separator.
|`ARRAY_AGG`
|`ARRAY_AGG(value [ORDER BY sortExpr])`
|Collects input values into an array.
|`ARRAY_CONCAT_AGG`
|`ARRAY_CONCAT_AGG(arrayValue [ORDER BY sortExpr])`
|Concatenates arrays from input rows into one array.
|`EVERY`
|`EVERY(condition)`
|Returns `TRUE` if every input condition is `TRUE`.
|`SOME`
|`SOME(condition)`
|Returns `TRUE` if at least one input condition is `TRUE`.
|`BIT_AND`
|`BIT_AND(integer)`
|Aggregates integer values using bitwise AND.
|`BIT_OR`
|`BIT_OR(integer)`
|Aggregates integer values using bitwise OR.
|`BIT_XOR`
|`BIT_XOR(integer)`
|Aggregates integer values using bitwise XOR.
|Aggregate `FILTER` clause
|`aggregateFunction(...) FILTER (WHERE condition)`
|Applies an aggregate function only to input rows that satisfy the condition.
|===
==== String functions and predicates
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`UPPER`
|`UPPER(string)`
|Converts a string to uppercase.
|`LOWER`
|`LOWER(string)`
|Converts a string to lowercase.
|`INITCAP`
|`INITCAP(string)`
|Converts each word to initial uppercase followed by lowercase characters.
|`TO_BASE64`
|`TO_BASE64(string)`
|Encodes a string using Base64.
|`FROM_BASE64`
|`FROM_BASE64(string)`
|Decodes a Base64 string.
|`MD5`
|`MD5(string)`
|Returns the MD5 hash of a string as a hexadecimal string.
|`SHA1`
|`SHA1(string)`
|Returns the SHA-1 hash of a string as a hexadecimal string.
|`SUBSTRING`
|`SUBSTRING(string FROM start [FOR length])`
|Returns a substring starting at the specified position.
|`LEFT`
|`LEFT(string, length)`
|Returns the leftmost characters of a string.
|`RIGHT`
|`RIGHT(string, length)`
|Returns the rightmost characters of a string.
|`REPLACE`
|`REPLACE(string, search[, replacement])`
|Replaces occurrences of a search string.
|`TRANSLATE`
|`TRANSLATE(string, fromString, toString)`
|Replaces characters from one set with the corresponding characters from another set.
|`CHR`
|`CHR(integer)`
|Returns the character for the specified code point.
|`CHAR_LENGTH`
|`CHAR_LENGTH(string)`
|Returns the number of characters in a string.
|`CHARACTER_LENGTH`
|`CHARACTER_LENGTH(string)`
|Returns the number of characters in a string.
|`LENGTH`
|`LENGTH(string)`
|Alias for `CHAR_LENGTH(string)`.
|`\|\|`
|`string \|\| string`
|Concatenates two strings.
|`CONCAT`
|`CONCAT(string, string)` or `CONCAT(string[, string]...)`
|Concatenates strings.
|`OVERLAY`
|`OVERLAY(string1 PLACING string2 FROM start [FOR length])`
|Replaces part of a string with another string.
|`POSITION`
|`POSITION(substring IN string [FROM start])`
|Returns the position of the first occurrence of a substring.
|`ASCII`
|`ASCII(string)`
|Returns the code point of the first character of a string.
|`REPEAT`
|`REPEAT(string, count)`
|Returns a string repeated the specified number of times.
|`SPACE`
|`SPACE(count)`
|Returns a string that contains the specified number of spaces.
|`STRCMP`
|`STRCMP(string1, string2)`
|Compares two strings and returns an integer comparison result.
|`SOUNDEX`
|`SOUNDEX(string)`
|Returns the phonetic representation of a string.
|`DIFFERENCE`
|`DIFFERENCE(string1, string2)`
|Returns a similarity score for the `SOUNDEX` values of two strings.
|`REVERSE`
|`REVERSE(string)`
|Returns a string with characters in reverse order.
|`TRIM`
|`TRIM([{BOTH \| LEADING \| TRAILING} chars FROM] string)`
|Removes characters from the start, end, or both ends of a string.
|`LTRIM`
|`LTRIM(string)`
|Removes spaces from the start of a string.
|`RTRIM`
|`RTRIM(string)`
|Removes spaces from the end of a string.
|`LIKE`
|`string LIKE pattern [ESCAPE escapeChar]`
|Checks whether a string matches a SQL `LIKE` pattern.
|`SIMILAR TO`
|`string SIMILAR TO pattern [ESCAPE escapeChar]`
|Checks whether a string matches a SQL regular expression pattern.
|===
==== Regular expression functions and operators
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`~`
|`string ~ pattern`
|Checks whether a string matches a case-sensitive POSIX regular expression.
|`~*`
|`string ~* pattern`
|Checks whether a string matches a case-insensitive POSIX regular expression.
|`!~`
|`string !~ pattern`
|Checks whether a string does not match a case-sensitive POSIX regular expression.
|`!~*`
|`string !~* pattern`
|Checks whether a string does not match a case-insensitive POSIX regular expression.
|`REGEXP_REPLACE`
|`REGEXP_REPLACE(string, regexp, replacement[, position[, occurrence[, matchType]]])`
|Replaces substrings that match a regular expression.
|`REGEXP_SUBSTR`
|`REGEXP_SUBSTR(string, regexp[, position[, occurrence]])`
|Returns the substring that matches a regular expression.
|===
==== Numeric and math functions
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`MOD`
|`MOD(numeric1, numeric2)` or `numeric1 % numeric2`
|Returns the remainder after division.
|`BITAND`
|`BITAND(integer1, integer2)`
|Returns the bitwise AND of two integer values.
|`BITOR`
|`BITOR(integer1, integer2)`
|Returns the bitwise OR of two integer values.
|`BITXOR`
|`BITXOR(integer1, integer2)`
|Returns the bitwise XOR of two integer values.
|`EXP`
|`EXP(numeric)`
|Returns Euler's number raised to a power.
|`POWER`
|`POWER(numeric1, numeric2)`
|Returns one numeric value raised to the power of another.
|`LN`
|`LN(numeric)`
|Returns the natural logarithm.
|`LOG10`
|`LOG10(numeric)`
|Returns the base-10 logarithm.
|`ABS`
|`ABS(numeric)`
|Returns the absolute value.
|`RAND`
|`RAND([seed])`
|Returns a random double value.
|`RAND_INTEGER`
|`RAND_INTEGER([seed,] bound)`
|Returns a random integer from `0` inclusive to `bound` exclusive.
|`ACOS`
|`ACOS(numeric)`
|Returns the arc cosine.
|`ACOSH`
|`ACOSH(numeric)`
|Returns the inverse hyperbolic cosine.
|`ASIN`
|`ASIN(numeric)`
|Returns the arc sine.
|`ASINH`
|`ASINH(numeric)`
|Returns the inverse hyperbolic sine.
|`ATAN`
|`ATAN(numeric)`
|Returns the arc tangent.
|`ATANH`
|`ATANH(numeric)`
|Returns the inverse hyperbolic tangent.
|`ATAN2`
|`ATAN2(numeric1, numeric2)`
|Returns the angle from rectangular coordinates.
|`SQRT`
|`SQRT(numeric)`
|Returns the square root.
|`CBRT`
|`CBRT(numeric)`
|Returns the cube root.
|`COS`
|`COS(numeric)`
|Returns the cosine.
|`COSH`
|`COSH(numeric)`
|Returns the hyperbolic cosine.
|`COT`
|`COT(numeric)`
|Returns the cotangent.
|`COTH`
|`COTH(numeric)`
|Returns the hyperbolic cotangent.
|`DEGREES`
|`DEGREES(numeric)`
|Converts radians to degrees.
|`RADIANS`
|`RADIANS(numeric)`
|Converts degrees to radians.
|`ROUND`
|`ROUND(numeric[, scale])`
|Rounds a numeric value to the specified scale.
|`SIGN`
|`SIGN(numeric)`
|Returns the sign of a numeric value.
|`SIN`
|`SIN(numeric)`
|Returns the sine.
|`SINH`
|`SINH(numeric)`
|Returns the hyperbolic sine.
|`TAN`
|`TAN(numeric)`
|Returns the tangent.
|`TANH`
|`TANH(numeric)`
|Returns the hyperbolic tangent.
|`SEC`
|`SEC(numeric)`
|Returns the secant.
|`SECH`
|`SECH(numeric)`
|Returns the hyperbolic secant.
|`CSC`
|`CSC(numeric)`
|Returns the cosecant.
|`CSCH`
|`CSCH(numeric)`
|Returns the hyperbolic cosecant.
|`TRUNCATE`
|`TRUNCATE(numeric[, scale])`
|Truncates a numeric value to the specified scale.
|`PI`
|`PI`
|Returns an approximation of pi.
|===
==== Date and time functions
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`EXTRACT`
|`EXTRACT(timeUnit FROM datetime)`
|Returns the specified field from a date, time, or timestamp value.
|`FLOOR`
|`FLOOR(datetime TO timeUnit)`
|Rounds a date, time, or timestamp value down to the specified unit.
|`CEIL`
|`CEIL(datetime TO timeUnit)`
|Rounds a date, time, or timestamp value up to the specified unit.
|`TIMESTAMPADD`
|`TIMESTAMPADD(timeUnit, interval, datetime)`
|Adds an interval to a timestamp.
|`TIMESTAMPDIFF`
|`TIMESTAMPDIFF(timeUnit, datetime1, datetime2)`
|Returns the number of time-unit boundaries between two datetime values.
|`LAST_DAY`
|`LAST_DAY(date)`
|Returns the date of the last day of the month.
|`DAYNAME`
|`DAYNAME(datetime)`
|Returns the name of the day of the week.
|`MONTHNAME`
|`MONTHNAME(datetime)`
|Returns the name of the month.
|`DAYOFMONTH`
|`DAYOFMONTH(datetime)`
|Returns the day of the month.
|`DAYOFWEEK`
|`DAYOFWEEK(datetime)`
|Returns the day of the week.
|`DAYOFYEAR`
|`DAYOFYEAR(datetime)`
|Returns the day of the year.
|`YEAR`
|`YEAR(datetime)`
|Returns the year.
|`QUARTER`
|`QUARTER(datetime)`
|Returns the quarter of the year.
|`MONTH`
|`MONTH(datetime)`
|Returns the month number.
|`WEEK`
|`WEEK(datetime)`
|Returns the week number.
|`HOUR`
|`HOUR(datetime)`
|Returns the hour.
|`MINUTE`
|`MINUTE(datetime)`
|Returns the minute.
|`SECOND`
|`SECOND(datetime)`
|Returns the second.
|`TIMESTAMP_SECONDS`
|`TIMESTAMP_SECONDS(integer)`
|Converts seconds since `1970-01-01 00:00:00` to a timestamp.
|`TIMESTAMP_MILLIS`
|`TIMESTAMP_MILLIS(integer)`
|Converts milliseconds since `1970-01-01 00:00:00` to a timestamp.
|`TIMESTAMP_MICROS`
|`TIMESTAMP_MICROS(integer)`
|Converts microseconds since `1970-01-01 00:00:00` to a timestamp.
|`UNIX_SECONDS`
|`UNIX_SECONDS(timestamp)`
|Returns seconds since `1970-01-01 00:00:00`.
|`UNIX_MILLIS`
|`UNIX_MILLIS(timestamp)`
|Returns milliseconds since `1970-01-01 00:00:00`.
|`UNIX_MICROS`
|`UNIX_MICROS(timestamp)`
|Returns microseconds since `1970-01-01 00:00:00`.
|`UNIX_DATE`
|`UNIX_DATE(date)`
|Returns the number of days since `1970-01-01`.
|`DATE_FROM_UNIX_DATE`
|`DATE_FROM_UNIX_DATE(integer)`
|Converts days since `1970-01-01` to a date.
|`DATE`
|`DATE(value)` or `DATE(year, month, day)`
|Converts a value to a date or creates a date from date fields.
|`TIME`
|`TIME(value)` or `TIME(hour, minute, second)`
|Converts a value to a time or creates a time from time fields.
|`DATETIME`
|`DATETIME(value)` or `DATETIME(year, month, day, hour, minute, second)`
|Converts a value to a timestamp or creates a timestamp from date and time fields.
|`CURRENT_TIME`
|`CURRENT_TIME`
|Returns the current time.
|`CURRENT_TIMESTAMP`
|`CURRENT_TIMESTAMP`
|Returns the current timestamp.
|`CURRENT_DATE`
|`CURRENT_DATE`
|Returns the current date.
|`LOCALTIME`
|`LOCALTIME`
|Returns the current local time.
|`LOCALTIMESTAMP`
|`LOCALTIMESTAMP`
|Returns the current local timestamp.
|JDBC current time escape functions
|`{fn CURDATE()}`, `{fn CURTIME()}`, `{fn NOW()}`
|JDBC escape aliases for the current date, current time, and current timestamp.
|`TO_CHAR`
|`TO_CHAR(datetime, format)`
|Formats a date, time, or timestamp value.
|`TO_DATE`
|`TO_DATE(string, format)`
|Parses a date from a string.
|`TO_TIMESTAMP`
|`TO_TIMESTAMP(string, format)`
|Parses a timestamp from a string.
|===
==== XML functions
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`EXTRACTVALUE`
|`EXTRACTVALUE(xml, xpath)`
|Returns the text selected by an XPath expression.
|`XMLTRANSFORM`
|`XMLTRANSFORM(xml, xslt)`
|Transforms XML with an XSLT stylesheet.
|`EXTRACT`
|`"EXTRACT"(xml, xpath)`
|Returns the XML fragment selected by an XPath expression. Quote the name when needed to distinguish it from date/time `EXTRACT`.
|`EXISTSNODE`
|`EXISTSNODE(xml, xpath)`
|Returns `1` if an XPath expression selects at least one XML node; otherwise returns `0`.
|===
==== JSON functions and predicates
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`FORMAT JSON`
|`value FORMAT JSON`
|Marks a value as JSON-formatted input.
|`JSON_VALUE`
|`JSON_VALUE(jsonValue, path)`
|Extracts a scalar SQL value from JSON using a JSON path expression.
|`JSON_QUERY`
|`JSON_QUERY(jsonValue, path)`
|Extracts a JSON object or array from JSON using a JSON path expression.
|`JSON_TYPE`
|`JSON_TYPE(jsonValue)`
|Returns the type of a JSON value.
|`JSON_EXISTS`
|`JSON_EXISTS(jsonValue, path)`
|Returns whether JSON satisfies the specified path expression.
|`JSON_DEPTH`
|`JSON_DEPTH(jsonValue)`
|Returns the depth of a JSON value.
|`JSON_KEYS`
|`JSON_KEYS(jsonValue[, path])`
|Returns the keys from a JSON object.
|`JSON_PRETTY`
|`JSON_PRETTY(jsonValue)`
|Returns formatted JSON.
|`JSON_LENGTH`
|`JSON_LENGTH(jsonValue[, path])`
|Returns the length of a JSON value.
|`JSON_REMOVE`
|`JSON_REMOVE(jsonValue, path[, path]...)`
|Removes data selected by path expressions and returns the updated JSON.
|`JSON_STORAGE_SIZE`
|`JSON_STORAGE_SIZE(jsonValue)`
|Returns the number of bytes used by the JSON binary representation.
|`JSON_OBJECT`
|`JSON_OBJECT(jsonKey : jsonValue[, jsonKey : jsonValue]...)`
|Builds a JSON object from key-value pairs.
|`JSON_ARRAY`
|`JSON_ARRAY([jsonValue[, jsonValue]...])`
|Builds a JSON array from values.
|`IS JSON`
|`value IS [NOT] JSON [VALUE]`
|Checks whether a value is, or is not, JSON.
|`IS JSON OBJECT`
|`value IS [NOT] JSON OBJECT`
|Checks whether a value is, or is not, a JSON object.
|`IS JSON ARRAY`
|`value IS [NOT] JSON ARRAY`
|Checks whether a value is, or is not, a JSON array.
|`IS JSON SCALAR`
|`value IS [NOT] JSON SCALAR`
|Checks whether a value is, or is not, a JSON scalar value.
|===
==== Collection functions and operators
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`ARRAY`
|`ARRAY[value[, value]...]` or `ARRAY(query)`
|Creates an array from values or from a query result.
|`MAP`
|`MAP[key, value[, key, value]...]` or `MAP(query)`
|Creates a map from key-value pairs or from a query result.
|`ITEM`
|`array[index]` or `map[key]`
|Returns an item from an array or map.
|`CARDINALITY`
|`CARDINALITY(collection)`
|Returns the number of elements in a collection.
|`IS EMPTY`
|`collection IS EMPTY`
|Checks whether a collection has no elements.
|`IS NOT EMPTY`
|`collection IS NOT EMPTY`
|Checks whether a collection has at least one element.
|===
==== Other functions and operators
[cols="1,2,4",opts="stretch,header"]
|===
|Name | Syntax | Description
|`ROW`
|`ROW(value[, value]...)`
|Creates a row value.
|`CAST`
|`CAST(value AS type)`
|Converts a value to the specified type.
|Infix cast
|`value::type`
|Converts a value to the specified type using PostgreSQL-style cast syntax.
|`TYPEOF`
|`TYPEOF(value)`
|Returns the Ignite SQL type of a value.
|`COALESCE`
|`COALESCE(value, value[, value]...)`
|Returns the first non-null value.
|`NVL`
|`NVL(value1, value2)`
|Returns `value1` when it is not null; otherwise returns `value2`.
|`NULLIF`
|`NULLIF(value1, value2)`
|Returns `NULL` when the values are equal; otherwise returns `value1`.
|`CASE`
|`CASE WHEN condition THEN result [ELSE result] END`
|Returns a result selected by conditional branches.
|`DECODE`
|`DECODE(value, search, result[, search, result]...[, default])`
|Compares a value with search values and returns the matching result.
|`LEAST`
|`LEAST(value[, value]...)`
|Returns the least value from the arguments.
|`GREATEST`
|`GREATEST(value[, value]...)`
|Returns the greatest value from the arguments.
|`COMPRESS`
|`COMPRESS(string)`
|Compresses a string and returns binary data.
|`OCTET_LENGTH`
|`OCTET_LENGTH(binary)`
|Returns the number of bytes in binary data.
|`QUERY_ENGINE`
|`QUERY_ENGINE()`
|Returns the name of the query engine used by the query.
|`SYSTEM_RANGE`
|`TABLE(SYSTEM_RANGE(start, end[, increment]))`
|Returns a table with one `BIGINT` column named `X` and one row for each value in the range.
|===
=== Supported Data Types
Below are the data types supported by the Calcite-based SQL engine:
[cols="1,1",opts="stretch,header"]
|===
|Data type | Mapped to Java class
|BOOLEAN
|`java.lang.Boolean`
|DECIMAL
|`java.math.BigDecimal`
|DOUBLE
|`java.lang.Double`
|REAL/FLOAT
|`java.lang.Float`
|INT
|`java.lang.Integer`
|BIGINT
|`java.lang.Long`
|SMALLINT
|`java.lang.Short`
|TINYINT
|`java.lang.Byte`
|CHAR/VARCHAR
|`java.lang.String`
|DATE
|`java.sql.Date`
|TIME
|`java.sql.Time`
|TIMESTAMP
|`java.sql.Timestamp`
|INTERVAL YEAR TO MONTH
|`java.time.Period`
|INTERVAL DAY TO SECOND
|`java.time.Duration`
|BINARY/VARBINARY
|`byte[]`
|UUID
|`java.util.UUID`
|OTHER
|`java.lang.Object`
|===
== Optimizer hints [[hints]]
The query optimizer does its best to build the fastest excution plan. However, this is a far way to create an optimizer
which is the most effective for each case. You can better know about the data design, application design or data
distribution in the cluster. SQL hints can help the optimizer to make optimizations more rationally or build
execution plan faster.
[NOTE]
====
SQL hints are optional to apply and might be skipped in some cases.
====
=== Hints format
SQL hints are defined by a special comment +++/*+ HINT */+++ reffered as _hint block_. Spaces before and after the
hint name are required. The hint block is placed right after _SELECT_ or after a table name. Several hint blocks for
one _SELECT_ or one table *are not allowed*. Several hints in one hint block are separated with comma.
Example:
[source, SQL]
----
SELECT /*+ FORCE_INDEX(IDX_TBL1_V2), EXPAND_DISTINCT_AGG */ V2, AVG(DISTINCT V3) FROM TBL1 WHERE V1=? and V2=? GROUP BY V2
SELECT * FROM TBL1 /*+ FORCE_INDEX(IDX_TBL1_V2) */ where V1=? and V2=?
----
It is allowed to define several hints for the same relation operator. To use several hints, separate them by comma
(spaces are optional).
Example:
[source, SQL]
----
SELECT /*+ NO_INDEX, EXPAND_DISTINCT_AGG */ SUM(DISTINCT V1), AVG(DISTINCT V2) FROM TBL1 WHERE V3=? GROUP BY V3
----
==== Hint parameters
Hint parameters, if required, are placed in brackets after the hint name and separated by commas.
The hint parameter can be quoted. Quoted parameter is case-sensitive. The quoted and unquoted parameters cannot be
defined for the same hint.
Example:
[source, SQL]
----
SELECT /*+ FORCE_INDEX(TBL1_IDX2,TBL2_IDX1) */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;
SELECT /*+ FORCE_INDEX('TBL2_idx1') */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;
----
=== Hint scope
Hints of a _SELECT_ are "visible" for this operation and the following relation operators, queries and subqueries.
Hints in a subquery have effective scope only for this subquery and its subqueries. Hint, defined for a table, is
effective only for this table.
Example:
[source, SQL]
----
SELECT /*+ NO_INDEX(TBL1_IDX2), FORCE_INDEX(TBL2_IDX2) */ T1.V1 FROM TBL1 T1 WHERE T1.V2 IN (SELECT T2.V2 FROM TBL2 T2 WHERE T2.V1=? AND T2.V2=?);
SELECT T1.V1 FROM TBL1 T1 WHERE T1.V2 IN (SELECT /*+ FORCE_INDEX(TBL2_IDX2) */ T2.V2 FROM TBL2 T2 WHERE T2.V1=? AND T2.V2=?);
SELECT T1.V1 FROM TBL1 T1 JOIN TBL2 /*+ MERGE_JOIN */ T2 ON T1.V2=T2.V2 and T1.V3=T2.V3 and T2.V3=?;
----
Note that only the first query has a hint in such a case as:
[source, SQL]
----
SELECT /*+ FORCE_INDEX */ V1 FROM TBL1 WHERE V1=? AND V2=?
UNION ALL
SELECT V1 FROM TBL1 WHERE V3>?
----
But *there are exceptions*: hints of engine or optimizer level, such as link:#hint_disable_rule[_DISABLE_RULE_] or
link:#hint_query_engine[_QUERY_ENGINE_]. Such hints should be defined at the beginning of the query and are related to
the whole query.
=== Hints priority
Hints, defined in subqueries or in the following _SELECTs_, have priority over the preceding ones. In the following example,
an index for _TBL2_ is actually applied.
[source, SQL]
----
SELECT /*+ NO_INDEX */ * FROM TBL1 T1 WHERE T1.V1 = (SELECT /*+ FORCE_INDEX */ T2.V1 FROM TBL2 T2 where T2.V2=? and T2.V3=?)
----
Table hints usually have a bigger priority. In the following example, an index for _TBL1_ is actually applied.
[source, SQL]
----
SELECT /*+ NO_INDEX */ * FROM TBL /*+ FORCE_INDEX(IDX_TBL1_V2) */ where V1=? and V2=? and V3=?;
----
=== Hints errors
The optimizer tries to apply every hint and its parameters, if possible. But it skips the hint or hint parameter if:
* There is no such supported hint.
* Required hint parameters are not passed.
* The hint parameters have been passed, but the hint does not support any parameter.
* The hint parameter is incorrect or refers to a nonexistent object, such as a nonexistent index or table.
* The current hints or current parameters are incompatible with the previous ones, such as forcing the use and disabling of the same index.
=== Hint limitations
Currently, SQL hints do not recognize the aliases. You can't refer to an alias like this:
[source, SQL]
----
SELECT /*+ MERGE_JOIN(T2) */ T2.V1 FROM TBL1 T1 JOIN TBL2 T2 ON T1.V3=T2.V1 WHERE T1.V2=? AND T2.V2=?
----
Instead, a table name have to be used:
[source, SQL]
----
SELECT /*+ MERGE_JOIN(TBL2) */ T2.V1 FROM TBL1 T1 JOIN TBL2 T2 ON T1.V3=T2.V1 WHERE T1.V2=? AND T2.V2=?
----
=== Supportted hints
==== FORCE_INDEX / NO_INDEX
Forces or disables index scan.
===== Parameters:
* Empty. To force an index scan for every undelying table. Optimizer will choose any available index. Or to disable all indexes.
* Single index name to use or skip exactly this index.
* Several index names. They can relate to different tables. The optimizer will choose indexes for scanning or skip them all.
===== Example:
[source, SQL]
----
SELECT /*+ FORCE_INDEX */ T1.* FROM TBL1 T1 WHERE T1.V1 = T2.V1 AND T1.V2 > ?;
SELECT /*+ FORCE_INDEX(TBL1_IDX2, TBL2_IDX1) */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;
SELECT /*+ NO_INDEX */ T1.* FROM TBL1 T1 WHERE T1.V1 = T2.V1 AND T1.V2 > ?;
SELECT /*+ NO_INDEX(TBL1_IDX2, TBL2_IDX1) */ T1.V1, T2.V1 FROM TBL1 T1, TBL2 T2 WHERE T1.V1 = T2.V1 AND T1.V2 > ? AND T2.V2 > ?;
SELECT T1.V1, T2.V2 FROM TBL1 t1 JOIN TBL2 /*+ FORCE_INDEX(IDX2_2) */ T2 on T1.V3=T2.V3 and T1.V1=T2.V2 and T2.V1=?";
----
==== ENFORCE_JOIN_ORDER
Forces join order as appears in a query. Fastens building of joins plan.
===== Example:
[source, SQL]
----
SELECT /*+ ENFORCE_JOIN_ORDER */ T1.V1, T2.V1, T2.V2, T3.V1, T3.V2, T3.V3 FROM TBL1 T1 JOIN TBL2 T2 ON T1.V3=T2.V1 JOIN TBL3 T3 ON T2.V3=T3.V1 AND T2.V2=T3.V2
SELECT t1.v1, t3.v2 FROM TBL1 t1 JOIN TBL3 t3 on t1.v3=t3.v3 WHERE t1.v2 in (SELECT /*+ ENFORCE_JOIN_ORDER */ t2.v2 FROM TBL2 t2 JOIN TBL3 t3 ON t2.v1=t3.v1)
----
==== MERGE_JOIN, NL_JOIN, CNL_JOIN, HASH_JOIN
Forces certain join type: Merge, Nested Loop, Correlated Nested Loop and Hash Join respectively.
Every of those has the negation like 'NO_INDEX': CNL_JOIN, NO_CNL_JOIN, NO_HASH_JOIN. The negation hint disables certain join type.
===== Parameters:
* Empty. To force or disable certain join type for every join.
* Single or several tables names force or disable certain join type only for joining of these tables.
===== Example:
[source, SQL]
----
SELECT /*+ MERGE_JOIN */ t1.v1, t2.v2 FROM TBL1 t1, TBL2 t2 WHERE t1.v3=t2.v3
SELECT /*+ NL_JOIN(TBL3,TBL1) */ t4.v1, t2.v2 FROM TBL1 t4 JOIN TBL2 t2 on t1.v3=t2.v3 WHERE t2.v1 in (SELECT t3.v3 FROM TBL3 t3 JOIN TBL1 t4 on t3.v2=t4.v2)
SELECT t1.v1, t2.v2 FROM TBL2 t1 JOIN TBL1 t2 on t1.v3=t2.v3 WHERE t2.v3 in (SELECT /*+ NO_CNL_JOIN(TBL4) */ t3.v3 FROM TBL3 t3 JOIN TBL4 t4 on t3.v1=t4.v1)
SELECT t4.v1, t2.v2 FROM TBL1 t4 JOIN TBL2 t2 on t1.v3=t2.v3 WHERE t2.v1 in (SELECT t3.v3 FROM TBL3 t3 JOIN TBL1 /*+ HASH_JOIN */ t4 on t3.v2=t4.v2)
----
==== EXPAND_DISTINCT_AGG
If the optimizer wraps aggregation operations with a join, forces expanding of only distinct aggregates to the join.
Removes duplicates before the joining and speeds up it.
===== Example:
[source, SQL]
----
SELECT /*+ EXPAND_DISTINCT_AGG */ SUM(DISTINCT V1), AVG(DISTINCT V2) FROM TBL1 GROUP BY V3
----
==== QUERY_ENGINE [[hint_query_engine]]
Selects a particular engine to run individual queries. This is an engine level hint.
===== Parameters:
Single parameter required: the engine name.
===== Example:
[source, SQL]
----
SELECT /*+ QUERY_ENGINE('calcite') */ V1 FROM TBL1
----
==== DISABLE_RULE [[hint_disable_rule]]
Disables certain optimizer rules. This is an optimizer level hint.
===== Parameters:
* One or more optimizer rules for skipping.
===== Example:
[source, SQL]
----
SELECT /*+ DISABLE_RULE('MergeJoinConverter') */ T1.* FROM TBL1 T1 JOIN TBL2 T2 ON T1.V1=T2.V1 WHERE T2.V2=?
----
== Transaction-aware Queries [[transaction-aware-queries]]
The Calcite-based query engine supports SQL transactions with READ_COMMITTED isolation.
To ensure backward compatibility, transaction support is disabled by default. To enable it, use the following configuration:
[tabs]
--
tab:XML[]
[source,xml]
----
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="transactionConfiguration">
<bean class="org.apache.ignite.configuration.TransactionConfiguration">
<property name="txAwareQueriesEnabled" value="true" />
...
</bean>
</property>
...
</bean>
----
tab:Java[]
[source,java]
----
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.getTransactionConfiguration().setTxAwareQueriesEnabled(true);
----
--
The following API "transactional aware":
* key-value API.
* SQL queries with Calcite engine.
* Scan queries.
"Transactional aware" means:
1. ACID in the same way as key-value API.
2. Any data modified or deleted within a transaction will not be visible to concurrent transactions until it is committed.
3. Any data modified by a query will be returned with updated values by subsequent queries within the same transaction.
So, when `txAwareQueriesEnabled = true` enabled the usage:
* INSERT, UPDATE, DELETE statements with regular transaction guarantees.
* Mix key-value, SQL and Scan queries to process same dataset.
[NOTE]
====
[discrete]
=== SELECT ... FOR UPDATE
Currently, no locks are held by the UPDATE or SELECT statements.
`SELECT ... FOR UPDATE` statement not supported.
This means lost updates can occur when multiple transactions concurrently modify the same key.
For example, if you execute a query like `SET salary = salary + 50 WHERE id = 1` concurrently across multiple threads,
note that due to the lost-update anomaly, the final value may not equal the original salary plus 50 multiplied by the number of threads.
====