blob: 8e44e4da09d658051f85b87659c12c2a8fd7685e [file] [log] [blame]
***********
Predicates
***********
=============
IN Predicate
=============
IN predicate provides a comparison of row, array, and result of a subquery.
*Synopsis*
.. code-block:: sql
column_reference (NOT) IN (val1, val2, ..., valN)
column_reference (NOT) IN (SELECT ... FROM ...) AS alias_name
Examples are as follows:
.. code-block:: sql
-- this statement filters lists down all the records where col1 value is 1, 2 or 3:
SELECT col1, col2 FROM table1 WHERE col1 IN (1, 2, 3);
-- this statement filters lists down all the records where col1 value is neither 1, 2 nor 3:
SELECT col1, col2 FROM table1 WHERE col1 NOT IN (1, 2, 3);
You can use `IN clause` on text data domain as follows:
.. code-block:: sql
SELECT col1, col2 FROM table1 WHERE col2 IN ('tajo', 'hadoop');
SELECT col1, col2 FROM table1 WHERE col2 NOT IN ('tajo', 'hadoop');
Finally, you can use subqueries in the `IN clause`.
.. code-block:: sql
SELECT col1, col2
FROM table1
WHERE col3 IN (
SELECT avg(col2) as avg_col2
FROM table2
GROUP BY col1
HAVING avg_col2 > 100);
SELECT col1, col2
FROM table1
WHERE col3 NOT IN (
SELECT avg(col2) as avg_col2
FROM table2
GROUP BY col1
HAVING avg_col2 > 100);
==================================
String Pattern Matching Predicates
==================================
--------------------
LIKE
--------------------
LIKE operator returns true or false depending on whether its pattern matches the given string. An underscore (_) in pattern matches any single character. A percent sign (%) matches any sequence of zero or more characters.
*Synopsis*
.. code-block:: sql
string LIKE pattern
string NOT LIKE pattern
--------------------
ILIKE
--------------------
ILIKE is the same to LIKE, but it is a case insensitive operator. It is not in the SQL standard. We borrow this operator from PostgreSQL.
*Synopsis*
.. code-block:: sql
string ILIKE pattern
string NOT ILIKE pattern
--------------------
SIMILAR TO
--------------------
*Synopsis*
.. code-block:: sql
string SIMILAR TO pattern
string NOT SIMILAR TO pattern
It returns true or false depending on whether its pattern matches the given string. Also like LIKE, ``SIMILAR TO`` uses ``_`` and ``%`` as metacharacters denoting any single character and any string, respectively.
In addition to these metacharacters borrowed from LIKE, 'SIMILAR TO' supports more powerful pattern-matching metacharacters borrowed from regular expressions:
+------------------------+-------------------------------------------------------------------------------------------+
| metacharacter | description |
+========================+===========================================================================================+
| | | denotes alternation (either of two alternatives). |
+------------------------+-------------------------------------------------------------------------------------------+
| * | denotes repetition of the previous item zero or more times. |
+------------------------+-------------------------------------------------------------------------------------------+
| + | denotes repetition of the previous item one or more times. |
+------------------------+-------------------------------------------------------------------------------------------+
| ? | denotes repetition of the previous item zero or one time. |
+------------------------+-------------------------------------------------------------------------------------------+
| {m} | denotes repetition of the previous item exactly m times. |
+------------------------+-------------------------------------------------------------------------------------------+
| {m,} | denotes repetition of the previous item m or more times. |
+------------------------+-------------------------------------------------------------------------------------------+
| {m,n} | denotes repetition of the previous item at least m and not more than n times. |
+------------------------+-------------------------------------------------------------------------------------------+
| [] | A bracket expression specifies a character class, just as in POSIX regular expressions. |
+------------------------+-------------------------------------------------------------------------------------------+
| () | Parentheses can be used to group items into a single logical item. |
+------------------------+-------------------------------------------------------------------------------------------+
Note that `.`` is not used as a metacharacter in ``SIMILAR TO`` operator.
---------------------
Regular expressions
---------------------
Regular expressions provide a very powerful means for string pattern matching. In the current Tajo, regular expressions are based on Java-style regular expressions instead of POSIX regular expression. The main difference between java-style one and POSIX's one is character class.
*Synopsis*
.. code-block:: sql
string ~ pattern
string !~ pattern
string ~* pattern
string !~* pattern
+----------+---------------------------------------------------------------------------------------------------+
| operator | Description |
+==========+===================================================================================================+
| ~ | It returns true if a given regular expression is matched to string. Otherwise, it returns false. |
+----------+---------------------------------------------------------------------------------------------------+
| !~ | It returns false if a given regular expression is matched to string. Otherwise, it returns true. |
+----------+---------------------------------------------------------------------------------------------------+
| ~* | It is the same to '~', but it is case insensitive. |
+----------+---------------------------------------------------------------------------------------------------+
| !~* | It is the same to '!~', but it is case insensitive. |
+----------+---------------------------------------------------------------------------------------------------+
Here are examples:
.. code-block:: sql
'abc' ~ '.*c' true
'abc' ~ 'c' false
'aaabc' ~ '([a-z]){3}bc true
'abc' ~* '.*C' true
'abc' !~* 'B.*' true
Regular expressions operator is not in the SQL standard. We borrow this operator from PostgreSQL.
*Synopsis for REGEXP and RLIKE operators*
.. code-block:: sql
string REGEXP pattern
string NOT REGEXP pattern
string RLIKE pattern
string NOT RLIKE pattern
But, they do not support case-insensitive operators.