blob: 3eea252d708055a78317da3ce809fa56a35d3d4f [file] [log] [blame] [view]
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# SELECT syntax
The queries in DataFusion scan data from tables and return 0 or more rows.
Please be aware that column names in queries are made lower-case, but not on the inferred schema. Accordingly, if you
want to query against a capitalized field, make sure to use double quotes. Please see this
[example](https://arrow.apache.org/datafusion/user-guide/example-usage.html) for clarification.
In this documentation we describe the SQL syntax in DataFusion.
DataFusion supports the following syntax for queries:
<code class="language-sql hljs">
[ [WITH](#with-clause) with_query [, ...] ] <br/>
[SELECT](#select-clause) [ ALL | DISTINCT ] select_expr [, ...] <br/>
[ [FROM](#from-clause) from_item [, ...] ] <br/>
[ [JOIN](#join-clause) join_item [, ...] ] <br/>
[ [WHERE](#where-clause) condition ] <br/>
[ [GROUP BY](#group-by-clause) grouping_element [, ...] ] <br/>
[ [HAVING](#having-clause) condition] <br/>
[ [UNION](#union-clause) [ ALL | select ] <br/>
[ [ORDER BY](#order-by-clause) expression [ ASC | DESC ][, ...] ] <br/>
[ [LIMIT](#limit-clause) count ] <br/>
</code>
## WITH clause
A with clause allows to give names for queries and reference them by name.
```sql
WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
SELECT a, b FROM x;
```
## SELECT clause
Example:
```sql
SELECT a, b, a + b FROM table
```
The `DISTINCT` quantifier can be added to make the query return all distinct rows.
By default `ALL` will be used, which returns all the rows.
```sql
SELECT DISTINCT person, age FROM employees
```
## FROM clause
Example:
```sql
SELECT t.a FROM table AS t
```
## WHERE clause
Example:
```sql
SELECT a FROM table WHERE a > 10
```
## JOIN clause
DataFusion supports `INNER JOIN`, `LEFT OUTER JOIN`, `RIGHT OUTER JOIN`, `FULL OUTER JOIN`, and `CROSS JOIN`.
The following examples are based on this table:
```sql
select * from x;
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
```
### INNER JOIN
The keywords `JOIN` or `INNER JOIN` define a join that only shows rows where there is a match in both tables.
```sql
❯ select * from x inner join x y ON x.column_1 = y.column_1;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | 1 | 2 |
+----------+----------+----------+----------+
```
### LEFT OUTER JOIN
The keywords `LEFT JOIN` or `LEFT OUTER JOIN` define a join that includes all rows from the left table even if there
is not a match in the right table. When there is no match, null values are produced for the right side of the join.
```sql
❯ select * from x left join x y ON x.column_1 = y.column_2;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | | |
+----------+----------+----------+----------+
```
### RIGHT OUTER JOIN
The keywords `RIGHT JOIN` or `RIGHT OUTER JOIN` define a join that includes all rows from the right table even if there
is not a match in the left table. When there is no match, null values are produced for the left side of the join.
```sql
❯ select * from x right join x y ON x.column_1 = y.column_2;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| | | 1 | 2 |
+----------+----------+----------+----------+
```
### FULL OUTER JOIN
The keywords `FULL JOIN` or `FULL OUTER JOIN` define a join that is effectively a union of a `LEFT OUTER JOIN` and
`RIGHT OUTER JOIN`. It will show all rows from the left and right side of the join and will produce null values on
either side of the join where there is not a match.
```sql
❯ select * from x full outer join x y ON x.column_1 = y.column_2;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | | |
| | | 1 | 2 |
+----------+----------+----------+----------+
```
### CROSS JOIN
A cross join produces a cartesian product that matches every row in the left side of the join with every row in the
right side of the join.
```sql
❯ select * from x cross join x y;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | 1 | 2 |
+----------+----------+----------+----------+
```
## GROUP BY clause
Example:
```sql
SELECT a, b, MAX(c) FROM table GROUP BY a, b
```
## HAVING clause
Example:
```sql
SELECT a, b, MAX(c) FROM table GROUP BY a, b HAVING MAX(c) > 10
```
## UNION clause
Example:
```sql
SELECT
a,
b,
c
FROM table1
UNION ALL
SELECT
a,
b,
c
FROM table2
```
## ORDER BY clause
Orders the results by the referenced expression. By default it uses ascending order (`ASC`).
This order can be changed to descending by adding `DESC` after the order-by expressions.
Examples:
```sql
SELECT age, person FROM table ORDER BY age;
SELECT age, person FROM table ORDER BY age DESC;
SELECT age, person FROM table ORDER BY age, person DESC;
```
## LIMIT clause
Limits the number of rows to be a maximum of `count` rows. `count` should be a non-negative integer.
Example:
```sql
SELECT age, person FROM table
LIMIT 10
```