| --- | 
 | layout: global | 
 | title: CLUSTER BY Clause | 
 | displayTitle: CLUSTER BY Clause | 
 | license: | | 
 |   Licensed to the Apache Software Foundation (ASF) under one or more | 
 |   contributor license agreements.  See the NOTICE file distributed with | 
 |   this work for additional information regarding copyright ownership. | 
 |   The ASF licenses this file to You under the Apache License, Version 2.0 | 
 |   (the "License"); you may not use this file except in compliance with | 
 |   the License.  You may obtain a copy of the License at | 
 |  | 
 |      http://www.apache.org/licenses/LICENSE-2.0 | 
 |  | 
 |   Unless required by applicable law or agreed to in writing, software | 
 |   distributed under the License is distributed on an "AS IS" BASIS, | 
 |   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | 
 |   See the License for the specific language governing permissions and | 
 |   limitations under the License. | 
 | --- | 
 |  | 
 | ### Description | 
 |  | 
 | The `CLUSTER BY` clause is used to first repartition the data based | 
 | on the input expressions and then sort the data within each partition. This is | 
 | semantically equivalent to performing a | 
 | [DISTRIBUTE BY](sql-ref-syntax-qry-select-distribute-by.html) followed by a | 
 | [SORT BY](sql-ref-syntax-qry-select-sortby.html). This clause only ensures that the | 
 | resultant rows are sorted within each partition and does not guarantee a total order of output. | 
 |  | 
 | ### Syntax | 
 |  | 
 | ```sql | 
 | CLUSTER BY { expression [ , ... ] } | 
 | ``` | 
 |  | 
 | ### Parameters | 
 |  | 
 | * **expression** | 
 |  | 
 |     Specifies combination of one or more values, operators and SQL functions that results in a value. | 
 |  | 
 | ### Examples | 
 |  | 
 | ```sql | 
 | CREATE TABLE person (name STRING, age INT); | 
 | INSERT INTO person VALUES | 
 |     ('Zen Hui', 25), | 
 |     ('Anil B', 18), | 
 |     ('Shone S', 16), | 
 |     ('Mike A', 25), | 
 |     ('John A', 18), | 
 |     ('Jack N', 16); | 
 |  | 
 | -- Reduce the number of shuffle partitions to 2 to illustrate the behavior of `CLUSTER BY`. | 
 | -- It's easier to see the clustering and sorting behavior with less number of partitions. | 
 | SET spark.sql.shuffle.partitions = 2; | 
 |  | 
 | -- Select the rows with no ordering. Please note that without any sort directive, the results | 
 | -- of the query is not deterministic. It's included here to show the difference in behavior | 
 | -- of a query when `CLUSTER BY` is not used vs when it's used. The query below produces rows | 
 | -- where age column is not sorted. | 
 | SELECT age, name FROM person; | 
 | +---+-------+ | 
 | |age|   name| | 
 | +---+-------+ | 
 | | 16|Shone S| | 
 | | 25|Zen Hui| | 
 | | 16| Jack N| | 
 | | 25| Mike A| | 
 | | 18| John A| | 
 | | 18| Anil B| | 
 | +---+-------+ | 
 |  | 
 | -- Produces rows clustered by age. Persons with same age are clustered together. | 
 | -- In the query below, persons with age 18 and 25 are in first partition and the | 
 | -- persons with age 16 are in the second partition. The rows are sorted based | 
 | -- on age within each partition. | 
 | SELECT age, name FROM person CLUSTER BY age; | 
 | +---+-------+ | 
 | |age|   name| | 
 | +---+-------+ | 
 | | 18| John A| | 
 | | 18| Anil B| | 
 | | 25|Zen Hui| | 
 | | 25| Mike A| | 
 | | 16|Shone S| | 
 | | 16| Jack N| | 
 | +---+-------+ | 
 | ``` | 
 |  | 
 | ### Related Statements | 
 |  | 
 | * [SELECT Main](sql-ref-syntax-qry-select.html) | 
 | * [WHERE Clause](sql-ref-syntax-qry-select-where.html) | 
 | * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) | 
 | * [HAVING Clause](sql-ref-syntax-qry-select-having.html) | 
 | * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) | 
 | * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) | 
 | * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) | 
 | * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) | 
 | * [OFFSET Clause](sql-ref-syntax-qry-select-offset.html) | 
 | * [CASE Clause](sql-ref-syntax-qry-select-case.html) | 
 | * [PIVOT Clause](sql-ref-syntax-qry-select-pivot.html) | 
 | * [UNPIVOT Clause](sql-ref-syntax-qry-select-unpivot.html) | 
 | * [LATERAL VIEW Clause](sql-ref-syntax-qry-select-lateral-view.html) |