Note: this documentation is in the process of being migrated to be automatically created from the codebase. Please see the Window Functions (Old) page for the rest of the documentation.
A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. However, window functions do not cause rows to become grouped into a single output row like non-window aggregate calls would. Instead, the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result
Here is an example that shows how to compare each employee's salary with the average salary in his or her department:
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary; +-----------+-------+--------+-------------------+ | depname | empno | salary | avg | +-----------+-------+--------+-------------------+ | personnel | 2 | 3900 | 3700.0 | | personnel | 5 | 3500 | 3700.0 | | develop | 8 | 6000 | 5020.0 | | develop | 10 | 5200 | 5020.0 | | develop | 11 | 5200 | 5020.0 | | develop | 9 | 4500 | 5020.0 | | develop | 7 | 4200 | 5020.0 | | sales | 1 | 5000 | 4866.666666666667 | | sales | 4 | 4800 | 4866.666666666667 | | sales | 3 | 4800 | 4866.666666666667 | +-----------+-------+--------+-------------------+
A window function call always contains an OVER clause directly following the window function's name and argument(s). This is what syntactically distinguishes it from a normal function or non-window aggregate. The OVER clause determines exactly how the rows of the query are split up for processing by the window function. The PARTITION BY clause within OVER divides the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row. The previous example showed how to count the average of a column per partition.
You can also control the order in which rows are processed by window functions using ORDER BY within OVER. (The window ORDER BY does not even have to match the order in which the rows are output.) Here is an example:
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary; +-----------+-------+--------+--------+ | depname | empno | salary | rank | +-----------+-------+--------+--------+ | personnel | 2 | 3900 | 1 | | develop | 8 | 6000 | 1 | | develop | 10 | 5200 | 2 | | develop | 11 | 5200 | 2 | | develop | 9 | 4500 | 4 | | develop | 7 | 4200 | 5 | | sales | 1 | 5000 | 1 | | sales | 4 | 4800 | 2 | | personnel | 5 | 3500 | 2 | | sales | 3 | 4800 | 2 | +-----------+-------+--------+--------+
There is another important concept associated with window functions: for each row, there is a set of rows within its partition called its window frame. Some window functions act only on the rows of the window frame, rather than of the whole partition. Here is an example of using window frames in queries:
SELECT depname, empno, salary, avg(salary) OVER(ORDER BY salary ASC ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS avg, min(salary) OVER(ORDER BY empno ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_min FROM empsalary ORDER BY empno ASC; +-----------+-------+--------+--------------------+---------+ | depname | empno | salary | avg | cum_min | +-----------+-------+--------+--------------------+---------+ | sales | 1 | 5000 | 5000.0 | 5000 | | personnel | 2 | 3900 | 3866.6666666666665 | 3900 | | sales | 3 | 4800 | 4700.0 | 3900 | | sales | 4 | 4800 | 4866.666666666667 | 3900 | | personnel | 5 | 3500 | 3700.0 | 3500 | | develop | 7 | 4200 | 4200.0 | 3500 | | develop | 8 | 6000 | 5600.0 | 3500 | | develop | 9 | 4500 | 4500.0 | 3500 | | develop | 10 | 5200 | 5133.333333333333 | 3500 | | develop | 11 | 5200 | 5466.666666666667 | 3500 | +-----------+-------+--------+--------------------+---------+
When a query involves multiple window functions, it is possible to write out each one with a separate OVER clause, but this is duplicative and error-prone if the same windowing behavior is wanted for several functions. Instead, each windowing behavior can be named in a WINDOW clause and then referenced in OVER. For example:
SELECT sum(salary) OVER w, avg(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
The syntax for the OVER-clause is
function([expr])
OVER(
[PARTITION BY expr[, …]]
[ORDER BY expr [ ASC | DESC ][, …]]
[ frame_clause ]
)
where frame_clause is one of:
{ RANGE | ROWS | GROUPS } frame_start
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
and frame_start and frame_end can be one of
UNBOUNDED PRECEDING offset PRECEDING CURRENT ROW offset FOLLOWING UNBOUNDED FOLLOWING
where offset is an non-negative integer.
RANGE and GROUPS modes require an ORDER BY clause (with RANGE the ORDER BY must specify exactly one column).
All aggregate functions can be used as window functions.
cume_distRelative rank of the current row: (number of rows preceding or peer with current row) / (total rows).
cume_dist()
dense_rankReturns the rank of the current row without gaps. This function ranks rows in a dense manner, meaning consecutive ranks are assigned even for identical values.
dense_rank()
ntileInteger ranging from 1 to the argument value, dividing the partition as equally as possible
ntile(expression)
percent_rankReturns the percentage rank of the current row within its partition. The value ranges from 0 to 1 and is computed as (rank - 1) / (total_rows - 1).
percent_rank()
rankReturns the rank of the current row within its partition, allowing gaps between ranks. This function provides a ranking similar to row_number, but skips ranks for identical values.
rank()
row_numberNumber of the current row within its partition, counting from 1.
row_number()
lagReturns value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default (which must be of the same type as value).
lag(expression, offset, default)
leadReturns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default (which must be of the same type as value).
lead(expression, offset, default)