SQL Window Functions are powerful tools used to perform calculations across a set of table rows that are related to the current row without collapsing the result into a single value. Unlike traditional aggregate functions, which return a single result for a group of rows, window functions preserve individual rows while adding aggregated data.
In this guide, we’ll explore what SQL Window Functions are, how they work, practical examples, and best practices to enhance your SQL skills.
1. What Are SQL Window Functions?
SQL Window Functions allow you to perform calculations such as running totals, moving averages, and ranking over a subset of rows. They are executed after the WHERE
, GROUP BY
, and HAVING
clauses but before the ORDER BY
clause.
Common SQL Window Functions:
ROW_NUMBER()
: Assigns a unique sequential integer to rows.RANK()
: Provides ranking, with gaps for identical values.DENSE_RANK()
: Similar toRANK()
but without gaps.SUM()
,AVG()
,MAX()
,MIN()
: Can be used as window functions.LEAD()
andLAG()
: Access subsequent or previous rows.NTILE()
: Divides rows into a specified number of buckets.
2. Syntax of SQL Window Functions
sqlCopyEditSELECT column_name,
function_name() OVER (PARTITION BY column_name ORDER BY column_name) AS alias_name
FROM table_name;
PARTITION BY
: Divides rows into groups (optional).ORDER BY
: Specifies the order of rows.OVER
: Defines the window of rows for the function to operate on.
3. Overview of Common SQL Window Functions
Function | Description | Example | Use Case |
---|---|---|---|
ROW_NUMBER() | Assigns a unique number to each row within a partition. | ROW_NUMBER() OVER (ORDER BY salary) | Employee ranking by salary. |
RANK() | Provides ranking with gaps for identical values. | RANK() OVER (ORDER BY salary DESC) | Ranking top performers with gaps. |
DENSE_RANK() | Ranks without gaps for identical values. | DENSE_RANK() OVER (ORDER BY salary DESC) | Continuous ranking of employees. |
SUM() OVER | Calculates running totals. | SUM(salary) OVER (ORDER BY id) | Cumulative salary sum. |
LEAD() | Accesses subsequent row values. | LEAD(salary, 1) OVER (ORDER BY id) | Compare current and next salary. |
LAG() | Accesses previous row values. | LAG(salary, 1) OVER (ORDER BY id) | Compare current and previous salary. |
4. Using ROW_NUMBER()
Function
Purpose: Assigns a unique sequential number to rows within a partition.
Syntax:
sqlCopyEditSELECT column_name,
ROW_NUMBER() OVER (ORDER BY column_name) AS row_num
FROM table_name;
Example: Get top 3 highest-paid employees:
sqlCopyEditSELECT employee_name, salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees
WHERE salary_rank <= 3;
5. Understanding RANK()
vs DENSE_RANK()
RANK() | DENSE_RANK() |
---|---|
Creates gaps for identical values. | No gaps for identical values. |
Useful for competitions with tied scores. | Useful for continuous ranking. |
Example:
sqlCopyEditSELECT employee_name, salary,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
Output:
Employee | Salary | RANK() | DENSE_RANK() |
---|---|---|---|
John | 90000 | 1 | 1 |
Alice | 85000 | 2 | 2 |
Bob | 85000 | 2 | 2 |
Carol | 80000 | 4 | 3 |
6. Using LEAD()
and LAG()
for Row Comparison
LEAD()
: Accesses a following row’s value.LAG()
: Accesses a previous row’s value.
Example: Salary difference from the previous year:
sqlCopyEditSELECT year, salary,
salary - LAG(salary) OVER (ORDER BY year) AS salary_diff
FROM salaries;
Output:
Year | Salary | Salary Diff |
---|---|---|
2021 | 50000 | NULL |
2022 | 55000 | 5000 |
2023 | 60000 | 5000 |
7. Using NTILE()
for Data Segmentation
Purpose: Divides rows into a specified number of equal buckets.
Example: Segment employees into 4 salary bands:
sqlCopyEditSELECT employee_name, salary,
NTILE(4) OVER (ORDER BY salary DESC) AS salary_band
FROM employees;
Output:
Employee | Salary | Salary Band |
---|---|---|
John | 90000 | 1 |
Alice | 85000 | 1 |
Bob | 85000 | 2 |
Carol | 80000 | 2 |
8. Best Practices for Using SQL Window Functions
- Avoid Using
DISTINCT
with Window Functions: It’s redundant and affects performance. - Use
PARTITION BY
Efficiently: Only when needed to improve performance. - Leverage Indexing: Speed up queries involving large datasets.
- Combine Multiple Window Functions: For complex analytics like moving averages and rankings.
9. Common Mistakes to Avoid
- Using
WHERE
instead ofHAVING
with Window Functions: ❌ Incorrect: sqlCopyEditSELECT employee_name, salary FROM employees WHERE ROW_NUMBER() OVER (ORDER BY salary DESC) <= 5;
✔️ Correct: sqlCopyEditSELECT employee_name, salary FROM ( SELECT employee_name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num FROM employees ) AS temp WHERE row_num <= 5;
- Forgetting
OVER()
Clause:- Window functions must include
OVER()
; otherwise, they behave like aggregate functions.
- Window functions must include
10. Frequently Asked Questions (FAQs)
Q1: Can we use multiple window functions in one query?
A: Yes, you can use multiple window functions in a single query to perform complex analytics.
Q2: What is the difference between PARTITION BY
and GROUP BY
?
A: PARTITION BY
keeps all rows intact and applies window functions within each partition, while GROUP BY
collapses rows into groups.
Final Thoughts
Mastering SQL Window Functions enables you to perform advanced analytics with ease. By leveraging functions like ROW_NUMBER()
, RANK()
, and LEAD()
, you can gain deeper insights into your data without sacrificing individual row visibility.
For more in-depth SQL tutorials, visit our SQL Community.
Explore additional resources on SQL Window Functions.