What are SQL Window Functions? | Complete Beginner’s Guide

In this article, we will see what are the SQL Window Functions and common issues that we face with practical examples.

SQL Window Functions

We have 3 different categories of window functions, which are given below.

With all these analytical functions we can use the over by clause.

For our better understanding we will write some queries and see the behaviour of these functions.

Interview point of view these are very important.

Let us consider the below table.

Employee IDNameWork LocationBonus
1JohnHome100
2PeterHome200
3GeethaHome300
4SivaHome400
5JoeHome500
6GaryOffice600
7ThomasOffice700
8NaniOffice800
9DivyaOffice900
10RohanOffice1000
SQL Window Functions Examples

Now let us write a query to get the average on bonus column

SELECT AVG(Bonus) From Employee;

Result for this query will be

AVG(Bonus)
550

Now we will write another query to capture all other records from the employee table along with average.

SELECT 
Employee ID, Name, Work, Bonus, 
AVG(Bonus) 
FROM Employee;

Results for the above query will be

Employee
ID
NameWorkBonusAVG(Bonus)
1JohnHome100100
2PeterHome200150
3GeethaHome300200
4SivaHome400250
5JoeHome500300
6GaryOffice600350
7ThomasOffice700400
8NaniOffice800450
9DivyaOffice900500
10RohanOffice1000550
Employee table along with average

What happened here?. If you can obeserve, I got a different average salary for each employee. The last employee I got the correct value which is 550.

Let us see the below diagram to understand how this average is being computed.

SQL Window Functions Average Bonus Calculation
SQL Window Functions Average Bonus Calculation

Here we have not specified the explicit value for the rows or arrange clause. So that is the reason all average values for the bonus column is being computed differently except the last value.

Since we have not specified explicit value it is going to take the default value.

It is going to take the Unbounded Preceding and Current Row.

Let us understand what this value means.

Say for example this average function is being applied to the second row. In that context what is this range between unbounded preceding and current row?

Current row is the one which average being computed and unbounded preceding is the window for this average function starts at the first row within the result set.

Now the order by clause is emposed on the rows for the employee table that is our result set.

Within the orderset results unbounded preceding means the window starts for this average function at the first row.

When it is in the second row the range of average is for the first row and current row.

Similarly when we are on the third row the range is first row and the current row which is third row. So average of these three salary would be captured for the third row.

At the moment this is how it is being calculated. Now look at the last row it is calculated correctly. Why because it is going to calculate the average for all records.

We will have the same problem whenever we use the any of window functions that we have aggregate functions, Ranking Function and Analytical Functions because of the default value for that clause.

Now in the query we will also take Count and SUM

For that our query will be

SELECT Name, Work, Bonus,
AVG(Bonus) OVER( ORDER BY BONUS) AS [Average]
COUNT(Bonus) OVER(ORDER BY BONUS) AS [Count]
SUM(Bonus) OVER(ORDER BY BONUS) AS [SUM]
FROM
Employee

Results for this query will be

IDNameWorkBonusAVG(Bonus)COUNT(Bonus)SumBonus)
1JohanHome1001001100
2PeterHome2001502300
3GeethaHome3002003600
4SivaHome40025041000
5JoeHome50030051500
6GaryOffice60035062100
7ThomasOffice 70040072800
8NaniOffice 80045083600
9DivyaOffice 90050094500
10RohanOffice 1000550105500
Average, Count, Sum – Table

The window for these functions is, It has to starts with the first row and ends with the last row.

But what is the default value here? Which is the range between unbounded preceding and current row.

Now i am going to change that range between Unbounded Preceding And Unbounded Following. Means window ends with the last row within the result set.

Now let us writte a query for this

SELECT Name, Work, Bonus,
AVG(Bonus) OVER( ORDER BY BONUS ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS [Average], 

COUNT(Bonus) OVER(ORDER BY BONUS ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [Count], 

SUM(Bonus) OVER(ORDER BY BONUS ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [SUM]

FROM

Employee

Results for this query will be

IDNameWorkBonusAVG(Bonus)COUNT(Bonus)SUM(Bonus)
1JohnHome100550105500
2PeterHome200550105500
3GeethaHome300550105500
4SivaHome400550105500
5JoeHome500550105500
6GaryOffice600550105500
7ThomasOffice700550105500
8NaniOffice 800550105500
9DivyaOffice 900550105500
10RohanOffice 1000550105500

What we achieved here. Total count, Total average and Total sum for all rows. This is what we expected here.

At the moment there is no partition. If the partition is involved this is going to slightly change.

Now let us take the partition by work. Here window starts with partition first row and ends with the partition last row.

Let us write a query to achieve this

SELECT Name, Work, Bonus,
AVG(Bonus) OVER( PARTITION BY Work ORDER BY BONUS ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS [Average], 
COUNT(Bonus) OVER( PARTITION BY Work ORDER BY BONUS ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [Count], 
SUM(Bonus) OVER(PARTITION BY Work ORDER BY BONUS ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS [SUM]
FROM
Employee

Results for this query will be

IDNameWorkBonusAVG(Bonus)COUNT(Bonus)SUM(Bonus)
1JohnHome10030051500
2PeterHome20030051500
3GeethaHome30030051500
4SivaHome40030051500
5JoeHome50030051500
6GaryOffice60080054000
7ThomasOffice70080054000
8NaniOffice80080054000
9DivyaOffice90080054000
10RohanOffice100080054000

Now there is one more thing here I want to calculate the one row preceding and one row following how i can do that?

Let us see.

If you can take an average. Consider I am in third row and I want to calculate average between second row and fourth row.

Let us look at the first row we dont have any row before that and one row after that it calculate average, sum and count respectively .

Let us write a query for this

SELECT Name, Work, Bonus,
AVG(Bonus) OVER( PARTITION BY Work ORDER BY BONUS ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS [Average], COUNT(Bonus) OVER( PARTITION BY Work ORDER BY BONUS ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS [Count],
SUM(Bonus) OVER(PARTITION BY Work ORDER BY BONUS ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS [SUM]
FROM
Employee

Results for this query will be

IDNameWorkBonusAVG(Bonus)Count(Bonus)SUM(Bonus)
1JohnHome1001502300
2PeterHome2002003600
3Geetha Home3003003900
4SivaHome40040031200
5JoeHome5004502900
6GaryOffice 60065021300
7ThomasOffice 70070032100
8NaniOffice 80080032400
9DivyaOffice 90090032700
10RohanOffice 100095021900

Here you need to calculate the result set very carefully because partition is applied on the work column. There are two sets here one is Home and other one is Office. When new partition starts you need to calculate it separately.

I hope this article is useful to you. Please leave your comment in the comment section below.

You can read related article SQL Date Functions here.

Leave a comment