Understanding aggregate window functions

2 votes
1 answer
295 views
Consider the following table:
```sql
CREATE TABLE T1
(
  keycol INT         NOT NULL CONSTRAINT PK_T1 PRIMARY KEY,
  col1   VARCHAR(10) NOT NULL
);

INSERT INTO T1 VALUES
  (2, 'A'),(3, 'A'),
  (5, 'B'),(7, 'B'),(11, 'B'),
  (13, 'C'),(17, 'C'),(19, 'C'),(23, 'C');
```
Currently, I am looking into window functions and am trying out aggregate window functions. Although I feel I understand how the windows are defined with the `PARTITION BY` and `ORDER BY` clauses, I am unsure how aggregate window functions such as `AVG() OVER ()` are calculated. **I am looking to understand the following three queries**.
```sql
-- Query 1
SELECT keycol, col1, AVG(keycol) OVER (PARTITION BY col1) AS RowAvg
FROM T1
```
>
> keycol | col1 | RowAvg
> -----: | :--- | -----:
>      2 | A    |      2
>      3 | A    |      2
>      5 | B    |      7
>      7 | B    |      7
>     11 | B    |      7
>     13 | C    |     18
>     17 | C    |     18
>     19 | C    |     18
>     23 | C    |     18
> 
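
The arithmetic behind the Query 1 output can be checked outside the database with a minimal sketch. It assumes the engine returns an integer `AVG` for an `INT` column (as SQL Server does), modelled here as integer division of the partition sum by the partition row count. The names `rows` and `partition_int_avg` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Same rows as T1: (keycol, col1)
rows = [(2, "A"), (3, "A"),
        (5, "B"), (7, "B"), (11, "B"),
        (13, "C"), (17, "C"), (19, "C"), (23, "C")]

def partition_int_avg(rows):
    """Return {col1: (integer_avg, exact_avg)} computed over each whole partition."""
    groups = defaultdict(list)
    for keycol, col1 in rows:
        groups[col1].append(keycol)
    # Assumption: integer AVG behaves like SUM / COUNT with integer division.
    # For the positive keys used here, // (floor) and truncation give the same result.
    return {k: (sum(v) // len(v), sum(v) / len(v)) for k, v in groups.items()}

result = partition_int_avg(rows)
for col1, (integer_avg, exact_avg) in result.items():
    print(col1, integer_avg, round(exact_avg, 2))
# A 2 2.5
# B 7 7.67
# C 18 18.0
```

Under that assumption every row of a partition receives the same value, because without `ORDER BY` the window is the whole partition.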
```sql
-- Query 2
SELECT keycol, col1, AVG(keycol) OVER (ORDER BY keycol) AS RowAvg
FROM T1
```
>
> keycol | col1 | RowAvg
> -----: | :--- | -----:
>      2 | A    |      2
>      3 | A    |      2
>      5 | B    |      3
>      7 | B    |      4
>     11 | B    |      5
>     13 | C    |      6
>     17 | C    |      8
>     19 | C    |      9
>     23 | C    |     11
> 
```sql
-- Query 3
SELECT keycol, col1, AVG(keycol) OVER (PARTITION BY col1 ORDER BY keycol) AS RowAvg
FROM T1
```
>
> keycol | col1 | RowAvg
> -----: | :--- | -----:
>      2 | A    |      2
>      3 | A    |      2
>      5 | B    |      5
>      7 | B    |      6
>     11 | B    |      7
>     13 | C    |     13
>     17 | C    |     15
>     19 | C    |     16
>     23 | C    |     18
> 
**Query 1**: I believe RowAvg should be the average of keycol over the rows in each col1 partition. Are the numbers 2 and 7 the FLOOR of the average, or is my understanding incorrect?

**Query 2**: I am not sure what is being done to produce RowAvg here. As no `PARTITION BY` or framing is used, I believe the window should be the entire table; is this correct? Also, how is RowAvg being calculated?

**Query 3**: Is this finding the (FLOOR) average for each partition, but doing so incrementally? That is, for row 1 of the first partition ('A'), we find the average of that row. Then, for row 2 of the first partition, we find the average of the first 2 rows.

**General question**: Does introducing `ORDER BY` into the aggregate window function perform the aggregate function 'consecutively', as in queries 2 and 3? It is interesting to see that in query 1 `AVG` is applied to each partition as a whole, whereas in queries 2 and 3 RowAvg differs for almost every row.
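
One way to test the 'consecutive' reading is to recompute the RowAvg columns of queries 2 and 3 by hand. The sketch below rests on two assumptions: when `ORDER BY` is given without an explicit frame, the window runs from the first row of the partition up to the current row, and `AVG` over an `INT` column is an integer division of the running sum by the running row count. The helper `running_int_avg` and the list `rows` are hypothetical names used only for this illustration.

```python
from itertools import groupby

# Same rows as T1: (keycol, col1)
rows = [(2, "A"), (3, "A"),
        (5, "B"), (7, "B"), (11, "B"),
        (13, "C"), (17, "C"), (19, "C"), (23, "C")]

def running_int_avg(keys):
    """Integer average of keys[0..i] for each position i, in the given order."""
    out, total = [], 0
    for count, key in enumerate(keys, start=1):
        total += key
        # Assumption: integer AVG = running SUM / running COUNT, integer division.
        out.append(total // count)
    return out

# Query 2 analogue: a single window over the whole table, ordered by keycol.
query2 = running_int_avg(sorted(k for k, _ in rows))

# Query 3 analogue: the running average restarts in each col1 partition.
query3 = []
ordered = sorted(rows, key=lambda r: (r[1], r[0]))
for _, part in groupby(ordered, key=lambda r: r[1]):
    query3.extend(running_int_avg([k for k, _ in part]))

print(query2)  # [2, 2, 3, 4, 5, 6, 8, 9, 11]
print(query3)  # [2, 2, 5, 6, 7, 13, 15, 16, 18]
```

Both lists match the RowAvg columns shown for queries 2 and 3, which is consistent with a running frame rather than a whole-partition frame once `ORDER BY` is present.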
Asked by TMilliman (85 rep)
Dec 27, 2019, 12:59 AM
Last activity: Dec 27, 2019, 02:08 PM