PostgreSQL cumulative counts in date range

I'm trying to get the cumulative count of rows (by group_id) between two dates that represent a time period where the row was active. I have a table like this:
 group_id |   id   | type | start_date |  end_date
----------+--------+------+------------+------------
       33 | 119435 | AAA  | 2013-05-21 | 2014-05-19
       33 |  15144 | AAA  | 2013-05-21 | 2015-05-18
       33 |  29393 | AAA  | 2013-05-21 | 2016-05-23
       33 | 119437 | AAA  | 2013-05-21 | 2017-05-15
       33 |  62380 | AAA  | 2013-05-21 | 2099-12-31
       33 | 119436 | AAA  | 2013-05-21 | 2099-12-31
       33 |  27346 | AAA  | 2013-05-21 | 2099-12-31
       33 |  28529 | AAA  | 2014-05-20 | 2099-12-31
       33 | 221576 | AAA  | 2015-05-19 | 2099-12-31
       33 | 253893 | AAA  | 2016-05-24 | 2099-12-31
       33 | 251589 | AAA  | 2017-05-16 | 2099-12-31
       33 | 285245 | AAA  | 2019-01-24 | 2099-12-31
       34 | 253893 | AAA  | 2016-05-24 | 2099-12-31
       34 | 251589 | AAA  | 2017-05-16 | 2099-12-31
       34 | 285245 | AAA  | 2019-01-24 | 2099-12-31
       34 | 285246 | AAA  | 2019-05-31 | 2099-12-31
... and I need to get active counts for each of those date ranges like this:
 group_id | start_date |  end_date  | active
----------+------------+------------+--------
       33 | 2013-05-21 | 2014-05-19 |      7
       33 | 2013-05-21 | 2015-05-18 |      8
       33 | 2013-05-21 | 2016-05-23 |      9
       33 | 2013-05-21 | 2017-05-15 |     10
       33 | 2013-05-21 | 2099-12-31 |     12
       33 | 2013-05-21 | 2099-12-31 |     12
       33 | 2013-05-21 | 2099-12-31 |     12
       33 | 2014-05-20 | 2099-12-31 |     11
       33 | 2015-05-19 | 2099-12-31 |     10
       33 | 2016-05-24 | 2099-12-31 |      9
       33 | 2017-05-16 | 2099-12-31 |      8
       33 | 2019-01-24 | 2099-12-31 |      8
       34 | 2016-05-24 | 2099-12-31 |      1
       34 | 2017-05-16 | 2099-12-31 |      2
       34 | 2019-01-24 | 2099-12-31 |      3
       34 | 2019-05-31 | 2099-12-31 |      4
I've tried various combinations of LAG and LEAD, with and without CTEs, but cannot come up with a solution. Is there a way to do this in a single query? If not, perhaps a combination of queries in a UDF?

**UPDATE** Per @jjanes's comment below, I believe my source table is set up incorrectly. I think I should create the source table like this instead:
 group_id |   id   | type | start_date |  end_date
----------+--------+------+------------+------------
  ... (skipped group 33) ...
       34 | 253893 | AAA  | 2016-05-24 | 2017-05-15
       34 | 253893 | AAA  | 2017-05-16 | 2019-01-23
       34 | 253893 | AAA  | 2019-01-24 | 2019-05-30
       34 | 253893 | AAA  | 2019-05-31 | 2099-12-31
       34 | 251589 | AAA  | 2017-05-16 | 2019-01-23
       34 | 251589 | AAA  | 2019-01-24 | 2019-05-30
       34 | 251589 | AAA  | 2019-05-31 | 2099-12-31
       34 | 285245 | AAA  | 2019-01-24 | 2019-05-30
       34 | 285245 | AAA  | 2019-05-31 | 2099-12-31
       34 | 285246 | AAA  | 2019-05-31 | 2099-12-31
With that change in the source data, the active counts (showing only group 34 here) would be:
 group_id | start_date |  end_date  | active
----------+------------+------------+--------
       34 | 2016-05-24 | 2017-05-15 |      1
       34 | 2017-05-16 | 2019-01-23 |      2
       34 | 2019-01-24 | 2019-05-30 |      3
       34 | 2019-05-31 | 2099-12-31 |      4
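If that restructuring is right, the active count may reduce to counting rows per (group_id, start_date, end_date). Below is a minimal sketch of that idea, run through Python's sqlite3 module as a stand-in for PostgreSQL. The table name `memberships` is hypothetical, and only the group 34 rows shown above are loaded. The query itself is plain SQL and should behave the same in PostgreSQL, though I have only reasoned about it against this sample.

```python
import sqlite3

# Group 34 rows from the restructured table above: one row per id per period.
ROWS = [
    (34, 253893, "AAA", "2016-05-24", "2017-05-15"),
    (34, 253893, "AAA", "2017-05-16", "2019-01-23"),
    (34, 253893, "AAA", "2019-01-24", "2019-05-30"),
    (34, 253893, "AAA", "2019-05-31", "2099-12-31"),
    (34, 251589, "AAA", "2017-05-16", "2019-01-23"),
    (34, 251589, "AAA", "2019-01-24", "2019-05-30"),
    (34, 251589, "AAA", "2019-05-31", "2099-12-31"),
    (34, 285245, "AAA", "2019-01-24", "2019-05-30"),
    (34, 285245, "AAA", "2019-05-31", "2099-12-31"),
    (34, 285246, "AAA", "2019-05-31", "2099-12-31"),
]

# Each period is one (group_id, start_date, end_date) combination, so the
# number of rows sharing that combination is the number of active ids.
QUERY = """
SELECT group_id, start_date, end_date, COUNT(*) AS active
FROM memberships  -- hypothetical table name
GROUP BY group_id, start_date, end_date
ORDER BY group_id, start_date
"""


def active_counts(rows):
    """Load rows into an in-memory table and return the per-period counts."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE memberships ("
            "group_id INTEGER, id INTEGER, type TEXT, "
            "start_date TEXT, end_date TEXT)"
        )
        conn.executemany("INSERT INTO memberships VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in active_counts(ROWS):
        print(row)
```

Dates are stored as ISO-8601 text here only because SQLite has no date type; in PostgreSQL the columns would be `date` and the grouping would work the same way.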
Asked by jacaetevha (1 rep)
Mar 21, 2020, 08:20 PM
Last activity: Jun 9, 2025, 09:01 AM