PostgreSQL cumulative counts in date range
I'm trying to get the cumulative count of rows (by `group_id`) between two dates that represent a time period where the row was active.
I have a table like this:
group_id | id | type | start_date | end_date
----------+--------+------+------------+------------
33 | 119435 | AAA | 2013-05-21 | 2014-05-19
33 | 15144 | AAA | 2013-05-21 | 2015-05-18
33 | 29393 | AAA | 2013-05-21 | 2016-05-23
33 | 119437 | AAA | 2013-05-21 | 2017-05-15
33 | 62380 | AAA | 2013-05-21 | 2099-12-31
33 | 119436 | AAA | 2013-05-21 | 2099-12-31
33 | 27346 | AAA | 2013-05-21 | 2099-12-31
33 | 28529 | AAA | 2014-05-20 | 2099-12-31
33 | 221576 | AAA | 2015-05-19 | 2099-12-31
33 | 253893 | AAA | 2016-05-24 | 2099-12-31
33 | 251589 | AAA | 2017-05-16 | 2099-12-31
33 | 285245 | AAA | 2019-01-24 | 2099-12-31
34 | 253893 | AAA | 2016-05-24 | 2099-12-31
34 | 251589 | AAA | 2017-05-16 | 2099-12-31
34 | 285245 | AAA | 2019-01-24 | 2099-12-31
34 | 285246 | AAA | 2019-05-31 | 2099-12-31
... and I need to get active counts for each of those date ranges like this:
group_id | start_date | end_date | active
----------+------------+------------+--------
33 | 2013-05-21 | 2014-05-19 | 7
33 | 2013-05-21 | 2015-05-18 | 8
33 | 2013-05-21 | 2016-05-23 | 9
33 | 2013-05-21 | 2017-05-15 | 10
33 | 2013-05-21 | 2099-12-31 | 12
33 | 2013-05-21 | 2099-12-31 | 12
33 | 2013-05-21 | 2099-12-31 | 12
33 | 2014-05-20 | 2099-12-31 | 11
33 | 2015-05-19 | 2099-12-31 | 10
33 | 2016-05-24 | 2099-12-31 | 9
33 | 2017-05-16 | 2099-12-31 | 8
33 | 2019-01-24 | 2099-12-31 | 8
34 | 2016-05-24 | 2099-12-31 | 1
34 | 2017-05-16 | 2099-12-31 | 2
34 | 2019-01-24 | 2099-12-31 | 3
34 | 2019-05-31 | 2099-12-31 | 4
I've tried various combinations of `LAG` and `LEAD`, with and without CTEs, but cannot come up with a solution.
Is there a way to do this in a single query? If not a single query, perhaps a combination of queries in a UDF?
**UPDATE**
Per @jjanes's comment below, I believe my source table is set up incorrectly. I think I should create the source table like this instead:
group_id | id | type | start_date | end_date
----------+--------+------+------------+------------
... (skipped group 33) ...
34 | 253893 | AAA | 2016-05-24 | 2017-05-15
34 | 253893 | AAA | 2017-05-16 | 2019-01-23
34 | 253893 | AAA | 2019-01-24 | 2019-05-30
34 | 253893 | AAA | 2019-05-31 | 2099-12-31
34 | 251589 | AAA | 2017-05-16 | 2019-01-23
34 | 251589 | AAA | 2019-01-24 | 2019-05-30
34 | 251589 | AAA | 2019-05-31 | 2099-12-31
34 | 285245 | AAA | 2019-01-24 | 2019-05-30
34 | 285245 | AAA | 2019-05-31 | 2099-12-31
34 | 285246 | AAA | 2019-05-31 | 2099-12-31
With that change in the source data, the outcome of actives (showing only group 34 here) would be like this:
group_id | start_date | end_date | active
----------+------------+------------+--------
34 | 2016-05-24 | 2017-05-15 | 1
34 | 2017-05-16 | 2019-01-23 | 2
34 | 2019-01-24 | 2019-05-30 | 3
34 | 2019-05-31 | 2099-12-31 | 4
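For reference, the per-period counts above can probably be derived from the original table without restructuring it. The following is a minimal sketch, not a tested solution: the table name `memberships` is a placeholder I made up, and the sample rows are only the group 34 rows from the original table. It builds period boundaries from every `start_date` and every `end_date + 1`, pairs each boundary with the next one using `LEAD`, and counts the rows that are active on the first day of each period.

```sql
-- Hypothetical table name; columns follow the table shown in the question.
CREATE TEMP TABLE memberships (
    group_id   integer,
    id         integer,
    type       text,
    start_date date,
    end_date   date
);

-- Group 34 rows copied from the original source table.
INSERT INTO memberships VALUES
    (34, 253893, 'AAA', '2016-05-24', '2099-12-31'),
    (34, 251589, 'AAA', '2017-05-16', '2099-12-31'),
    (34, 285245, 'AAA', '2019-01-24', '2099-12-31'),
    (34, 285246, 'AAA', '2019-05-31', '2099-12-31');

WITH boundaries AS (
    -- A new period begins whenever a row starts or the day after a row ends.
    -- UNION (not UNION ALL) removes duplicate boundary dates.
    SELECT group_id, start_date AS d FROM memberships
    UNION
    SELECT group_id, end_date + 1 FROM memberships
),
periods AS (
    -- Each period runs from one boundary to the day before the next one.
    SELECT group_id,
           d AS start_date,
           LEAD(d) OVER (PARTITION BY group_id ORDER BY d) - 1 AS end_date
    FROM boundaries
)
SELECT p.group_id, p.start_date, p.end_date, count(*) AS active
FROM periods p
JOIN memberships m
  ON m.group_id = p.group_id
 AND m.start_date <= p.start_date   -- row already started
 AND m.end_date   >= p.start_date   -- row not yet ended
WHERE p.end_date IS NOT NULL        -- drop the open-ended final boundary
GROUP BY p.group_id, p.start_date, p.end_date
ORDER BY p.group_id, p.start_date;
```

With the four sample rows this should return the four group 34 periods listed above, with `active` values 1 through 4. Because of the inner join, a period in which no row is active would be omitted rather than reported as 0; a `LEFT JOIN` with `count(m.id)` would keep such gaps if they matter.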
Asked by jacaetevha
(1 rep)
Mar 21, 2020, 08:20 PM
Last activity: Jun 9, 2025, 09:01 AM