Cumulative sum in a period of months

2 votes
1 answer
1073 views
I have this table:

| month_rep  | fruits | harvested |
| ---------- | ------ | --------- |
| 2021-09-01 | 139    | 139       |
| 2021-10-01 | 143    | 11        |
| 2021-11-01 | 152    | 14        |
| 2021-12-01 | 112    | 9         |
| 2022-01-01 | 133    | 10        |
| 2022-02-01 | 145    | 12        |
| 2022-03-01 | 123    | 5         |
| 2022-04-01 | 111    | 4         |
| 2022-05-01 | 164    | 9         |
| 2022-06-01 | 135    | 12        |
| 2022-07-01 | 124    | 14        |
| 2022-08-01 | 144    | 18        |
| 2022-09-01 | 111    | 111       |
| 2022-10-01 | 108    | 13        |
| 2022-11-01 | 123    | 7         |
| 2022-12-01 | 132    | 20        |

I want to create a new column called sold, calculated from a running sum of harvested over a period of months (Sep-Jun). Every September, sold will always be 1 (or 100 in percent).

The calculation for Oct 2021 will be fruits / (harvested + harvested_Sep) = 143 / (11 + 139). The rest of the months in the period follow the same format: fruits / (harvested + harvested_until_Sep). This is a running sum that starts from the month you're in and goes back to the September that opened the period.

Another example is the calculation for Mar 2022 = fruits / (harvested + harvested_Feb_2022 + harvested_Jan_2022 + harvested_Dec_2021 + harvested_Nov_2021 + harvested_Oct_2021 + harvested_Sep_2021) = 123 / (5+12+10+9+14+11+139).

The table should look like this:

| month_rep  | fruits | harvested | sold |
| ---------- | ------ | --------- | ---- |
| 2021-09-01 | 139    | 139       | 1    |
| 2021-10-01 | 143    | 11        | 0.95 |
| 2021-11-01 | 152    | 14        | 0.93 |
| 2021-12-01 | 112    | 9         | 0.65 |
| 2022-01-01 | 133    | 10        | ..   |
| 2022-02-01 | 145    | 12        | ..   |
| 2022-03-01 | 123    | 5         | ..   |
| 2022-04-01 | 111    | 4         | ..   |
| 2022-05-01 | 164    | 9         | ..   |
| 2022-06-01 | 135    | 12        | ..   |
| 2022-07-01 | 124    | 14        | null |
| 2022-08-01 | 144    | 18        | null |
| 2022-09-01 | 111    | 111       | 1    |
| 2022-10-01 | 108    | 13        | 0.87 |
| 2022-11-01 | 123    | 7         | 0.94 |
| 2022-12-01 | 132    | 20        | ..   |

I tried this:
select
    month_rep,
    fruits,
    harvested,
    case when extract(month from "month_rep") in (7, 8) then null
         when extract(month from "month_rep") = 9 then 1
         else ROUND(fruits / sum(harvested) over (order by month_rep), 2) end sold
from my_table
This works well, but only for data before September 2022. I want Jul and Aug to have a null sold, which works. After Aug, Sep 2022 should start a new period where sold is 1. After that, Oct 2022 should be calculated as fruits / (harvested + harvested_Sep_2022), because a second period (Sep 2022 - Jun 2023) has begun. Is there a way to group these "periods" and compute the running sum within each one? I might need to derive a period key and use it in partition by.
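One possible way to build such a period key, as a minimal sketch rather than a definitive answer: count how many Septembers have been seen so far with a windowed sum, then partition the running sum of harvested by that count. To keep the sketch runnable on its own it is wrapped in Python's standard `sqlite3` module, which is an assumption of this example (it needs an SQLite build with window functions, 3.25 or later). For the same reason it uses `strftime('%m', ...)` where the query above uses `extract(month from ...)`. The names `flagged`, `month_no`, `period_id`, and `sold_by_month` are hypothetical.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("2021-09-01", 139, 139), ("2021-10-01", 143, 11),
    ("2021-11-01", 152, 14),  ("2021-12-01", 112, 9),
    ("2022-01-01", 133, 10),  ("2022-02-01", 145, 12),
    ("2022-03-01", 123, 5),   ("2022-04-01", 111, 4),
    ("2022-05-01", 164, 9),   ("2022-06-01", 135, 12),
    ("2022-07-01", 124, 14),  ("2022-08-01", 144, 18),
    ("2022-09-01", 111, 111), ("2022-10-01", 108, 13),
    ("2022-11-01", 123, 7),   ("2022-12-01", 132, 20),
]

QUERY = """
WITH flagged AS (
    SELECT
        month_rep,
        fruits,
        harvested,
        CAST(strftime('%m', month_rep) AS INTEGER) AS month_no,
        -- Every September opens a new period, so the number of Septembers
        -- seen so far (in date order) works as a period key.
        SUM(CASE WHEN strftime('%m', month_rep) = '09' THEN 1 ELSE 0 END)
            OVER (ORDER BY month_rep) AS period_id
    FROM my_table
)
SELECT
    month_rep,
    CASE
        WHEN month_no IN (7, 8) THEN NULL
        WHEN month_no = 9 THEN 1
        -- 1.0 * forces non-integer division; the running sum restarts
        -- at each September because of the PARTITION BY.
        ELSE ROUND(
            1.0 * fruits
            / SUM(harvested) OVER (PARTITION BY period_id ORDER BY month_rep),
            2)
    END AS sold
FROM flagged
ORDER BY month_rep
"""


def sold_by_month(rows=ROWS):
    """Load the rows into an in-memory table and return {month_rep: sold}."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE my_table "
            "(month_rep TEXT, fruits INTEGER, harvested INTEGER)"
        )
        conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
        return {month: sold for month, sold in conn.execute(QUERY)}
    finally:
        conn.close()


if __name__ == "__main__":
    for month, sold in sold_by_month().items():
        print(month, sold)
```

Jul and Aug still fall inside the preceding period's partition, but their sold is forced to null and the next September starts a fresh partition, so their harvested values never reach a displayed result. Any rows before the first September would share `period_id` 0. The same CTE shape should carry over to the original dialect by swapping the `strftime` calls back to `extract(month from month_rep)`, though that variant is not tested here.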
Asked by Jason (23 rep)
Aug 1, 2022, 10:45 PM
Last activity: Aug 3, 2022, 03:37 AM