
How to select the lowest date in one field that is higher than the date in another field

0 votes
2 answers
2197 views
The aim below isn't actually what my query is for, but I'm using it as an analogy to explain more simply what I am trying to achieve.

I am trying to build a BigQuery script which looks at all arcade machines that have had a vandalism alarm triggered. It then identifies the machines which have either had no vends since the alarm, or whose first vend was more than 28 days after the alarm was triggered, in order to find machines where free use could have occurred after the alarm.

So far, I have highlighted all of these instances. However, where I have joined the payments table to the query, every vend made more than 28 days after the alarm is being returned for each machine, whereas I am only interested in the first vend after the alarm.

I have 3 tables:

alarm

| machine_num | alarm_date |
| ----------- | ---------- |
| 111         | 2022-01-20 |
| 222         | 2022-01-20 |
| 123         | 2022-01-20 |
| 456         | 2022-01-20 |

customer

| cust_num | machine_num |
| -------- | ----------- |
| 1        | 111         |
| 2        | 222         |
| 3        | 123         |
| 4        | 456         |

payments

| cust_num | vend_date  |
| -------- | ---------- |
| 1        | 2022-01-10 |
| 1        | 2022-01-21 |
| 1        | 2022-02-21 |
| 2        | 2022-01-11 |
| 2        | 2022-01-19 |
| 3        | 2022-01-01 |
| 3        | 2022-01-10 |
| 3        | 2022-03-01 |
| 3        | 2022-03-03 |
| 3        | 2022-03-04 |
| 4        | 2022-01-19 |
| 4        | 2022-04-20 |
| 4        | 2022-04-21 |

So in this case:

- cust_num "1" wouldn't be returned, as there was a vend less than 28 days after the alarm
- cust_num "2" would be returned with the vend date as NULL, since no vends have been made since the alarm
- cust_num "3" would be returned with the vend date as "2022-03-01", since this is the first vend after the alarm
- cust_num "4" would be returned with the vend date as "2022-04-20", since this is the first vend after the alarm

I need to return all four fields, so based on the above example, my output would be:

| cust_num | machine_num | alarm_date | vend_date  |
| -------- | ----------- | ---------- | ---------- |
| 2        | 222         | 2022-01-20 | NULL       |
| 3        | 123         | 2022-01-20 | 2022-03-01 |
| 4        | 456         | 2022-01-20 | 2022-04-20 |

I've tried adding a sub-query in my select statement similar to the below:
(SELECT
MIN(vend_date)
FROM payments AS paymin
WHERE paymin.vend_date > alarm_date)
AS vend_date
However, when I add the subquery to my existing query, BigQuery runs for longer than I have the patience to wait.

I've never asked for help on one of these sites before, so apologies if I am asking in the wrong place or in the wrong way! I'm still relatively new to BQ and work very distantly from any analysts in the business. Any help is really appreciated!

Cheers

EDIT: I gave up on the subquery; it was too intensive, as it was querying almost 2 million rows! I tried using a simple MIN and grouping everything together, similar to this simplified example:
SELECT
    payments.cust_num,
    alarm.machine_num,
    p.PAN,
    alarm.alarm_date,
    MIN(payments.vend_date) AS vend_Date
FROM alarm
LEFT JOIN customer
    ON alarm.machine_num = customer.machine_num
INNER JOIN payments
    ON customer.cust_num = payments.cust_num
WHERE
    vend_Date > DATE_ADD(alarm.alarm_date, INTERVAL +28 DAY)
    OR vend_Date IS NULL
GROUP BY payments.cust_num, alarm.machine_num, alarm.alarm_date
ORDER BY payments.cust_num
However, this isn't picking up instances where there is no vend_date after the alarm_date (where the result should be NULL). It also returns the first vend date that is more than 28 days after the alarm date for every account, rather than only the accounts where the first vend_date after the alarm is more than 28 days after the alarm_date.
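One possible way to express the intended logic against the sample data is sketched below. This is a minimal sketch rather than tested BigQuery code: it loads the three sample tables into an in-memory SQLite database through Python's `sqlite3` module so that it can be run anywhere. The idea is to put the "after the alarm" condition in the `ON` clause of a `LEFT JOIN`, so that machines with no later vend survive with a NULL, and then to filter on the aggregated `MIN` in `HAVING`. Two assumptions are made here: SQLite's `date(alarm_date, '+28 days')` stands in for BigQuery's `DATE_ADD(alarm.alarm_date, INTERVAL 28 DAY)`, and a vend on the same day as the alarm is not counted as "after" it (strict `>`, as in the subquery attempt).

```python
import sqlite3

# In-memory database holding the sample tables from the question.
# Dates are stored as ISO-8601 text, which compares correctly as strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE alarm    (machine_num INTEGER, alarm_date TEXT);
CREATE TABLE customer (cust_num INTEGER, machine_num INTEGER);
CREATE TABLE payments (cust_num INTEGER, vend_date TEXT);

INSERT INTO alarm VALUES
    (111, '2022-01-20'), (222, '2022-01-20'),
    (123, '2022-01-20'), (456, '2022-01-20');

INSERT INTO customer VALUES (1, 111), (2, 222), (3, 123), (4, 456);

INSERT INTO payments VALUES
    (1, '2022-01-10'), (1, '2022-01-21'), (1, '2022-02-21'),
    (2, '2022-01-11'), (2, '2022-01-19'),
    (3, '2022-01-01'), (3, '2022-01-10'), (3, '2022-03-01'),
    (3, '2022-03-03'), (3, '2022-03-04'),
    (4, '2022-01-19'), (4, '2022-04-20'), (4, '2022-04-21');
""")

# The date condition lives in the ON clause, not in WHERE, so an alarm with
# no later vend still produces one row (with vend_date NULL) instead of
# being filtered out. HAVING then keeps only the groups whose FIRST vend
# after the alarm is missing or more than 28 days later.
QUERY = """
SELECT
    c.cust_num,
    a.machine_num,
    a.alarm_date,
    MIN(p.vend_date) AS vend_date
FROM alarm AS a
LEFT JOIN customer AS c
    ON a.machine_num = c.machine_num
LEFT JOIN payments AS p
    ON p.cust_num = c.cust_num
   AND p.vend_date > a.alarm_date
GROUP BY c.cust_num, a.machine_num, a.alarm_date
HAVING MIN(p.vend_date) IS NULL
    OR MIN(p.vend_date) > date(a.alarm_date, '+28 days')  -- SQLite stand-in for DATE_ADD
ORDER BY c.cust_num
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On the sample data this returns customers 2, 3 and 4, with a NULL vend date for customer 2, and leaves out customer 1. Because each payment row is joined once and aggregated, it avoids running a separate subquery per output row, though how it performs on roughly 2 million rows in BigQuery has not been checked.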
Asked by Phil Tomlinson (1 rep)
Sep 7, 2022, 03:51 PM
Last activity: Oct 14, 2022, 01:47 PM