Why do two queries run faster than combined subquery?

3 votes
1 answer
219 views
                          I'm running postgres 11 on Azure.

If I run this query:

    select min(pricedate) + interval '2 days' from pjm.rtprices

It takes 0.153 sec and has the following explain:

		"Result  (cost=2.19..2.20 rows=1 width=8)"
		"  InitPlan 1 (returns $0)"
		"    ->  Limit  (cost=0.56..2.19 rows=1 width=4)"
		"          ->  Index Only Scan using rtprices_pkey on rtprices  (cost=0.56..103248504.36 rows=63502562 width=4)"
		"                Index Cond: (pricedate IS NOT NULL)"



If I run this query:

		select pricedate, hour, last_updated, count(1) as N 
		from pjm.rtprices
		where pricedate  Sort  (cost=738576.82..739570.68 rows=397541 width=16)"
		"        Sort Key: pricedate DESC, hour, last_updated"
		"        ->  Index Scan using rtprices_pkey on rtprices  (cost=0.56..694807.03 rows=397541 width=16)"
		"              Index Cond: (pricedate   Result  (cost=2.19..2.20 rows=1 width=8)"
		"          InitPlan 1 (returns $0)"
		"            ->  Limit  (cost=0.56..2.19 rows=1 width=4)"
		"                  ->  Index Only Scan using rtprices_pkey on rtprices rtprices_1  (cost=0.56..103683459.22 rows=63730959 width=4)"
		"                        Index Cond: (pricedate IS NOT NULL)"
		"  ->  Gather Merge  (cost=3791454.84..4662729.67 rows=6316230 width=24)"
		"        Workers Planned: 2"
		"        Params Evaluated: $1"
		"        ->  Partial GroupAggregate  (cost=3790454.81..3932679.99 rows=3158115 width=24)"
		"              Group Key: rtprices.pricedate, rtprices.hour, rtprices.last_updated"
		"              ->  Sort  (cost=3790454.81..3812583.62 rows=8851522 width=16)"
		"                    Sort Key: rtprices.pricedate DESC, rtprices.hour, rtprices.last_updated"
		"                    ->  Parallel Seq Scan on rtprices  (cost=0.00..2466553.08 rows=8851522 width=16)"
		"                          Filter: (pricedate   Limit  (cost=0.56..1629038.11 rows=3 width=4)"
		"          ->  Result  (cost=0.56..105887441.26 rows=195 width=4)"
		"                ->  Unique  (cost=0.56..105887441.26 rows=195 width=4)"
		"                      ->  Index Only Scan using rtprices_pkey on rtprices rtprices_1  (cost=0.56..105725202.47 rows=64895517 width=4)"
		"  ->  Sort  (cost=648411.43..649243.43 rows=332798 width=16)"
		"        Sort Key: rtprices.pricedate DESC, rtprices.hour, rtprices.last_updated"
		"        ->  Nested Loop  (cost=0.56..612199.22 rows=332798 width=16)"
		"              ->  CTE Scan on lastday  (cost=0.00..0.06 rows=3 width=4)"
		"              ->  Index Scan using rtprices_pkey on rtprices  (cost=0.56..202957.06 rows=110933 width=16)"
		"                    Index Cond: ((pricedate <= lastday.pricedate) AND (pricedate = lastday.pricedate))"


This last one is all well and good but if my subquery wasn't extensible to this hack, is there a better way for my subquery to have similar performance to the one at a time approach?
                        
Asked by Dean MacGregor (739 rep)
Jun 4, 2021, 09:10 PM
Last activity: Jun 7, 2025, 04:03 AM
Why do two queries run faster than combined subquery?

Related Questions