EXISTS() vs EXISTS() = TRUE in Postgres

6 votes

1 answer

1019 views

postgresql postgresql-performance exists

Faced weird behaviour with EXISTS (also applies for NOT EXISTS) generating different execution plans for WHERE EXISTS(...)

EXPLAIN ANALYZE
SELECT * FROM books
WHERE EXISTS (SELECT 1 FROM authors WHERE id = books.author_id AND name LIKE 'asd%');

| QUERY PLAN                                                                                                     |
| -------------------------------------------------------------------------------------------------------------- |
| Hash Join  (cost=218.01..454.43 rows=56 width=40) (actual time=0.975..0.975 rows=0 loops=1)                    |
|   Hash Cond: (books.author_id = authors.id)                                                                    |
|   ->  Seq Scan on books  (cost=0.00..206.80 rows=11280 width=40) (actual time=0.010..0.010 rows=1 loops=1)     |
|   ->  Hash  (cost=217.35..217.35 rows=53 width=4) (actual time=0.943..0.943 rows=0 loops=1)                    |
|         Buckets: 1024  Batches: 1  Memory Usage: 8kB                                                           |
|         ->  Seq Scan on authors  (cost=0.00..217.35 rows=53 width=4) (actual time=0.942..0.943 rows=0 loops=1) |
|               Filter: ((name)::text ~~ 'asd%'::text)                                                           |
|               Rows Removed by Filter: 10000                                                                    |
| Planning Time: 0.361 ms                                                                                        |
| Execution Time: 1.022 ms                                                                                       |

vs. WHERE EXISTS(...) = TRUE

EXPLAIN ANALYZE
SELECT * FROM books
WHERE EXISTS (SELECT id FROM authors WHERE id = books.author_id AND name LIKE 'asd%') = True;

| QUERY PLAN                                                                                                           |
| -------------------------------------------------------------------------------------------------------------------- |
| Seq Scan on books  (cost=0.00..93887.20 rows=5640 width=40) (actual time=2.054..2.054 rows=0 loops=1)                |
|   Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)                                                              |
|   Rows Removed by Filter: 10000                                                                                      |
|   SubPlan 1                                                                                                          |
|     ->  Index Scan using authors_pkey on authors  (cost=0.29..8.30 rows=1 width=0) (never executed)                  |
|           Index Cond: (id = books.author_id)                                                                         |
|           Filter: ((name)::text ~~ 'asd%'::text)                                                                     |
|   SubPlan 2                                                                                                          |
|     ->  Seq Scan on authors authors_1  (cost=0.00..217.35 rows=53 width=4) (actual time=0.931..0.931 rows=0 loops=1) |
|           Filter: ((name)::text ~~ 'asd%'::text)                                                                     |
|           Rows Removed by Filter: 10000                                                                              |
| Planning Time: 0.298 ms                                                                                              |
| Execution Time: 2.129 ms                                                                                             |

Of particular interest is Hash Join vs. simple Seq Scan and the 2x time diff. Fiddle https://www.db-fiddle.com/f/a7kedUeFmCu4tWLFRoxMJA/2

Asked by scorpp (173 rep)

May 25, 2022, 03:10 PM
Last activity: May 29, 2022, 11:08 AM

EXISTS() vs EXISTS() = TRUE in Postgres

Related Questions