Sample Header Ad - 728x90

When to use an array aggregate vs a lateral join array subquery in postgres

2 votes
1 answer
1096 views
**tl;dr: When to use array_agg over a lateral join with array subquery?** Context:
create table t_zipcode ( zipcode text primary key, location geography );
create table t_user ( id text primary key, zipcode text references t_zipcode);
I want to design a query that gets all 'nearby' zipcodes to each user, say in a range of 20 km. My first attempt was like this:
select u.id, z.zipcode, z.location, array_agg(zz.zipcode order by zz.zipcode) as nearby_zipcodes
from t_user u
         join t_zipcode z on u.zipcode = z.zipcode
         left join t_zipcode zz on st_dwithin(z.location, zz.location, 20000)
group by u.id, z.zipcode, z.location
order by u.id;
which returns results like this:
+------------------------------------+-------+--------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|id                                  |zipcode|location                                          |nearby_zipcodes                                                                                                                                                                                                                                                                                                                            |
+------------------------------------+-------+--------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|0233684d-d2d8-4fdc-863f-08c6aac79d92|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|0e59fd58-00a5-442e-a21e-16ab1f4d2f65|85226  |0101000020E6100000506EDBF7A8005CC016A243E048A04040|{85034,85040,85041,85042,85044,85045,85048,85121,85202,85224,85225,85226,85248,85282,85283,85284,85286,85339}                                                                                                                                                                                                                              |
|13f26eca-bda0-4b8f-bcff-c95cceee3421|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|2d69e6bd-e0f0-40aa-ba3a-0e157fff46ff|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|4bd8e913-7a75-4028-8254-faebf043f629|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|627897e0-595c-4d21-b1b1-8dd43f21dee7|85033  |0101000020E6100000DAC69FA86C0D5CC08E78B29B19BF4040|{85003,85004,85006,85007,85009,85012,85013,85014,85015,85016,85017,85019,85020,85021,85023,85029,85031,85033,85034,85035,85037,85040,85041,85043,85051,85053,85064,85301,85302,85303,85304,85305,85306,85307,85308,85309,85323,85335,85340,85345,85351,85353,85363,85378,85379,85381,85382,85392,85395}                                    |
|6353efe7-b0e0-42a2-b5fd-f38b7c8e27ad|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|6efad24b-10a7-4ac7-b91a-8842bca70c06|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|7f1fe1d1-e2cd-44b9-85f0-b3b888823a20|85305  |0101000020E6100000BF7D1D3867105CC0D2C6116BF1C34040|{85003,85004,85007,85009,85012,85013,85014,85015,85017,85019,85020,85021,85023,85029,85031,85033,85035,85037,85043,85051,85053,85080,85301,85302,85303,85304,85305,85306,85307,85308,85309,85323,85335,85340,85345,85351,85353,85355,85363,85373,85374,85375,85378,85379,85381,85382,85388,85392,85395}                                    |
|9c238c4a-712b-4b10-a91d-6c3548ae59bc|86001  |0101000020E6100000054F2157EAE85BC0D6AA5D13D2A44140|{86001,86011,86015}                                                                                                                                                                                                                                                                                                                        |
|ace88100-9ecd-4931-a617-b8d8b091470e|85007  |0101000020E6100000CB30EE06D1055CC0AA9D616A4BB94040|{85003,85004,85006,85007,85008,85009,85012,85013,85014,85015,85016,85017,85018,85019,85020,85021,85028,85029,85031,85033,85034,85035,85037,85040,85041,85042,85043,85044,85045,85048,85051,85064,85251,85253,85257,85281,85282,85283,85284,85301,85302,85303,85304,85305,85353}                                                            |
|bf96321b-be2e-4ef9-801e-9ee82ce693f4|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|d787537e-a3fa-4267-b313-71b2c07027fe|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|ea8d074d-adb1-4c45-bd02-b6b57f5a6604|41091  |0101000020E61000005B0A48FB1F2F55C063D009A183744340|{41005,41018,41030,41042,41051,41080,41091,41092,41094,41095,47020,47038,47040}                                                                                                                                                                                                                                                            |
|f925949e-6a5e-4e2c-9591-60f05f42ccb1|81001  |0101000020E6100000C4245CC823225AC025E7C41EDA254340|{81001,81003,81006,81008,81025}                                                                                                                                                                                                                                                                                                            |
|fc47120d-d3da-4847-8134-0eca87708467|98102  |0101000020E61000001C78B5DC99945EC09A7CB3CD8DD14740|{98004,98005,98006,98007,98008,98011,98020,98021,98028,98033,98034,98036,98039,98040,98043,98052,98056,98057,98072,98101,98102,98103,98104,98105,98106,98107,98108,98109,98110,98112,98115,98116,98117,98118,98119,98121,98122,98125,98126,98133,98134,98136,98144,98146,98154,98155,98164,98168,98174,98177,98178,98195,98199,98342,98353}|
+------------------------------------+-------+--------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
This is the analyzed plan with just 16 users AND using an index on the zipcode location column:
+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                                                                     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
|GroupAggregate  (cost=85365.87..86252.54 rows=35467 width=112) (actual time=1.932..2.474 rows=16 loops=1)                                                      |
|  Output: u.id, z.zipcode, z.location, array_agg(zz.zipcode ORDER BY zz.zipcode)                                                                               |
|  Group Key: u.id, z.zipcode, z.location                                                                                                                       |
|  ->  Sort  (cost=85365.87..85454.53 rows=35467 width=112) (actual time=1.887..1.913 rows=277 loops=1)                                                         |
|        Output: u.id, z.zipcode, z.location, zz.zipcode                                                                                                        |
|        Sort Key: u.id, z.zipcode, z.location                                                                                                                  |
|        Sort Method: quicksort  Memory: 53kB                                                                                                                   |
|        ->  Nested Loop Left Join  (cost=0.69..81507.59 rows=35467 width=112) (actual time=0.197..1.641 rows=277 loops=1)                                      |
|              Output: u.id, z.zipcode, z.location, zz.zipcode                                                                                                  |
|              ->  Nested Loop  (cost=0.29..719.73 rows=1070 width=80) (actual time=0.043..0.136 rows=16 loops=1)                                               |
|                    Output: u.id, z.zipcode, z.location                                                                                                        |
|                    Inner Unique: true                                                                                                                         |
|                    ->  Seq Scan on pg_temp.t_user u  (cost=0.00..20.70 rows=1070 width=48) (actual time=0.017..0.021 rows=16 loops=1)                         |
|                          Output: u.id, u.zipcode                                                                                                              |
|                    ->  Index Scan using test_idx_user_zip on pg_temp.t_zipcode z  (cost=0.29..0.65 rows=1 width=64) (actual time=0.006..0.006 rows=1 loops=16)|
|                          Output: z.zipcode, z.location                                                                                                        |
|                          Index Cond: (z.zipcode = u.zipcode)                                                                                                  |
|              ->  Index Scan using test_idx_zip_geo on pg_temp.t_zipcode zz  (cost=0.40..75.47 rows=3 width=64) (actual time=0.040..0.089 rows=17 loops=16)    |
|                    Output: zz.zipcode, zz.location                                                                                                            |
|                    Index Cond: (zz.location && _st_expand(z.location, '20000'::double precision))                                                             |
|                    Filter: st_dwithin(z.location, zz.location, '20000'::double precision, true)                                                               |
|                    Rows Removed by Filter: 8                                                                                                                  |
|Query Identifier: -2358650375080684395                                                                                                                         |
|Planning Time: 0.421 ms                                                                                                                                        |
|Execution Time: 2.649 ms                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
I then tried a similar solution using a lateral join, which yields the exact same results:
select u.id, z.zipcode, z.location, t.nearby_zipcodes
from t_user u
         join t_zipcode z on u.zipcode = z.zipcode
         cross join lateral (select array(select zz.zipcode
                                          from t_zipcode zz
                                          where st_dwithin(z.location, zz.location, 20000)
                                          order by zz.zipcode) as nearby_zipcodes
    ) t
order by u.id;
which to my surprise, has slightly better performance:
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                                                                   |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|Sort  (cost=85823.88..85826.55 rows=1070 width=112) (actual time=1.828..1.830 rows=16 loops=1)                                                               |
|  Output: u.id, z.zipcode, z.location, ((SubPlan 1))                                                                                                         |
|  Sort Key: u.id                                                                                                                                             |
|  Sort Method: quicksort  Memory: 30kB                                                                                                                       |
|  ->  Nested Loop  (cost=0.29..85770.04 rows=1070 width=112) (actual time=0.319..1.809 rows=16 loops=1)                                                      |
|        Output: u.id, z.zipcode, z.location, (SubPlan 1)                                                                                                     |
|        Inner Unique: true                                                                                                                                   |
|        ->  Seq Scan on pg_temp.t_user u  (cost=0.00..20.70 rows=1070 width=48) (actual time=0.012..0.016 rows=16 loops=1)                                   |
|              Output: u.id, u.zipcode                                                                                                                        |
|        ->  Index Scan using test_idx_user_zip on pg_temp.t_zipcode z  (cost=0.29..0.65 rows=1 width=64) (actual time=0.006..0.006 rows=1 loops=16)          |
|              Output: z.zipcode, z.location                                                                                                                  |
|              Index Cond: (z.zipcode = u.zipcode)                                                                                                            |
|        SubPlan 1                                                                                                                                            |
|          ->  Sort  (cost=79.48..79.49 rows=3 width=32) (actual time=0.101..0.102 rows=17 loops=16)                                                          |
|                Output: zz.zipcode                                                                                                                           |
|                Sort Key: zz.zipcode                                                                                                                         |
|                Sort Method: quicksort  Memory: 25kB                                                                                                         |
|                ->  Index Scan using test_idx_zip_geo on pg_temp.t_zipcode zz  (cost=0.40..79.45 rows=3 width=32) (actual time=0.032..0.080 rows=17 loops=16)|
|                      Output: zz.zipcode                                                                                                                     |
|                      Index Cond: (zz.location && _st_expand(z.location, '20000'::double precision))                                                         |
|                      Filter: st_dwithin(z.location, zz.location, '20000'::double precision, true)                                                           |
|                      Rows Removed by Filter: 8                                                                                                              |
|Query Identifier: 8776498431209157273                                                                                                                        |
|Planning Time: 0.257 ms                                                                                                                                      |
|Execution Time: 1.892 ms                                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
I was under the impression that lateral joins should be avoided in general when possible, is there maybe a demerit I'm failing to notice?
Asked by ed__ (23 rep)
Dec 28, 2023, 10:46 PM
Last activity: Dec 31, 2023, 02:51 AM