Optimize UPDATE query for single "next" row
1
vote
1
answer
302
views
We had information from new relic that one our query has a bad performance (almost 16 seconds for execution). We use PostgreSQL 11.8 on Debian Linux.
Table:
postgres=# SELECT table_name, column_name, data_type, character_maximum_length, character_octet_length, numeric_precision, numeric_precision_radix, numeric_scale, is_nullable, is_updatable FROM information_schema.columns WHERE table_name = 'store_codeinventory';
table_name | column_name | data_type | character_maximum_length | character_octet_length | numeric_precision | numeric_precision_radix | numeric_scale | is_nullable | is_updatable
---------------------+-------------------+--------------------------+--------------------------+------------------------+-------------------+-------------------------+---------------+-------------+--------------
store_codeinventory | id | integer | | | 32 | 2 | 0 | NO | YES
store_codeinventory | code | character varying | 255 | 1020 | | | | NO | YES
store_codeinventory | limit_days | integer | | | 32 | 2 | 0 | NO | YES
store_codeinventory | deactivation_date | timestamp with time zone | | | | | | YES | YES
store_codeinventory | cost | numeric | | | 8 | 10 | 2 | NO | YES
store_codeinventory | price | numeric | | | 8 | 10 | 2 | NO | YES
store_codeinventory | created_date | timestamp with time zone | | | | | | YES | YES
store_codeinventory | claimed_date | timestamp with time zone | | | | | | YES | YES
store_codeinventory | is_active | boolean | | | | | | NO | YES
store_codeinventory | book_id | integer | | | 32 | 2 | 0 | YES | YES
store_codeinventory | code_import_id | integer | | | 32 | 2 | 0 | NO | YES
store_codeinventory | creator_id | integer | | | 32 | 2 | 0 | NO | YES
store_codeinventory | inv_id | integer | | | 32 | 2 | 0 | YES | YES
store_codeinventory | label_id | integer | | | 32 | 2 | 0 | YES | YES
store_codeinventory | recipient_id | integer | | | 32 | 2 | 0 | YES | YES
store_codeinventory | purchase_id | integer | | | 32 | 2 | 0 | YES | YES
(16 rows)
I don't have an access to production, but I tried to fill in local db and understand the problem. Query:
UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
where id = ANY((select array(select id from store_codeinventory
where recipient_id is NULL and inv_id = 72 and is_active=true
ORDER BY ID ASC LIMIT 1 FOR UPDATE)) ::integer[]) and recipient_id is NULL;
Query plan:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on store_codeinventory (cost=0.64..47.43 rows=10 width=92) (actual time=0.291..0.291 rows=0 loops=1)
InitPlan 2 (returns $3)
-> Result (cost=0.34..0.35 rows=1 width=32) (actual time=0.060..0.060 rows=1 loops=1)
InitPlan 1 (returns $2)
-> Limit (cost=0.29..0.34 rows=1 width=10) (actual time=0.058..0.058 rows=1 loops=1)
-> LockRows (cost=0.29..1031.77 rows=19963 width=10) (actual time=0.057..0.057 rows=1 loops=1)
-> Index Scan using store_codeinventory_pkey on store_codeinventory store_codeinventory_1 (cost=0.29..832.14 rows=19963 width=10) (actual time=0.053..0.053 rows=1 loops=1)
Filter: ((recipient_id IS NULL) AND is_active AND (inv_id = 72))
Rows Removed by Filter: 94
-> Index Scan using store_codeinventory_pkey on store_codeinventory (cost=0.29..47.08 rows=10 width=92) (actual time=0.065..0.066 rows=1 loops=1)
Index Cond: (id = ANY ($3))
Filter: (recipient_id IS NULL)
Planning Time: 0.365 ms
Execution Time: 0.327 ms
(14 rows)
I am not so good at writing sql. Could someone give me advice where a bottleneck is? As I understood there might be two places: LockRows and Index Scan, right? I think that the first one was caused FOR UPDATE
and the second one - select where ORDER BY ID ASC LIMIT
? Is there a way to optimize them?
UPD. information about indexes(I removed columns which didn't use in query):
postgres=# SELECT indexname, indexdef FROM pg_indexes WHERE schemaname = 'public' and tablename = 'store_codeinventory' AND (indexdef LIKE '%inv_id%' OR indexdef LIKE '%pkey%' OR indexdef LIKE '%recipient_id%');
indexname | indexdef
----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------
store_codeinventory_inv_id_recipient_id_is_active_822b92e5_idx | CREATE INDEX store_codeinventory_inv_id_recipient_id_is_active_822b92e5_idx ON public.store_codeinventory USING btree (inv_id, recipient_id, is_active)
store_codeinventory_recipient_id_e32fbb18 | CREATE INDEX store_codeinventory_recipient_id_e32fbb18 ON public.store_codeinventory USING btree (recipient_id)
store_codeinventory_inv_id_21e20eb7 | CREATE INDEX store_codeinventory_inv_id_21e20eb7 ON public.store_codeinventory USING btree (inv_id)
store_codeinventory_pkey | CREATE UNIQUE INDEX store_codeinventory_pkey ON public.store_codeinventory USING btree (id)
(4 rows)
*UPD2* I am not so good at sql, but I decided to analyse that query and has concluded that we can simplify it. What you think about this variant:
UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
where id = (select id from store_codeinventory
where recipient_id is NULL and inv_id = 72 and is_active=true
ORDER BY ID ASC LIMIT 1 FOR UPDATE) and recipient_id is NULL;
I thought that we don't need the array because we will get only one or null in sub-select. Then I thought that LIMIT 1
for ordered by ID ASC
it is the same to get minimal value of id. What if we write:
UPDATE store_codeinventory set recipient_id = 1168, claimed_date = NOW()
where id = (select MIN(id) from store_codeinventory
where recipient_id is NULL and inv_id = 72 and is_active=true);
Is it the same or not?
Asked by Pavel Mikhadziuk
(13 rep)
Aug 12, 2021, 12:22 PM
Last activity: Aug 13, 2021, 04:03 PM
Last activity: Aug 13, 2021, 04:03 PM