
MySQL 8.0.41 - 83 % Waiting for row lock on AO_319474_QUEUE

-1 votes
1 answer
140 views
**Environment**

We run three Jira Service Management DC clusters that all share the same topology:

* MySQL 8.0.36 on Ubuntu 22.04 – 32 vCPU, 100 GB RAM, buffer pool now 80 GB (only ~33 GB in use), NVMe storage.
* Each Jira node holds a 50-connection pool (3 nodes × 50 = 150 sessions per cluster).
* Workload is steady at 500 TPS.

A week ago we removed an unrelated update bottleneck and suddenly discovered a hot-row issue on a Jira table.

- 83 % of total wait time is Waiting for row lock (Percona PMM, 6-hour window).
- Slow log is 90 % UPDATE AO_319474_QUEUE … WHERE ID = ?.
- SHOW ENGINE INNODB STATUS always shows 50-70 transactions waiting on the same PK row.
- CPU, disk I/O, redo, buffer-pool, latches - all

The hot row:

    mysql> SELECT ID, NAME
        -> FROM AO_319474_QUEUE
        -> WHERE ID = 11459545;
    +----------+---------------------------------------------+
    | ID       | NAME                                        |
    +----------+---------------------------------------------+
    | 11459545 | servicedesk.base.internal.processing.master |
    +----------+---------------------------------------------+

**Query details**

    mysql> explain SELECT ID, NAME
        -> FROM AO_319474_QUEUE
        -> WHERE ID = 11459545;
    +----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
    | id | select_type | table           | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
    +----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
    |  1 | SIMPLE      | AO_319474_QUEUE | NULL       | const | PRIMARY       | PRIMARY | 8       | const |    1 |   100.00 | NULL  |
    +----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
    1 row in set, 1 warning (0.01 sec)

**Update Query**

    update AO_319474_QUEUE
    set CLAIMANT_TIME = 1748437310646
    where AO_319474_QUEUE.ID = 11459545
      and (AO_319474_QUEUE.CLAIMANT = 'c.a.s.plugins.base.internal.events.runner.async.XXX'
           and AO_319474_QUEUE.CLAIMANT_TIME >= 1748437010646)
**Update Query Details**

    mysql> explain update AO_319474_QUEUE
        -> set CLAIMANT_TIME = 1748437310373 where AO_319474_QUEUE.ID = 11459545 and (AO_319474_QUEUE.CLAIMANT = 'c.a.s.plugins.base.internal.events.runner.async.XXX' and AO_319474_QUEUE.CLAIMANT_TIME >= 1748437010373)
        -> ;
    +----+-------------+-----------------+------------+-------+----------------------------------------+---------+---------+-------+------+----------+-------------+
    | id | select_type | table           | partitions | type  | possible_keys                          | key     | key_len | ref   | rows | filtered | Extra       |
    +----+-------------+-----------------+------------+-------+----------------------------------------+---------+---------+-------+------+----------+-------------+
    |  1 | UPDATE      | AO_319474_QUEUE | NULL       | range | PRIMARY,index_ao_319474_queue_claimant | PRIMARY | 8       | const |    1 |   100.00 | Using where |
    +----+-------------+-----------------+------------+-------+----------------------------------------+---------+---------+-------+------+----------+-------------+

Warnings

    mysql> SHOW WARNINGS LIMIT 10;
    Level:   Note
    Code:    1003
    Message: update scr_518f3268.AO_319474_QUEUE
             set scr_518f3268.AO_319474_QUEUE.CLAIMANT_TIME = 1748437310373
             where ((scr_518f3268.AO_319474_QUEUE.CLAIMANT = 'c.a.s.plugins.base.internal.events.runner.async.XXX')
               and (scr_518f3268.AO_319474_QUEUE.ID = 11459545)
               and (scr_518f3268.AO_319474_QUEUE.CLAIMANT_TIME >= 1748437010373))
    1 row in set (0.01 sec)

So the DB is idle, but everybody’s queuing on one hot row. Mainly we were looking at stats from SHOW ENGINE INNODB STATUS\G
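
A possible complement to the status dump: MySQL 8.0 exposes current lock waits, one row per waiter, through the `sys.innodb_lock_waits` view. The following is a minimal sketch, assuming the `sys` schema is installed and the Performance Schema is enabled (both are the defaults in 8.0); only the table name is taken from this post, and the `LIMIT` is arbitrary.

```sql
-- Sketch: who is waiting on whom for AO_319474_QUEUE right now.
-- Assumes the sys schema and performance_schema are available (MySQL 8.0 defaults).
SELECT wait_age_secs,               -- how long this waiter has been blocked
       locked_index,                -- expected to be PRIMARY for the hot row
       locked_type,                 -- RECORD for row locks
       waiting_pid,                 -- processlist id of the blocked session
       waiting_lock_mode,
       blocking_pid,                -- processlist id of the session holding the lock
       blocking_lock_mode,
       blocking_trx_age,            -- how long the blocking transaction has been open
       blocking_trx_rows_modified,
       blocking_query               -- statement the blocker is running, if any
FROM sys.innodb_lock_waits
WHERE locked_table_name = 'AO_319474_QUEUE'
ORDER BY wait_age_secs DESC
LIMIT 20;
```

If `blocking_query` comes back NULL, that usually means the blocking session is not executing a statement at that moment, i.e. the row lock is being held by a transaction that is open but between statements or waiting to commit.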
**Already tried / confirmed**

- Increased innodb_buffer_pool_size from 16 G → 80 G; buffer-pool hit rate stays 1000/1000, so I/O really is not the issue.
- No I/O or redo backlog.
- Notified Atlassian support; they’re still digging on the AO side.

**Indexes**

Index Summary

    mysql> SELECT
        ->   INDEX_NAME,
        ->   COLUMN_NAME,
        ->   SEQ_IN_INDEX,
        ->   NON_UNIQUE
        -> FROM INFORMATION_SCHEMA.STATISTICS
        -> WHERE TABLE_SCHEMA = 'scr_518f3268'
        ->   AND TABLE_NAME = 'AO_319474_QUEUE'
        -> ORDER BY INDEX_NAME, SEQ_IN_INDEX;
    +--------------------------------+-------------+--------------+------------+
    | INDEX_NAME                     | COLUMN_NAME | SEQ_IN_INDEX | NON_UNIQUE |
    +--------------------------------+-------------+--------------+------------+
    | index_ao_319474_queue_claimant | CLAIMANT    |            1 |          1 |
    | index_ao_319474_queue_topic    | TOPIC       |            1 |          1 |
    | PRIMARY                        | ID          |            1 |          0 |
    | U_AO_319474_QUEUE_NAME         | NAME        |            1 |          0 |
    +--------------------------------+-------------+--------------+------------+
    4 rows in set (0.00 sec)

Analyse Table

    mysql> ANALYZE TABLE AO_319474_QUEUE;
    +------------------------------+---------+----------+----------+
    | Table                        | Op      | Msg_type | Msg_text |
    +------------------------------+---------+----------+----------+
    | scr_518f3268.AO_319474_QUEUE | analyze | status   | OK       |
    +------------------------------+---------+----------+----------+
    1 row in set (0.00 sec)

**Create table info (SHOW CREATE)**

    mysql> SHOW CREATE TABLE AO_319474_QUEUE\G
    *************************** 1. row ***************************
           Table: AO_319474_QUEUE
    Create Table: CREATE TABLE AO_319474_QUEUE (
      CLAIMANT varchar(127) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
      CLAIMANT_TIME bigint DEFAULT NULL,
      CREATED_TIME bigint NOT NULL,
      ID bigint NOT NULL AUTO_INCREMENT,
      MESSAGE_COUNT bigint NOT NULL,
      MODIFIED_TIME bigint NOT NULL,
      NAME varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
      PURPOSE varchar(450) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
      TOPIC varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
      PRIMARY KEY (ID),
      UNIQUE KEY U_AO_319474_QUEUE_NAME (NAME),
      KEY index_ao_319474_queue_topic (TOPIC),
      KEY index_ao_319474_queue_claimant (CLAIMANT)
    ) ENGINE=InnoDB AUTO_INCREMENT=12485939 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
    1 row in set (0.01 sec)

**Questions**

Mainly interested in ways to resolve or further troubleshoot this.

- Are there any known MySQL 8.0.41+ bugs that make single-row updates stall longer than expected?
- Any ideas on how to mitigate the problem? We can't change the query, as it comes from the product (Atlassian Jira Service Management / Data Center).
- Any low-hanging fruit we can try for a quick fix?
- If it's a virtualisation issue, what do we need to ask / capture to discuss with our provider in the DC?

I’m a Java engineer, not a full-time DBA - feel free to point out obvious RTFM gaps.

Attachments - Percona InnoDB Buffer Pool InnoDB Locking Percona Query Analytics InnoDB Log IO
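
On the "what to capture" question, one option is to sample the open transactions and the cumulative row-lock counters while the stall is happening. This is only a sketch, assuming the account has the PROCESS privilege needed to read `information_schema.INNODB_TRX`; the `LIMIT` and ordering are arbitrary choices, not anything Jira or Percona prescribes.

```sql
-- Sketch: oldest open InnoDB transactions and what they hold.
-- Requires the PROCESS privilege.
SELECT trx_mysql_thread_id,                              -- processlist id of the session
       trx_state,                                        -- RUNNING, LOCK WAIT, ...
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS trx_age_secs,
       trx_rows_locked,
       trx_rows_modified,
       trx_query                                         -- NULL when no statement is executing
FROM information_schema.INNODB_TRX
ORDER BY trx_started
LIMIT 20;

-- Server-wide row-lock counters; times are reported in milliseconds.
SHOW GLOBAL STATUS LIKE 'Innodb_row_lock%';
```

Taking two samples of the status counters a known interval apart gives the number of row-lock waits and the time spent in them over that interval, which may be easier to hand to a provider than a 6-hour PMM aggregate.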
Asked by user340962 (1 rep)
May 27, 2025, 09:00 AM
Last activity: Jun 24, 2025, 09:07 PM