Sample Header Ad - 728x90

MariaDB uses "wrong"/irrelevant Keys on large table

1 vote
1 answer
521 views
We have migrated our database from a local hosting company to AWS. Now when we run a particular query on AWS it finishes in about a minute (processlist says "sending data") where on the old server the query ran for only 0.5 seconds. Server specs are about the same. EXPLAIN on both servers says that the queries use different indexes. Old server where the query runs fast, MariaDB 5.5.56: EXPLAIN SELECT COUNT(*) AS layer_impressions, DATE_FORMAT(i.created, "%H") AS the_date FROM layer_impression i INNER JOIN layer l ON l.id = i.layer_id WHERE i.created BETWEEN '2018-11-15' AND '2018-11-15 23:59:59' GROUP BY the_date; +------+-------------+-------+--------+------------------+---------+---------+--------------------+--------+--------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+------------------+---------+---------+--------------------+--------+--------------------------------------------------------+ | 1 | SIMPLE | i | range | layer_id,created | created | 8 | NULL | 249116 | Using index condition; Using temporary; Using filesort | | 1 | SIMPLE | l | eq_ref | PRIMARY | PRIMARY | 4 | getback.i.layer_id | 1 | Using index | +------+-------------+-------+--------+------------------+---------+---------+--------------------+--------+--------------------------------------------------------+ New server (AWS where the query runs slow), MariaDB 10.3.8: EXPLAIN SELECT COUNT(*) AS layer_impressions, DATE_FORMAT(i.created, "%H") AS the_date FROM layer_impression i INNER JOIN layer l ON l.id = i.layer_id WHERE i.created BETWEEN '2018-11-15' AND '2018-11-15 23:59:59' GROUP BY the_date; +------+-------------+-------+-------+------------------+------------+---------+--------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+------------------+------------+---------+--------------+------+----------------------------------------------+ | 1 | SIMPLE | l | index | PRIMARY | voucher_id | 4 | NULL | 652 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | i | ref | layer_id,created | layer_id | 4 | getback.l.id | 545 | Using where | +------+-------------+-------+-------+------------------+------------+---------+--------------+------+----------------------------------------------+ What surprises me is that "possible_keys" lists PRIMARY but uses "voucher_id", a column which has nothing to do with our query. If we add USE INDEX or make a STRAIGHT_JOIN the query runs fast and the correct indexes are being used: EXPLAIN SELECT COUNT(*) AS layer_impressions, DATE_FORMAT(i.created, "%H") AS the_date FROM layer_impression i STRAIGHT_JOIN layer l ON l.id = i.layer_id WHERE i.created BETWEEN '2018-11-15' AND '2018-11-15 23:59:59' GROUP BY the_date; +------+-------------+-------+--------+------------------+---------+---------+--------------------+--------+--------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+------------------+---------+---------+--------------------+--------+--------------------------------------------------------+ | 1 | SIMPLE | i | range | layer_id,created | created | 5 | NULL | 272230 | Using index condition; Using temporary; Using filesort | | 1 | SIMPLE | l | eq_ref | PRIMARY | PRIMARY | 4 | getback.i.layer_id | 1 | Using index | +------+-------------+-------+--------+------------------+---------+---------+--------------------+--------+--------------------------------------------------------+ What behaviour is that and how can we prevent this without modifying all our queries throughout our application? It's more a general question about how this can happen and less about this particular query. We tried so far: Switching it off and on again, OPTIMIZE and ANALYZE, FLUSH, setting extended_keys=off (in the optimizer_switch variable). Schema (omitted some additional columns): CREATE TABLE IF NOT EXISTS layer ( id int(10) unsigned NOT NULL AUTO_INCREMENT, shop_id int(10) unsigned NOT NULL, voucher_id int(10) unsigned NOT NULL, templ_id int(10) unsigned NOT NULL COMMENT 'Mailtemplate ID', doi_templ_id int(10) unsigned NOT NULL, key char(5) CHARACTER SET latin1 COLLATE latin1_general_cs NOT NULL, status enum('active','inactive','deleted') NOT NULL DEFAULT 'active', created datetime NOT NULL, createdby int(10) unsigned NOT NULL, modified timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(), PRIMARY KEY (id), KEY shop_id (shop_id,status) USING BTREE, KEY key (key), KEY voucher_id (voucher_id), KEY templ_id (templ_id), KEY doi_templ_id (doi_templ_id) ) ENGINE=InnoDB AUTO_INCREMENT=656 DEFAULT CHARSET=utf8; CREATE TABLE IF NOT EXISTS layer_impression ( id int(10) unsigned NOT NULL AUTO_INCREMENT, layer_id int(10) unsigned NOT NULL, session_id bigint(20) unsigned NOT NULL, device enum('desktop','mobile','tablet') NOT NULL DEFAULT 'desktop', created datetime NOT NULL, PRIMARY KEY (id), KEY layer_id (layer_id), KEY session_id (session_id), KEY created (created,device) USING BTREE, CONSTRAINT layer_impression_ibfk_1 FOREIGN KEY (layer_id) REFERENCES layer (id) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=34184266 DEFAULT CHARSET=utf8; layer has 652 rows, layer_impression has ~34M rows The cardinality on the layer_id index is also quite different: show index from layer_impression; +------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | layer_impression | 0 | PRIMARY | 1 | id | A | 34062000 | NULL | NULL | | BTREE | | | | layer_impression | 1 | layer_id | 1 | layer_id | A | 62499 | NULL | NULL | | BTREE | | | | layer_impression | 1 | session_id | 1 | session_id | A | 34062000 | NULL | NULL | | BTREE | | | | layer_impression | 1 | created | 1 | created | A | 34062000 | NULL | NULL | | BTREE | | | | layer_impression | 1 | created | 2 | device | A | 34062000 | NULL | NULL | | BTREE | | | +------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ On AWS (above) it's 62499 while on our old server it's 18: show index from layer_impression; +------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | layer_impression | 0 | PRIMARY | 1 | id | A | 34295989 | NULL | NULL | | BTREE | | | | layer_impression | 1 | layer_id | 1 | layer_id | A | 18 | NULL | NULL | | BTREE | | | | layer_impression | 1 | session_id | 1 | session_id | A | 34295989 | NULL | NULL | | BTREE | | | | layer_impression | 1 | created | 1 | created | A | 34295989 | NULL | NULL | | BTREE | | | | layer_impression | 1 | created | 2 | device | A | 34295989 | NULL | NULL | | BTREE | | | +------------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ Thanks for any advice! EDIT: Most of the time we have additional WHERE clauses, eg. WHERE l.shop_id = 1 # or WHERE l.shop_id IN (1,2,3) I left this part out for simplicity because I was more interested in why one server would handle it completely different than the other. Here's a typical query which uses the layer table. On AWS it runs for ~18sec, on the old server for ~2sec. # AWS EXPLAIN SELECT COUNT(*) AS layer_impressions, DATE_FORMAT(i.created, "%H") AS the_date FROM layer_impression i LEFT JOIN layer l ON l.id = i.layer_id WHERE i.created BETWEEN '2018-11-10' AND '2018-11-15 23:59:59' AND l.shop_id IN (42,36,103,63) GROUP BY the_date; +------+-------------+-------+-------+------------------+----------+---------+--------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+------------------+----------+---------+--------------+------+-----------------------------------------------------------+ | 1 | SIMPLE | l | range | PRIMARY,shop_id | shop_id | 4 | NULL | 11 | Using where; Using index; Using temporary; Using filesort | | 1 | SIMPLE | i | ref | layer_id,created | layer_id | 4 | getback.l.id | 545 | Using where | +------+-------------+-------+-------+------------------+----------+---------+--------------+------+-----------------------------------------------------------+ # Old server EXPLAIN SELECT COUNT(*) AS layer_impressions, DATE_FORMAT(i.created, "%H") AS the_date FROM layer_impression i LEFT JOIN layer l ON l.id = i.layer_id WHERE i.created BETWEEN '2018-11-10' AND '2018-11-15 23:59:59' AND l.shop_id IN (42,36,103,63) GROUP BY the_date; +------+-------------+-------+--------+------------------+---------+---------+--------------------+---------+--------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+--------+------------------+---------+---------+--------------------+---------+--------------------------------------------------------+ | 1 | SIMPLE | i | range | layer_id,created | created | 8 | NULL | 1615766 | Using index condition; Using temporary; Using filesort | | 1 | SIMPLE | l | eq_ref | PRIMARY,shop_id | PRIMARY | 4 | getback.i.layer_id | 1 | Using where | +------+-------------+-------+--------+------------------+---------+---------+--------------------+---------+--------------------------------------------------------+
Asked by J-F (11 rep)
Nov 21, 2018, 12:20 PM
Last activity: Apr 25, 2025, 11:03 PM