
Slow CREATE TABLE AS followed by SELECT (other solutions haven't worked)

1 vote
1 answer
117 views
I have a problem which is similar to https://dba.stackexchange.com/questions/165248/slow-create-table-from-subquery-using-select-inner-join but the solutions provided there have not made any difference for me.

I'm using MariaDB 15.1 and have tried this on 2 separate physical machines. One is a production web server (4 GHz, 8 core Xeon with 32 GB RAM and a production grade SSD). The other is my 2021 MacBook Pro (1.4 GHz 4 core Intel, 8 GB RAM, SSD). Query performance doesn't vary significantly between the two environments: in both, the queries are too slow for our application to be considered usable.

Our web application allows users to search for chemical substances which are mapped to 1 or more filters. I have eliminated the web application, which is built in PHP, as the bottleneck. All queries below have been run directly on the database.

If I run this query:

    SELECT s.id
    FROM substances s
    WHERE (
        exists ( SELECT null FROM filters_substances fs WHERE fs.substance_id = s.id AND fs.filter_id = 2676 )
        OR exists ( SELECT null FROM filters_substances fs WHERE fs.substance_id = s.id AND fs.filter_id = 2677 )
        OR exists ( SELECT null FROM filters_substances fs WHERE fs.substance_id = s.id AND fs.filter_id = 2678 )
    );

This takes approx 0.1 seconds to execute and returns 3 rows. This is fine and I wouldn't expect it to execute quicker than this.

We need to write the data returned by that SELECT to another table for the logged-in user of our application. We do this by wrapping the SQL above inside a CREATE TABLE AS statement:

    CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234 AS
    SELECT s.id FROM substances s WHERE ( ... );

This creates a table named "filtering_1234" in a separate database called tmpdata, which is always hosted on the same machine. The suffix, "_1234", is the logged-in user's ID. Although the query above executes successfully, it takes approx 3 - 4 seconds.
If more filters (fs.filter_id = ) or conditions are added it can take longer, to the point where the web application appears to be unusable.

The CREATE TABLE statement on its own executes in approx 0.1 seconds:

    CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234(id INT);
    Query OK, 0 rows affected (0.13 sec)

I tried adding the (id INT) part of the statement above to the full query:

    CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234(id INT) AS SELECT ( ... )

The id column in this case corresponds to the primary key of the substances table, from which the substances are selected as per SELECT s.id FROM substances s.

I also tried adding a separate auto-incrementing primary key, called pk, to the table, alongside the id column that stores the data returned by the SELECT statement:

    CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234(pk int NOT NULL AUTO_INCREMENT, id INT, PRIMARY KEY(pk))

Again this runs in a similar 3 - 4 seconds. In all cases we get the correct results, i.e. the same 3 substances are always written to filtering_1234.

How can I debug this further to find where the bottleneck is?
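One way the table creation could be separated from the write, offered as a sketch and not as something measured above: create the table first, run the SELECT as an INSERT ... SELECT, and use session profiling to see which stage accounts for the time. The table name filtering_1234_split is hypothetical (chosen so the existing table is left untouched), and the Query_ID passed to SHOW PROFILE is an assumption that should be read off the SHOW PROFILES output.

```sql
-- Turn on per-statement profiling for this session only.
SET profiling = 1;

-- Step 1: table creation on its own (hypothetical table name).
CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234_split (
    id MEDIUMINT UNSIGNED NOT NULL
);

-- Step 2: the same SELECT as in the question, now feeding an INSERT.
INSERT INTO tmpdata.filtering_1234_split (id)
SELECT s.id
FROM substances s
WHERE (
    EXISTS (SELECT NULL FROM filters_substances fs
            WHERE fs.substance_id = s.id AND fs.filter_id = 2676)
    OR EXISTS (SELECT NULL FROM filters_substances fs
               WHERE fs.substance_id = s.id AND fs.filter_id = 2677)
    OR EXISTS (SELECT NULL FROM filters_substances fs
               WHERE fs.substance_id = s.id AND fs.filter_id = 2678)
);

-- List the profiled statements with their total durations.
SHOW PROFILES;

-- Per-stage breakdown of one statement. The 2 is an assumed Query_ID;
-- use whichever ID SHOW PROFILES reports for the INSERT.
SHOW PROFILE FOR QUERY 2;

SET profiling = 0;
```

If the INSERT ... SELECT alone takes the 3 - 4 seconds, the cost would sit in how the SELECT is executed when it feeds a write; if both steps are fast, the slowdown would seem specific to the combined CREATE TABLE ... AS SELECT form.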
The structure of the tables the data comes from is:

    mariadb> DESCRIBE substances;
    +-------------+-----------------------+------+-----+---------+----------------+
    | Field       | Type                  | Null | Key | Default | Extra          |
    +-------------+-----------------------+------+-----+---------+----------------+
    | id          | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
    | app_id      | varchar(8)            | NO   | UNI | NULL    |                |
    | name        | varchar(1500)         | NO   |     | NULL    |                |
    | date        | date                  | NO   |     | NULL    |                |
    +-------------+-----------------------+------+-----+---------+----------------+
    4 rows in set (0.01 sec)

    mariadb> DESCRIBE filters_substances;
    +--------------+-----------------------+------+-----+---------+----------------+
    | Field        | Type                  | Null | Key | Default | Extra          |
    +--------------+-----------------------+------+-----+---------+----------------+
    | id           | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
    | substance_id | mediumint(8) unsigned | NO   | MUL | NULL    |                |
    | filter_id    | smallint(5) unsigned  | NO   | MUL | NULL    |                |
    +--------------+-----------------------+------+-----+---------+----------------+
    3 rows in set (0.00 sec)

    mariadb> SHOW INDEXES FROM substances;
    +------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
    | Table      | Non_unique | Key_name    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
    +------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
    | substances | 0          | PRIMARY     | 1            | id          | A         | 288989      | NULL     | NULL   |      | BTREE      |         |               | NO      |
    | substances | 0          | app_id      | 1            | app_id      | A         | 288989      | NULL     | NULL   |      | BTREE      |         |               | NO      |
    +------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
    2 rows in set (0.00 sec)

    mariadb> SHOW INDEXES from filters_substances;
    +--------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
    | Table              | Non_unique | Key_name     | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
    +--------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
    | filters_substances | 0          | PRIMARY      | 1            | id           | A         | 1626686     | NULL     | NULL   |      | BTREE      |         |               | NO      |
    | filters_substances | 1          | substance_id | 1            | substance_id | A         | 813343      | NULL     | NULL   |      | BTREE      |         |               | NO      |
    | filters_substances | 1          | filter_id    | 1            | filter_id    | A         | 70725       | NULL     | NULL   |      | BTREE      |         |               | NO      |
    +--------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
    3 rows in set (0.00 sec)
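With the index layout above in view, the plan chosen for the fast standalone SELECT can be captured as a baseline for comparison. This is only a sketch: ANALYZE SELECT assumes a MariaDB server of version 10.1 or later, which the question does not confirm, and the SHOW CREATE TABLE line simply reports which storage engine and column types the CREATE TABLE ... AS SELECT produced.

```sql
-- Estimated plan: shows which of the single-column indexes
-- (substance_id, filter_id) each EXISTS subquery is expected to use.
EXPLAIN
SELECT s.id
FROM substances s
WHERE (
    EXISTS (SELECT NULL FROM filters_substances fs
            WHERE fs.substance_id = s.id AND fs.filter_id = 2676)
    OR EXISTS (SELECT NULL FROM filters_substances fs
               WHERE fs.substance_id = s.id AND fs.filter_id = 2677)
    OR EXISTS (SELECT NULL FROM filters_substances fs
               WHERE fs.substance_id = s.id AND fs.filter_id = 2678)
);

-- Executes the query and reports actual row counts next to the estimates
-- (assumes MariaDB 10.1 or later).
ANALYZE
SELECT s.id
FROM substances s
WHERE (
    EXISTS (SELECT NULL FROM filters_substances fs
            WHERE fs.substance_id = s.id AND fs.filter_id = 2676)
    OR EXISTS (SELECT NULL FROM filters_substances fs
               WHERE fs.substance_id = s.id AND fs.filter_id = 2677)
    OR EXISTS (SELECT NULL FROM filters_substances fs
               WHERE fs.substance_id = s.id AND fs.filter_id = 2678)
);

-- Definition of the table that CREATE TABLE ... AS SELECT produced,
-- including its storage engine.
SHOW CREATE TABLE tmpdata.filtering_1234;
```

A large gap between the estimated and actual row counts in the ANALYZE output, or a full scan of substances (288989 rows), would be the kind of detail worth including alongside the timings.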
Asked by Andy (121 rep)
Oct 10, 2023, 09:21 AM
Last activity: Jan 10, 2024, 07:19 AM