Slow CREATE TABLE AS followed by SELECT (other solutions haven't worked)
I have a problem which is similar to https://dba.stackexchange.com/questions/165248/slow-create-table-from-subquery-using-select-inner-join but the solutions provided there have not made any difference for me.
I'm using MariaDB 15.1 and have tried this on two separate physical machines. One is a production web server (4 GHz, 8 core Xeon with 32 GB RAM and a production grade SSD). The other is my 2021 MacBook Pro (1.4 GHz 4 core Intel, 8 GB RAM, SSD). Query performance does not vary significantly between the two environments: in both, the queries are too slow for our application to be considered usable.
Our web application allows users to search for chemical substances which are mapped to 1 or more filters. I have ruled out the web application (built in PHP) as the bottleneck. All queries below have been run directly on the database.
If I run this query:
SELECT s.id
FROM substances s
WHERE ( exists ( SELECT null
                 FROM filters_substances fs
                 WHERE fs.substance_id = s.id
                   AND fs.filter_id = 2676
               )
     OR exists ( SELECT null
                 FROM filters_substances fs
                 WHERE fs.substance_id = s.id
                   AND fs.filter_id = 2677
               )
     OR exists ( SELECT null
                 FROM filters_substances fs
                 WHERE fs.substance_id = s.id
                   AND fs.filter_id = 2678
               )
      );
This takes approx 0.1 seconds to execute and returns 3 rows. This is fine and I wouldn't expect it to execute quicker than this.
We need to write the data returned by that SELECT to another table for the logged-in user of our application. We do this by wrapping the SQL above inside a CREATE TABLE AS statement:
CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234 AS
SELECT s.id FROM substances s WHERE
(
...
);
This creates a table named "filtering_1234" in a separate database called tmpdata (always hosted on the same machine). The suffix, "_1234", is the logged-in user's ID.
Although the query above executes successfully, it takes approx 3 - 4 seconds. If more filters (fs.filter_id = ) or conditions are added, it can take longer, to the point where the web application appears to be unusable.
The CREATE TABLE statement *on its own* executes in approx 0.1 seconds:
CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234(id INT);
Query OK, 0 rows affected (0.13 sec)
I tried adding the (id INT) part of the statement above to the full query:
CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234(id INT) AS
SELECT ( ... )
The id column in this case corresponds to the primary key of the substances table, which is where the substances are selected from, as per SELECT s.id FROM substances s.
I tried adding a separate auto-incrementing primary key, called pk, to the table, as well as the id column for storing the data returned by the SELECT statement:
CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234(pk int NOT NULL AUTO_INCREMENT, id INT, PRIMARY KEY(pk))
Again this runs in a similar 3 - 4 seconds.
In all cases we get the correct results, i.e. the same 3 substances are always written to filtering_1234.
How can I debug this further to find where the bottleneck is?
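One way to narrow this down is to separate table creation from row insertion, then compare the optimizer's plan for the bare SELECT with its plan for the same SELECT feeding an INSERT. The following is only a sketch. It assumes the session may create tmpdata.filtering_1234 freely, and that the MariaDB version in use supports EXPLAIN on INSERT ... SELECT and session profiling. The query ID passed to SHOW PROFILE is a placeholder to be read from the SHOW PROFILES output.

```sql
-- Step 1: create the empty table on its own (measured at roughly 0.1 s above).
CREATE TABLE IF NOT EXISTS tmpdata.filtering_1234 (id INT);

-- Step 2: plan for the bare SELECT (the fast case).
EXPLAIN
SELECT s.id
FROM substances s
WHERE EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2676)
   OR EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2677)
   OR EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2678);

-- Step 3: plan for the same SELECT when it feeds an INSERT.
-- EXPLAIN does not execute the INSERT; it only reports the plan.
EXPLAIN
INSERT INTO tmpdata.filtering_1234 (id)
SELECT s.id
FROM substances s
WHERE EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2676)
   OR EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2677)
   OR EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2678);

-- Step 4: run the INSERT for real with profiling on to get per-stage timings.
SET profiling = 1;

INSERT INTO tmpdata.filtering_1234 (id)
SELECT s.id
FROM substances s
WHERE EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2676)
   OR EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2677)
   OR EXISTS (SELECT NULL FROM filters_substances fs
              WHERE fs.substance_id = s.id AND fs.filter_id = 2678);

SHOW PROFILES;             -- lists recent statements with their Query_ID
SHOW PROFILE FOR QUERY 1;  -- placeholder ID: replace 1 with the INSERT's Query_ID

SET profiling = 0;
```

If the two plans differ (for example, a full scan of substances in one and index lookups in the other), that difference is a likely place to look. If they match, the per-stage timings from SHOW PROFILE may indicate where the time goes instead.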
The structure of the tables the data comes from is:
mariadb> DESCRIBE substances;
+-------------+-----------------------+------+-----+---------+----------------+
| Field       | Type                  | Null | Key | Default | Extra          |
+-------------+-----------------------+------+-----+---------+----------------+
| id          | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
| app_id      | varchar(8)            | NO   | UNI | NULL    |                |
| name        | varchar(1500)         | NO   |     | NULL    |                |
| date        | date                  | NO   |     | NULL    |                |
+-------------+-----------------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mariadb> DESCRIBE filters_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field        | Type                  | Null | Key | Default | Extra          |
+--------------+-----------------------+------+-----+---------+----------------+
| id           | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
| substance_id | mediumint(8) unsigned | NO   | MUL | NULL    |                |
| filter_id    | smallint(5) unsigned  | NO   | MUL | NULL    |                |
+--------------+-----------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
mariadb> SHOW INDEXES FROM substances;
+------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table      | Non_unique | Key_name    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| substances |          0 | PRIMARY     |            1 | id          | A         |      288989 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| substances |          0 | app_id      |            1 | app_id      | A         |      288989 |     NULL | NULL   |      | BTREE      |         |               | NO      |
+------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
2 rows in set (0.00 sec)
mariadb> SHOW INDEXES from filters_substances;
+--------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table              | Non_unique | Key_name     | Seq_in_index | Column_name  | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+--------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| filters_substances |          0 | PRIMARY      |            1 | id           | A         |     1626686 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| filters_substances |          1 | substance_id |            1 | substance_id | A         |      813343 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| filters_substances |          1 | filter_id    |            1 | filter_id    | A         |       70725 |     NULL | NULL   |      | BTREE      |         |               | NO      |
+--------------------+------------+--------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
3 rows in set (0.00 sec)
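For anyone wanting to reproduce this, the DESCRIBE and SHOW INDEXES output above corresponds roughly to the table definitions below. This is a reconstruction, not the original DDL: the storage engine, character set and any columns not shown in the output are unknown and are left at server defaults here.

```sql
-- Reconstructed from the DESCRIBE / SHOW INDEXES output; engine and charset are not known.
CREATE TABLE substances (
    id     MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
    app_id VARCHAR(8)            NOT NULL,
    name   VARCHAR(1500)         NOT NULL,
    date   DATE                  NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY app_id (app_id)
);

CREATE TABLE filters_substances (
    id           MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
    substance_id MEDIUMINT(8) UNSIGNED NOT NULL,
    filter_id    SMALLINT(5) UNSIGNED  NOT NULL,
    PRIMARY KEY (id),
    KEY substance_id (substance_id),  -- single-column index
    KEY filter_id (filter_id)         -- single-column index
);
```

Both secondary indexes on filters_substances are single-column. The output shows no composite index covering substance_id and filter_id together.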
Asked by Andy
(121 rep)
Oct 10, 2023, 09:21 AM
Last activity: Jan 10, 2024, 07:19 AM