I'm migrating data into a MySQL (5.7.26) database (32 GB RAM) running as a managed service on AWS. While importing the data, I need to map one of the CSV columns to another value using a MEMORY table lookup, so my LOAD DATA statement resembles the following:
LOAD DATA LOCAL INFILE 'file.csv'
INTO TABLE table_1 (col_1, @var1)
SET col_2 = (SELECT mapped_value FROM table_2 WHERE id = @var1);
table_2 is a two-column (id, mapped_value) MEMORY table with 3.4 million rows.
When I import the CSV *without* the subquery, I get several million inserts per minute. However, when I run the same import with the subquery, LOAD DATA performance degrades to near zero (~100 inserts per minute). Is this to be expected with a subquery, or am I doing something wrong in the example above?
Asked by Chad
(3 rep)
Jan 28, 2020, 03:26 AM
Last activity: Jan 28, 2020, 12:47 PM