
INSERTing millions of files into a table

0 votes
2 answers
76 views
I archive millions of XML files (1-100 MB each) in a table with the following structure:

CREATE TABLE Data (
  ID int(11) unsigned NOT NULL,
  XML longtext COMPRESSED,
  PRIMARY KEY (ID)
) ENGINE=Aria DEFAULT CHARSET=utf8 COLLATE utf8_general_ci ROW_FORMAT=DYNAMIC;

INSERT INTO Data (ID, XML) VALUES ($id, LOAD_FILE('file.xml'));

The process is slow, around 2-5 inserts per second. The entire database would be too large for an SSD drive, so I create the database on a separate HDD, but I move the source files in batches to an SSD drive to make reading them faster. Note that disk speed is not the rate-determining step, since the XML data shrink enormously under compression.

I tried InnoDB to gain concurrent inserts, but the InnoDB .ibd file is three times larger than Aria/MyISAM, and InnoDB is much slower on an HDD.

I tried RocksDB, but its tables cannot be created on a separate disk, as there is a single directory for all tables. Also, RocksDB's memory management is terrible for this kind of workload (or I could not find the proper configuration).

I did not try the ARCHIVE engine, since it requires IDs to be inserted in order.

My current solution is to INSERT concurrently into a temporary InnoDB table on the SSD and then INSERT INTO ... SELECT from the InnoDB table into the Aria table on the HDD. The problem is maintaining integrity and the delay while emptying the InnoDB table before the concurrent INSERT process can restart.

I appreciate any possible solution.
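For reference, the staging-table flush described above could look roughly like this. It is only a minimal sketch: the staging table name Data_staging, its layout, and the boundary-based flush are assumptions for illustration, not part of the original setup.

-- Hypothetical staging table on the SSD (name and layout are assumptions).
CREATE TABLE Data_staging (
  ID int(11) unsigned NOT NULL,
  XML longtext,
  PRIMARY KEY (ID)
) ENGINE=InnoDB;

-- Periodic flush from the staging table to the Aria archive table on the HDD.
-- Capturing a fixed ID boundary first, then copying and deleting up to that
-- boundary, keeps the two tables consistent even while new rows keep arriving.
SET @boundary = (SELECT MAX(ID) FROM Data_staging);

INSERT INTO Data (ID, XML)
SELECT ID, XML FROM Data_staging WHERE ID <= @boundary;

DELETE FROM Data_staging WHERE ID <= @boundary;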
Asked by Googlebot (4551 rep)
Nov 6, 2023, 02:12 PM
Last activity: Nov 10, 2023, 05:05 PM