load text file into db using temp variable as place holder for complex column value
0
votes
1
answer
246
views
Assume I have a text file with tab terminated fields as below. Where value in row m, column n as vmn. Vmn is a string which could wrapped in double quota. Generally this string variable could have any readable char such as
'\'
or '"'
, such as
"Bob the "king""TAB"c:\path\to\file"TAB
Here TAB is tab char '\t'
They are not as
"Bob the \"king\""TAB"c:\\path\\to\\file"TAB
header1 header2 header3 ,,headerk,,,headern
v11 v12 v13 ,,v1k ,,,v1n
...
vm1 vm2 vm3 ,,vmk ,,,vmn
...
Now I also have a table as create as
|column1 |column2 |,,|columnv |,,,|columnw |
------------------------------------------------
t1 t2 ,,tv ,,,tw
w is not equal to n, w could larger than or smaller than n.
tk = function(vi1, vi2,...)
Let us look at a real sample of text file.
"Bob the "King"" 01/01/1010 "$100" "1000" "12/12/1090"
"Jenny the Queen" 11/11/1030 "$10.2" "100" "11/02/1070"
I use a slow way to load the data into table as
CREATE TABLE IF NOT EXISTS STATE
(
KeyID INT UNSIGNED auto_increment primary key,
Names VARCHAR(32),
DOBs VARCHAR(16),
Incomes VARCHAR(16),
Propertys VARCHAR(16),
Deaths VARCHAR(16),
Name VARCHAR(16),
Title VARCHAR(16),
Ages INT,
Income Double,
Property Double,
DOB Date,
Death Date,
Age INT,
Value Double
);
LOAD DATA LOCAL INFILE 'state.tsv' IGNORE INTO TABLE STATE
COLUMNS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(
Names,
DOBs,
Incomes,
Propertys,
Deaths,
)
SET
Name = SUBSTRING(Names, 0, POSITION("the" IN Names)),INSTR (ori_str, sub_str)
Title = CASE
WHEN INSTR (Names, "King") THEN "King"
WHEN INSTR (Names, "Queen") THEN "Queen"
ELSE "No Title"
END,
DOB = STR_TO_DATE(case when DOBs='' then null else DOBs end, '%m/%d/%Y' ),
Death = STR_TO_DATE(case when Deaths='' then null else Deaths end, '%m/%d/%Y' ),
Income = NULLIF(Incomes,''),
Property = NULLIF(Propertys,''),
Age = Death - DOB,
Value = MyFunction(Income, Property);
Then I run an alter sql to drop columns like Names, Titles,DOBs, Deaths, Incomes, Propertys.
Here MyFunction is a user function to calculate the value.
This method is not smart and slow. I would like to have temp variables to hold the strings, then convert them in to table columns. No extra columns to drop afterwards.
Any suggestion?
Asked by chans
(1 rep)
Aug 11, 2021, 12:36 AM
Last activity: May 23, 2025, 06:05 PM
Last activity: May 23, 2025, 06:05 PM