
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes
2 answers
9152 views
Load CSV file to MySQL using LOAD DATA LOCAL INFILE
I need a simple way to load a csv file from my computer to the database (which is on a remote server). I'm trying to load the file using a php page that executes this code:
$file = $_FILES['csv']['tmp_name']; 
    $handle = fopen($file,"r"); 
    $name = $_FILES['csv']['name'];

   $import = "LOAD DATA LOCAL INFILE '" . $name .
                "' INTO TABLE temporal_load
                  FIELDS TERMINATED BY ','  
                  optionally ENCLOSED BY '\"' 
                  LINES TERMINATED BY '\\n' 
                  IGNORE 1 LINES
                  (num,ticker,company,sector,industry) ";

mysql_query($import) or die(mysql_error());
But when I select a file on the PHP page and try to load it, it shows the error message: File 'test11-14-15.csv' not found (Errcode: 2). I already verified that the variable **mysql.allow_local_infile** is ON, and the database connection was made this way:
mysql_connect(HOST,USER,PASSWORD,false,128);
The file is not on the remote server; the file is on my computer, but it should work anyway because I'm using LOCAL. What am I doing wrong?
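For context, a hedged sketch (not from the post, and the temp path below is invented): with LOCAL, the file is read by the machine running the MySQL client, which here is the web server hosting the PHP page, so the statement would need to reference the uploaded temp file rather than the original client-side filename.

-- Hypothetical temp path: with LOCAL the client (the web server running PHP) reads the
-- file from its own filesystem, so the statement names the uploaded temp file
-- ($_FILES['csv']['tmp_name']) instead of the original filename the browser reported.
LOAD DATA LOCAL INFILE '/tmp/phpAbC123'
INTO TABLE temporal_load
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(num, ticker, company, sector, industry);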
Claudia Sanchez (21 rep)
Nov 23, 2015, 07:04 PM • Last activity: Apr 11, 2025, 03:04 AM
2 votes
1 answer
327 views
Loading a CSV into MariaDB errors out with "invalid date format"
I am trying to load a CSV into MariaDB and am having issues transforming a date. Here is an example file and two tables (emp, where dob is declared as a date column, and emp_varchar, where dob is declared as varchar). What am I doing wrong here?

Jane@dbserver1:~$ cat empdata.csv
100,John ,02/14/1955
200,Jane ,08/22/1980

MariaDB [emptest]> LOAD DATA LOCAL INFILE 'empdata.csv'
    -> INTO TABLE emp
    -> FIELDS TERMINATED BY ','
    -> OPTIONALLY ENCLOSED BY '\"'
    -> LINES TERMINATED BY '\n'
    -> IGNORE 0 ROWS
    -> (
    -> eid,
    -> ename,
    -> dob
    -> )
    -> SET dob = STR_TO_DATE(dob, '%m/%d/%Y');
Query OK, 2 rows affected, 4 warnings (0.001 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 4

MariaDB [emptest]> show warnings;
+---------+------+-----------------------------------------------------------------+
| Level   | Code | Message                                                         |
+---------+------+-----------------------------------------------------------------+
| Warning | 1265 | Data truncated for column 'dob' at row 1                        |
| Warning | 1411 | Incorrect datetime value: '0000-00-00' for function str_to_date |
| Warning | 1265 | Data truncated for column 'dob' at row 2                        |
| Warning | 1411 | Incorrect datetime value: '0000-00-00' for function str_to_date |
+---------+------+-----------------------------------------------------------------+
4 rows in set (0.000 sec)

MariaDB [emptest]> select * from emp;
+------+-----------+------+
| eid  | ename     | dob  |
+------+-----------+------+
|  100 | John      | NULL |
|  200 | Jane      | NULL |
+------+-----------+------+
2 rows in set (0.000 sec)

MariaDB [emptest]> LOAD DATA LOCAL INFILE 'empdata.csv'
    -> INTO TABLE emp_varchar
    -> FIELDS TERMINATED BY ','
    -> OPTIONALLY ENCLOSED BY '\"'
    -> LINES TERMINATED BY '\n'
    -> IGNORE 0 ROWS
    -> (
    -> eid,
    -> ename,
    -> dob
    -> )
    -> SET dob = STR_TO_DATE(dob, '%m/%d/%Y');
Query OK, 2 rows affected (0.001 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

MariaDB [emptest]> select * from emp_varchar
    -> ;
+------+-----------+------------+
| eid  | ename     | dob        |
+------+-----------+------------+
|  100 | John      | 1955-02-14 |
|  200 | Jane      | 1980-08-22 |
+------+-----------+------------+
2 rows in set (0.000 sec)

MariaDB [emptest]>
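A hedged note (not from the post): a likely explanation is that, because dob is named directly in the column list, the raw MM/DD/YYYY string is assigned to the DATE column first (and truncated to an invalid date, as the 1265/1411 warnings suggest), so STR_TO_DATE in the SET clause only ever sees the mangled value; the varchar table keeps the raw string, which is why that load works. The usual pattern is to read the field into a user variable and convert it in the SET clause, sketched here:

LOAD DATA LOCAL INFILE 'empdata.csv'
INTO TABLE emp
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(eid, ename, @dob)                          -- raw field goes into a user variable
SET dob = STR_TO_DATE(@dob, '%m/%d/%Y');    -- converted value goes into the DATE column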
Z.DBA (21 rep)
Nov 21, 2024, 06:39 PM • Last activity: Nov 21, 2024, 08:30 PM
0 votes
1 answer
115 views
Why is MariaDB execution time doubled for the same query (LOAD DATA INFILE)?
I observed a strange behaviour regarding the execution time of a query that imports a CSV file into an empty table created beforehand. The query execution time to import the file increases as the import is repeated. I see this behaviour while importing the same medium-sized CSV file (0.6 GB, 6 columns, 8 million rows) 10 times using TRUNCATE then LOAD DATA INFILE, repeated within one MariaDB connection. On the first iteration, the CSV import takes 40 seconds, then about 50 seconds on the second iteration, and from the third to the 10th iteration the execution time reaches a plateau at 85 +/- 5 s.

I performed the test twice:
- on the mariadb shell (alias "mysql" on GNU/Linux)
- on python3 using mysql.connector

and I get the same result, i.e. an execution time that doubles (see figure).

[Figure: execution time vs. iteration. An empty table is created, then the following is repeated 10 times: a CSV file is imported and the table is emptied using the TRUNCATE statement. Only the import time (LOAD DATA INFILE ...) is considered here.]

**What could explain (or avoid) the execution time being doubled between the first and the third iteration?**

Steps to reproduce the behaviour:

1. Initiation: create (just once) the empty table with a generated primary key (idRow):

CREATE TABLE myTable (col1 VARCHAR(14), col2 VARCHAR(14), col3 VARCHAR(10), col4 VARCHAR(5), col5 VARCHAR(5), col6 VARCHAR(19), idRow INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (idRow));

2. Repeat steps A and B 10 times and collect the execution time of step B for each iteration:

A. Empty the table using TRUNCATE:

TRUNCATE TABLE myTable;

B. Then import the 0.6 GB CSV file of 8 million rows and 6 columns:

LOAD DATA INFILE "/myData/myFile.csv" INTO TABLE myTable FIELDS TERMINATED BY "," LINES TERMINATED BY "\n" IGNORE 1 ROWS (col1, col2, col3, col4, col5, col6) SET idRow=NULL;

Any help understanding this phenomenon would be welcome; don't hesitate to ask for more info.

*Why do I do this? The goal is to build a procedure to robustly measure the statistics of the execution time of any query, and how much it fluctuates determines the number of iterations one needs to get a relevant sample size. I was surprised that a query could fluctuate by 100% in execution time.*

Giorgio

MariaDB server:
- OS: Linux Mint 20
- mariadb version: 10.3.38-MariaDB-0ubuntu0.20.04.1-log
- innodb version: 10.3.38

[update] I made other interesting observations:
(i) On the same OS session (i.e. no reboot): closing the mariadb connection, or restarting the mariadb service (systemctl restart mariadb), does not prevent the 2nd iteration being slower (50 to 87 s) than the first.
(ii) After rebooting the OS, the B query gets fast again (40 sec).
GiorgioAbitbolo (1 rep)
Jun 8, 2023, 07:40 PM • Last activity: Jun 10, 2023, 06:57 AM
1 vote
1 answer
1314 views
LOAD DATA LOCAL INFILE with specifying first column as auto-increment value
LOAD DATA LOCAL INFILE 'index.csv'
INSERT INTO TABLE aws_pricing
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 6 LINES;
The thing is, my table has the first column as id int NOT NULL AUTO_INCREMENT ... PRIMARY KEY (id), so I get:

> ERROR 1265 (01000): Data truncated for column 'id' at row 1

index.csv has one column fewer than the table. How do I get LOAD DATA LOCAL INFILE to work by telling it to auto-increment id instead of reading it from each line of the index.csv file?

MySQL Ver 8.0.32
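One common approach (a sketch; the CSV column names below are invented for illustration) is to give LOAD DATA an explicit column list that omits id, so the auto-increment value is generated by the server for every row:

LOAD DATA LOCAL INFILE 'index.csv'
INTO TABLE aws_pricing
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 6 LINES
(sku, offer_code, rate_code, term_type, price_description);   -- hypothetical CSV columns; id is omitted and auto-generated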
anjanesh (279 rep)
Apr 18, 2023, 10:42 AM • Last activity: Apr 18, 2023, 04:46 PM
0 votes
0 answers
26 views
Is there a difference in performance when loading different datatypes?
Context: loading (COPY) a high volume of data into a warehouse (Redshift specifically, but I'm curious about all warehouses).

Question: Is there a performance difference between the different datatypes that I am loading? For example, would loading a string of a timestamp into a text column be faster or slower than loading it directly as a timestamp?
OctaveParango (101 rep)
Oct 3, 2022, 12:08 PM • Last activity: Oct 3, 2022, 01:13 PM
0 votes
1 answer
397 views
Db2 - load LOB data returns error SQL3025N A parameter specifying a filename or path is not valid
I would like to learn how to load LOB data from files. I have a table with two columns:

ID INTEGER
DATA CLOB

I can normally export data with:

export to c:\data\data.txt of del lobs to c:\data\ modified by coldel@ select * from mytab

Now I would like to import the same data back:

load client from c:\data\data.txt of del lobs from c:\data\ modified by coldel@ replace into mytab

but I get the error "SQL3025N A parameter specifying a filename or path is not valid." A strange error: I use the same file paths for load as for export. How can I solve the problem with load?
folow (523 rep)
Sep 15, 2022, 12:49 PM • Last activity: Sep 15, 2022, 02:35 PM
-1 vote
1 answer
148 views
Sometimes, not always, when loading an MSSQL table from Excel a new table suffixed with a $ is created by MSSQL. Why?
We are loading MSSQL tables via Microsoft Management Studio Import. Sometimes, not always, a new identical table is created in the load process with a $ at the end of the name. Why does this happen? Is this related to the IDs?
W Michael Armbruster (1 rep)
Feb 4, 2022, 05:50 PM • Last activity: Feb 7, 2022, 09:57 PM
1 vote
1 answer
87 views
Meaning of "Total work" for build indexes during load
I don't think I ever reflected on it before, but I'm currently loading a table from a cursor and LIST UTILITIES shows:

Phase Number   = 3
Description    = BUILD
Total Work     = 46 indexes
Completed Work = 0 indexes
Start Time     = Not Started

In the current schema, there are 4 indexes + an XML column (which seems to contribute 2 more according to syscat.indexes). At first I thought it might be a bug counting indexes for tables with the same name in different schemas, but that does not add up either, since there are 10 schemas with a table named like that. I'm just curious what "Total Work" means for this phase?
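For reference, a sketch of the cross-schema sanity check described above (the table name is a placeholder): count the catalogued indexes for every table with that name, per schema.

SELECT TABSCHEMA, COUNT(*) AS index_count
FROM SYSCAT.INDEXES
WHERE TABNAME = 'MYTABLE'        -- placeholder table name
GROUP BY TABSCHEMA
ORDER BY TABSCHEMA;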
Lennart - Slava Ukraini (23862 rep)
Sep 27, 2021, 07:57 PM • Last activity: Sep 28, 2021, 12:50 PM
0 votes
0 answers
26 views
How to track the server usage by the PID?
MySQL is taking too many resources on one of my client's servers. They are running a website developed in PHP. I have attached a screenshot below; a single process alone is taking too many resources. I want to track that usage by the PID. Is it possible or not? If it is possible, please help me track it.

[screenshot of the process list omitted]

Thank You!
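One hedged way to approach this on MySQL 5.7 or later (assuming the performance schema is enabled): the per-thread CPU figures from top -H or ps -L can be mapped to the session and statement running on that thread via performance_schema.threads. A sketch:

SELECT t.THREAD_OS_ID, t.PROCESSLIST_ID, p.USER, p.DB, p.TIME, p.STATE, p.INFO
FROM performance_schema.threads AS t
JOIN information_schema.PROCESSLIST AS p ON p.ID = t.PROCESSLIST_ID
WHERE t.THREAD_OS_ID = 12345;   -- placeholder: the OS thread ID (TID) shown by top -H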
Sathya (1 rep)
May 19, 2021, 03:16 PM
0 votes
1 answer
1863 views
How to identify row change timestamp columns in Db2 Table
I need to copy some Db2 tables programmatically (Java 11 using the standard Db2 JDBC driver). The tables in question lie on a Db2 for z/OS database (v12) and on a Db2 LUW database (v11.1). The target tables exist and have all the required columns. Some tables contain columns defined like this:

TIMESTAMP NOT NULL GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP

In order to do the copy, I wish to use the Db2 load utility, and row change timestamps need to be preserved. The syntax for this is identical for both Db2 types. When the table contains a row change timestamp, the LOAD statement requires a ROWCHANGETIMESTAMPOVERRIDE modifier. All this has been gleaned from the IBM documentation.

The difficulty is that I'm unsure how to tell whether the table in question has a row change timestamp. For Db2 LUW it is possible to query SYSCAT.COLUMNS, since this has a field called ROWCHANGETIMESTAMP which is either 'Y' or 'N'. That would mean checking each column in each table, but it's doable. However, it would be nice if there was an easier way to determine whether the table has a row change timestamp. For z/OS it isn't clear what should be checked. Is there a catalog query that would help? I can provide more details if necessary.
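As a sketch of the LUW-side check mentioned above (the schema name is a placeholder), a single catalog query can list every table that has such a column, instead of inspecting columns table by table:

SELECT TABSCHEMA, TABNAME
FROM SYSCAT.COLUMNS
WHERE ROWCHANGETIMESTAMP = 'Y'
  AND TABSCHEMA = 'MYSCHEMA'    -- placeholder schema
GROUP BY TABSCHEMA, TABNAME;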
Achim Schmitz (141 rep)
Mar 26, 2021, 02:31 PM • Last activity: Mar 27, 2021, 02:09 PM
2 votes
1 answer
957 views
Load text column to postgres
I use the following to load data into Postgres:

\copy tab FROM /tmp/file DELIMITER E'\t' QUOTE E'\b' NULL AS '\N' CSV

Usually I select the data from a source db:

echo "select * from tab" | mysql --batch --quick --skip-column-names .... > /tmp/file

It generates a file with tab-delimited fields. It works well for the most part, up until I try to import a multi-line text column. The error comes on:

ERROR: unquoted carriage return found in data

Any recommendations how to overcome this?
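One hedged workaround, assuming the carriage returns themselves are not meaningful data (the column names below are invented): strip them on the MySQL side so the file that \copy reads no longer contains the character the error complains about.

SELECT id, REPLACE(body, '\r', '') AS body   -- placeholder columns; repeat for each multi-line text column
FROM tab;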
Nir (529 rep)
Nov 22, 2020, 09:36 AM • Last activity: Nov 23, 2020, 01:56 PM
1 vote
1 answer
5590 views
How can I get LOAD DATA LOCAL INFILE to work for MySQL 8.0.21 on a Windows machine?
I've created a file 'C:\fff.txt' which contains data for my database table. Then I'm trying to load data from that file like this:

LOAD DATA LOCAL INFILE 'C:\\fff.txt' INTO TABLE fff LINES TERMINATED BY '\r\n';

but I get the following error all the time:

ERROR 2 : File 'C:\fff.txt' not found (OS errno 2 - No such file or directory)

I've added local_infile=1 to the [mysql] and [mysqld] sections of the my.ini file. When I type:

SHOW GLOBAL VARIABLES LIKE 'local_infile';

I can see that it's set to 'ON'. Is there a way to solve this problem?
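Two hedged things worth checking (assumptions, not confirmed fixes): in MySQL 8.0 the client must also allow LOCAL loads (for the stock shell, starting it as mysql --local-infile=1), and forward slashes in the path sidestep Windows backslash-escaping issues. A sketch:

SHOW GLOBAL VARIABLES LIKE 'local_infile';   -- server side must be ON (already confirmed above)

LOAD DATA LOCAL INFILE 'C:/fff.txt'          -- forward slashes avoid escaping problems on Windows
INTO TABLE fff
LINES TERMINATED BY '\r\n';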
JConstantine (141 rep)
Sep 25, 2020, 11:46 AM • Last activity: Sep 29, 2020, 02:32 PM
0 votes
2 answers
73 views
Near-identical MySQL deployments behaving very differently - high CPU usage problem
So I have five identical websites, running on five machines provisioned in the same way. The only things that differ between these installations are the language files and the languages of the text stored in the MySQL tables. Four of them have no problems whatsoever. One is struggling a LOT under the same or somewhat less load than the other four. I cannot understand why this is. Things I've done so far:

1. Checked slow queries. All queries use indexes and are in the realm of 0.0008 sec execution time, i.e. very fast.
2. I've noticed that the thing that causes the most trouble for this MySQL instance is UPDATEs and INSERTs, so much so that I've turned off the UPDATEs that were there, for this instance. Bear in mind that these UPDATEs don't cause a blip on the other servers.
3. Tried to eliminate external factors, i.e. noisy neighbours (moved host) etc.

Worth noticing is that the machines are deployed the same way, i.e. a vanilla Debian 10 installation with a LEMP stack, nothing out of the ordinary at all. Still, the problem persists. I can see the load of the machine struggling to stay under 1.00, while the other machines are in the 0.10 - 0.20 range all the time. Looking at CPU for the MySQL process on this machine (with 2 CPU cores, as the other machines have as well), it is quite often above 100%. The other machines are never - EVER - over 60% for the MySQL process.

So, any help is much appreciated. Please do let me know if you need me to run a command and show its output in order to help. Thanks.

**EDIT** *Spelling and clarifications*
Adergaard (363 rep)
Sep 12, 2020, 10:08 AM • Last activity: Sep 12, 2020, 02:19 PM
1 vote
1 answer
783 views
MySQL Error: Data Too Long For Column CHAR(5) - MariaDB 10.3.10
When inserting data from a CSV into MariaDB 10.3, the following error is generated while trying to insert the value '00010' into an NCHAR(5) column:

ERROR 1406 (22001): Data too long for column 'nicSiegeUniteLegale' at row 38663

Here is the MySQL command line:
LOAD DATA INFILE "/filePath/StockUniteLegale_utf8.csv" INTO TABLE erp_dev.sirene_UniteLegale
 FIELDS TERMINATED BY ','
 
IGNORE 1 ROWS
 
 (@vsiren,@vstatutDiffusionUniteLegale,@vunitePurgeeUniteLegale,@vdateCreationUniteLegale,@vsigleUniteLegale,@caractereEmployeurUniteLegalesexeUniteLegale,@vprenom1UniteLegale,@vprenom2UniteLegale,@vprenom3UniteLegale,@vprenom4UniteLegale,@vprenomUsuelUniteLegale,@vpseudonymeUniteLegale,@videntifiantAssociationUniteLegale,@vtrancheEffectifsUniteLegale,@vanneeEffectifsUniteLegale,@vdateDernierTraitementUniteLegale,@vnombrePeriodesUniteLegale,@vcategorieEntreprise,@vanneeCategorieEntreprise,@vdateDebut,@vetatAdministratifUniteLegale,@vnomUniteLegale,@vnomUsageUniteLegale,@vdenominationUniteLegale,@vdenominationUsuelle1UniteLegale,@vdenominationUsuelle2UniteLegale,@vdenominationUsuelle3UniteLegale,@vcategorieJuridiqueUniteLegale,@vactivitePrincipaleUniteLegale,@vnomenclatureActivitePrincipaleUniteLegale,@vnicSiegeUniteLegale,@veconomieSocialeSolidaireUniteLegale,@vcaractereEmployeurUniteLegale)

SET

 siren = nullif (@vsiren,''),
 statutDiffusionUniteLegale = nullif (@vstatutDiffusionUniteLegale,''),
 unitePurgeeUniteLegale = if (@vunitePurgeeUniteLegale ='true', 1, 0),
 dateCreationUniteLegale = nullif (@vdateCreationUniteLegale,''),
 -- Repeat above line for each column --
 nicSiegeUniteLegale = nullif (@vnicSiegeUniteLegale,''),
 economieSocialeSolidaireUniteLegale = nullif (@veconomieSocialeSolidaireUniteLegale,''),
 caractereEmployeurUniteLegale = nullif (@vcaractereEmployeurUniteLegale,'');
Here is the table DDL (definition):
CREATE TABLE sirene_UniteLegale (
  siren                                     NChar(9) COLLATE utf8_general_ci,
  statutDiffusionUniteLegale                Enum('O') CHARACTER SET utf8 COLLATE utf8_general_ci,
  unitePurgeeUniteLegale                    TinyInt(1),
  dateCreationUniteLegale                   Date,
  sigleUniteLegale                          NVarChar(20) COLLATE utf8_general_ci,
  sexeUniteLegale                           Enum('F', 'M') CHARACTER SET utf8 COLLATE utf8_general_ci,
  prenom1UniteLegale                        NVarChar(20) COLLATE utf8_general_ci,
  prenom2UniteLegale                        NVarChar(20) COLLATE utf8_general_ci,
  prenom3UniteLegale                        NVarChar(20) COLLATE utf8_general_ci,
  prenom4UniteLegale                        NVarChar(20) COLLATE utf8_general_ci,
  prenomUsuelUniteLegale                    NVarChar(20) COLLATE utf8_general_ci,
  pseudonymeUniteLegale                     NVarChar(100) COLLATE utf8_general_ci,
  identifiantAssociationUniteLegale         NChar(10) COLLATE utf8_general_ci,
  trancheEffectifsUniteLegale               Enum('NN', '01', '02', '03', '11', '12', '21', '22', '31', '32', '41', '42', '51', '52', '53') CHARACTER SET utf8 COLLATE utf8_general_ci,
  anneeEffectifsUniteLegale                 SmallInt(6),
  dateDernierTraitementUniteLegale          DateTime,
  nombrePeriodesUniteLegale                 TinyInt(4),
  categorieEntreprise                       Enum('PME', 'ETI', 'GE') CHARACTER SET utf8 COLLATE utf8_general_ci,
  anneeCategorieEntreprise                  SmallInt(6),
  dateDebut                                 Date,
  etatAdministratifUniteLegale              Enum('A', 'C') CHARACTER SET utf8 COLLATE utf8_general_ci,
  nomUniteLegale                            NVarChar(100) COLLATE utf8_general_ci,
  nomUsageUniteLegale                       NVarChar(100) COLLATE utf8_general_ci,
  denominationUniteLegale                   NVarChar(120) COLLATE utf8_general_ci,
  denominationUsuelle1UniteLegale           NVarChar(70) COLLATE utf8_general_ci,
  denominationUsuelle2UniteLegale           NVarChar(70) COLLATE utf8_general_ci,
  denominationUsuelle3UniteLegale           NVarChar(70) COLLATE utf8_general_ci,
  categorieJuridiqueUniteLegale             NChar(4) COLLATE utf8_general_ci,
  activitePrincipaleUniteLegale             NChar(6) COLLATE utf8_general_ci,
  nomenclatureActivitePrincipaleUniteLegale NChar(8) COLLATE utf8_general_ci,
  nicSiegeUniteLegale                       NChar(5) COLLATE utf8_general_ci,
  economieSocialeSolidaireUniteLegale       Enum('O', 'N') CHARACTER SET utf8 COLLATE utf8_general_ci,
  caractereEmployeurUniteLegale             Enum('O', 'N') CHARACTER SET utf8 COLLATE utf8_general_ci
) ENGINE=InnoDB AUTO_INCREMENT=1 ROW_FORMAT=DYNAMIC DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE sirene_UniteLegale COMMENT = '';
Here is the line that generates the error:
038684312,O,,1995-12-25,,,,,,,,,,01,2016,2018-09-29T12:00:37,3,PME,2016,2017-02-28,A,,,SYND.COPR. 6 RUE HEROLD 75001   PARIS RE,,,,9110,81.10Z,NAFRev2,00010,,N
**Debug:**
- Inserting the line alone works (bash):
$ head -n 1 StockUniteLegale_utf8.csv > test.csv
$ head -n 38663 StockUniteLegale_utf8.csv | tail -n 1 >> test.csv
- The exact same field in previous lines was inserted without error (bash):
$ head -n 38663 StockUniteLegale_utf8.csv | cut -d ',' -f 31 | grep --line-number '00010' | tail -n 5
38635:00010   # Format: lineNumber:value
38649:00010
38652:00010
38660:00010
38663:00010   # Error here
Example of those entries (CSV format):
$ grep -n ',00010,' StockUniteLegale_utf8.csv | egrep '^38663:' -C5
38628:038683751,O,,1995-12-25,,,,,,,,,,01,2016,2018-09-29T12:00:37,2,PME,2016,2008-01-01,A,,,SYND.COPR. DU 30 RUE DE         RICHELIE,,,,9110,81.10Z,NAFRev2,00010,,O
38635:038683850,O,,1995-12-25,,,,,,,,,,01,2016,2018-09-29T12:00:37,2,PME,2016,2008-01-01,A,,,SYND.COPR. 9 RUE THERESE 75001  PARIS RE,,,,9110,81.10Z,NAFRev2,00010,,O
38649:038684072,O,,1995-12-25,,,,,,,,,,NN,,2011-11-29T00:14:24,2,,,2008-01-01,A,,,SYND.COPR. 56 RUE DE RICHELIEU  75001 PA,,,,9110,81.10Z,NAFRev2,00010,,N
38652:038684114,O,,1995-12-25,,,,,,,,,,NN,,2016-04-26T01:04:52,2,PME,2016,2008-01-01,A,,,SYND.COPR. 270 RUE ST HONORE    75001 PA,,,,9110,81.10Z,NAFRev2,00010,,O
38660:038684270,O,,1997-12-25,,,,,,,,,,,,2011-11-29T00:14:24,2,,,2008-01-01,A,,,S C I DE LA PORTE ST HONORE,,,,9110,81.10Z,NAFRev2,00010,,N
38663:038684312,O,,1995-12-25,,,,,,,,,,01,2016,2018-09-29T12:00:37,3,PME,2016,2017-02-28,A,,,SYND.COPR. 6 RUE HEROLD 75001   PARIS RE,,,,9110,81.10Z,NAFRev2,00010,,N
38672:038684411,O,,1995-12-25,,,,,,,,,,01,2016,2018-09-29T12:00:37,2,PME,2016,2008-01-01,A,,,SYND.COPR. 6 R DE LA COSSONNERIE 75001 P,,,,9110,81.10Z,NAFRev2,00010,,O
38702:038685038,O,,1995-12-25,,,,,,,,,,01,2016,2019-04-16T04:03:37,3,PME,2016,2017-05-16,A,,,SYND.COPR. DU 55 RUE REAUMUR    75002 PA,,,,9110,81.10Z,NAFRev2,00010,,N
38705:038685095,O,,1995-12-25,,,,,,,,,,NN,,2011-11-29T00:14:24,2,,,2008-01-01,A,,,SYND.COPR. DU 26 RUE DE GRAMONT 75002 PA,,,,9110,81.10Z,NAFRev2,00010,,N
38708:038685137,O,,1995-12-25,,,,,,,,,,01,2016,2018-09-29T12:00:37,2,PME,2016,2008-01-01,A,,,SYND.COPR. DU 8 RUE SAINTE FOY  75002 PA,,,,9110,81.10Z,NAFRev2,00010,,O
38721:038685335,O,,1995-12-25,,,,,,,,,,NN,,2011-11-29T00:14:25,2,PME,2016,2008-01-01,A,,,SYND.COPR. 14 RUE SAINT SAUVEUR 75002 PA,,,,9110,81.10Z,NAFRev2,00010,,O
- As the file is really long (~20M lines), I tried segmenting it into smaller files. The error is still at the same entry.

**Additional debug:** The error seems not to be engine- or syntax-related. If I run the command above in the MySQL CLI, I get the error. If I copy-paste the command into a file and print it to mysql's stdin, I get no error and all entries are now in the DB:
$ cat SqlLoadCommand.sql | mysql -uX -pX erp_dev
MySQL and System version :
mysql  Ver 15.1 Distrib 10.3.10-MariaDB, for Linux (x86_64) using readline 5.1
CentOS Linux release 7.5.1804
Linux kernel 3.10.0-862.14.4.el7.x86_64
*Mandatory license information: all data displayed is public data, courtesy of the French Government and administration.*
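Not a confirmed diagnosis, but one quick check when a visually five-character value triggers this error is whether the field read from that line carries extra invisible bytes (a stray control character or encoding artefact). A sketch, where the literal stands in for whatever is actually read from line 38663:

SET @v = '00010\r';   -- hypothetical value: '00010' plus a stray carriage return
SELECT CHAR_LENGTH(@v) AS chars, LENGTH(@v) AS bytes, HEX(@v) AS raw_bytes;
-- anything longer than 5 characters would overflow NCHAR(5) and raise error 1406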
Ender Lisan Al Gaïb (11 rep)
Jun 6, 2019, 09:35 AM • Last activity: Sep 2, 2020, 08:03 PM
0 votes
2 answers
381 views
MySQL LOAD DATA with subquery performance issue
I'm migrating data into a MySQL (5.7.26) database (32 GB RAM), running as a managed service on AWS. While importing the data, I need to map one of the columns of the CSV being imported to another value using a MEMORY table lookup, so my LOAD DATA resembles the following:

LOAD DATA LOCAL INFILE 'file.csv'
INTO TABLE table_1 (col_1, @var1)
SET col_2 = (select mapped_value from table_2 where id = @var1)

table_2 is a 2-column (id, mapped_value) MEMORY table with 3.4MM rows. When I import the CSV *without* the subquery, I get several million inserts per minute. However, when I run the same import with the subquery, the LOAD DATA performance degrades to near zero (~100 inserts per minute). Is this to be expected with a subquery, or is there something I'm doing wrong in the example above?
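One hedged thing to check (not stated in the post): if table_2 has no index on id, the correlated subquery scans all 3.4 million in-memory rows once per imported line, which by itself could explain the collapse. A sketch of adding a hash index so each lookup is a constant-time probe:

ALTER TABLE table_2 ADD INDEX idx_id (id) USING HASH;   -- hypothetical index name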
Chad (3 rep)
Jan 28, 2020, 03:26 AM • Last activity: Jan 28, 2020, 12:47 PM
0 votes
0 answers
83 views
Oracle load estimates
Our 11g RAC cluster has two nodes, each with 8 CPU cores and a memory target of 16GB. System RAM is 64GB (don't ask me why the memory target is just 16GB, I wasn't there when that choice was made). The disk is SSD. The CPU load as shown in Enterprise Manager never exceeds 25% and is usually around 20%, and the RAM occupancy stays around 20%. The IO operations per second (which worry me the most) vary between 2000 and 8000, with an average around 5500. The load is mostly OLTP style. System load is around 1, maybe 1.5.

My question is: can we consider this cluster to be "heavily" loaded? Again, my concern is with IO; obviously CPU and RAM are not stressed. I'm asking because the application occasionally shows signs of sluggishness. The developers refuse to put in logging statements that would show how much time a query takes to run ("too many queries"), so I'm left looking for less direct ways to figure this out.
GID (53 rep)
Feb 1, 2019, 11:04 PM
0 votes
1 answer
81 views
How to restrict MySQL queries so they don't overload the server and take it down?
There are times when some heavy queries increase server load, and after some time both CPU and memory reach 100%. This is the my.cnf configuration file:

[mysqld]
expire_logs_days= 2
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
datadir = /var/storage/mysql

# LOGGING #
log-error = /var/storage/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/storage/mysql/mysql-slow.log

# MyISAM #
key-buffer-size = 32M
myisam-recover-options = FORCE,BACKUP

# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000

# BINARY LOGGING #
log-bin = /var/storage/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 1

# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 1024
table-open-cache = 2048
innodb_file_per_table
innodb_flush_method=O_DIRECT
innodb-log-files-in-group = 2
innodb_log_file_size=500M
innodb_buffer_pool_size=6G
innodb_flush_log_at_trx_commit=2
# recently added for compression, omit it if mysql has problems with it
innodb_file_format=Barracuda

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
collation-server = utf8mb4_general_ci
init-connect='SET NAMES utf8mb4'
character-set-server = utf8mb4
skip-character-set-client-handshake

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

The server has 6 cores with 16 GB of RAM. Any suggestions to restrict MySQL and keep the server safe?

----------

**EDIT-1**

This is one of the slow queries; it took MySQL 4.3 seconds to return the data and it examined about 3 million rows:

# Time: 190107  6:22:30
# User@Host: root[root] @ localhost [127.0.0.1]
# Thread_id: 204732  Schema: my_db  QC_hit: No
# Query_time: 4.306051  Lock_time: 0.000058  Rows_sent: 0  Rows_examined: 3253235
# Rows_affected: 0
SET timestamp=1546842150;
SELECT ((old_credit + (1*amount))/10) credit
FROM credits
WHERE user_id=' ' or user_id=(SELECT user_id FROM accounts WHERE email=' ')
ORDER BY id DESC LIMIT 1;

The query is created by a front-end BI application (it can be changed to some extent).

The indexes of the table are:

MariaDB [my_db]> show index from credits;
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| credits | 0          | PRIMARY   | 1            | id          | A         | 2014915     | NULL     | NULL   |      | BTREE      |         |               |
| credits | 1          | credit_id | 1            | credit_id   | A         | 2014915     | NULL     | NULL   | YES  | BTREE      |         |               |
| credits | 1          | user_id   | 1            | user_id     | A         | 134327      | NULL     | NULL   | YES  | BTREE      |         |               |
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

----------

**EDIT-2:**

Database changed

MariaDB [my_db]> show index from accounts;
+----------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name            | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| accounts | 0          | PRIMARY             | 1            | id          | A         | 486034      | NULL     | NULL   |      | BTREE      |         |               |
| accounts | 1          | user_id             | 1            | user_id     | A         | 486034      | NULL     | NULL   |      | BTREE      |         |               |
| accounts | 1          | email_idx           | 1            | email       | A         | 486034      | 190      | NULL   |      | BTREE      |         |               |
| accounts | 1          | user_id_email_phone | 1            | user_id     | A         | 486034      | NULL     | NULL   |      | BTREE      |         |               |
| accounts | 1          | user_id_email_phone | 2            | email       | A         | 486034      | 40       | NULL   |      | BTREE      |         |               |
| accounts | 1          | user_id_email_phone | 3            | phone       | A         | 486034      | 15       | NULL   | YES  | BTREE      |         |               |
+----------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.00 sec)
Alireza (3676 rep)
Jan 6, 2019, 11:48 AM • Last activity: Jan 19, 2019, 08:55 AM
6 votes
1 answer
14934 views
How can I load CSV data into MongoDB and choose the types I'd like to have created for each column?
I'm looking for a method to load CSV data into MongoDB and specify the types that I'd like to have created for each column. I've been using mongoimport, but I find it quite painful because there doesn't appear to be any way to specify the datatype to be used for a particular column in the CSV file.

I load the data ...

mongoimport --drop -d myDatabase -c myCollection --type csv --fields name --file myFile.csv

Then I do analysis to see what types mongo actually created.

> db.artist.itunes.feed.count()
36545
> db.artist.itunes.feed.count( { 'name' : { $type : 2 } } )
36511
> db.artist.itunes.feed.count( { 'name' : { $type : 1 } } )
1
> db.artist.itunes.feed.count( { 'name' : { $type : 16 } } )
33

Then I do surgery to correct errors like so ...

db.artist.itunes.feed.find( { 'name' : { $type : 1 } } ).forEach( function (x) {
  x.name = new String(x.name); // convert field to string
  db.artist.itunes.feed.save(x);
});

This is tedious. It would be great if there were a method that allowed me to specify at import time what type to create for each column in the CSV file, like so:

mongoimport --drop -d myDatabase -c myCollection --type csv --fields field1,field2 --types 2,2 --file myFile.csv

http://docs.mongodb.org/manual/reference/operator/query/type/
Alex Ryan (161 rep)
Sep 20, 2014, 02:35 AM • Last activity: Nov 27, 2018, 09:06 PM
0 votes
1 answer
8236 views
REINDEX or REBUILD the indexes of a table after truncating it and then repopulating it?
Is it necessary to REINDEX or REBUILD the indexes of a table after truncating it and then repopulating it? I have a table with approximately 3.5 million rows in Oracle. Every week new data is reloaded: the table is truncated and then the data is loaded. There are 2 or 3 indexes on the table. Should I delete and re-create the indexes or simply rebuild the ones I already have? Also, some indexed columns are encrypted with TDE. Thank you
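If rebuilding turns out to be the preferred route, a minimal sketch of one common pattern (the index name is a placeholder, and unique indexes may need different handling) is to mark existing indexes unusable before the bulk load and rebuild them afterwards, rather than dropping and re-creating them:

-- placeholder index name; repeat for each index on the table
ALTER INDEX my_table_idx1 UNUSABLE;   -- before the weekly truncate-and-load
-- ... TRUNCATE and reload the table ...
ALTER INDEX my_table_idx1 REBUILD;    -- after the load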
Arnaldo Raxach (145 rep)
Nov 20, 2018, 04:27 AM • Last activity: Nov 20, 2018, 10:02 AM
3 votes
0 answers
1398 views
Galera cluster consumes a high CPU percentage and causes high CPU load
I have implemented a cluster with three nodes: two storage nodes and one garbd node (which also runs haproxy as a load balancer). Some information about the servers:

SSD 480G RAID1
Memory 32G
Swap 16G
2x CPU, number of cores: 24
MariaDB Galera Cluster 10.0.14

The mysqld process consumes a great deal of CPU, 229.3% - 300%. The highest CPU load is 20-30. When the CPU load grows, there are log entries like this:

*** Victim TRANSACTION:
TRANSACTION 17362851, ACTIVE 5 sec
mysql tables in use 1, locked 1
809 lock struct(s), heap size 95784, 149905 row lock(s), undo log entries 19493
MySQL thread id 182155, OS thread handle 0x7fc587451700, query id 1566816, query end
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 180 page no 764 n bits 288 index PRIMARY of table xxx.yyy trx id 17362851 lock_mode X
141128 10:36:45 [Note] WSREP: cluster conflict due to high priority abort for threads:
141128 10:36:45 [Note] WSREP: Winning thread: THD: 22, mode: applier, state: executing, conflict: no conflict, seqno: 1447458 SQL: (null)
141128 10:36:45 [Note] WSREP: Victim thread: THD: 182155, mode: local, state: committing, conflict: no conflict, seqno: -1

I don't know why Galera gets certification failures, because I am using master-slave: there is only one node receiving writesets from clients. Some solutions I am applying:

- set wsrep_auto_increment_control=OFF on the two nodes
- set query_cache_size=1G
- decrease wsrep_slave_threads from 48 to 24

Now my CPU load is in the range 2.00-3.00. I am waiting and observing. What can I do for better performance? I am looking at using the thread pool to decrease the CPU percentage used by mysqld.
Luke Nguyen (749 rep)
Nov 28, 2014, 06:20 AM • Last activity: Sep 3, 2018, 07:18 AM