Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
1
answers
237
views
Help preparing for MySQL Master Slave Replication
I am not in the best situation. I inherited an Ubuntu 14.04 8 GB RAM, 8 CPU MySQL 5.5 database server with almost 400 GB of business-critical data (stored on external SSD) contained within several thousand different databases. My database administration skills and experience are nascent. I want to create a backup of this data to set up MySQL Replication, but I need to create the backup with minimal impact and downtime.
These databases are individually backed up with mysqldump about every four hours. This unfortunately means that I have no single, point-in-time, logical or raw backup of the entire database server and to top it off, binary logging is not enabled on that server. But I do have the capability to individually restore these backups.
In total, there are about 250,000 tables in the database server. Of those tables, about 90,000 use the MyISAM engine and about 160,000 use the InnoDB engine.
I know there will be some downtime, but I would really like to avoid downtime of an unknown duration during which I am obliged to fully back up the data and deploy replication at the same time.
In testing, I've given thought to or tried various approaches:
- using Percona Xtrabackup
- using mysqldump with a single transaction (for innodb) and no locks for the myisam tables
- rsync'ing the mysql data directory, then gracefully shutting down the MySQL server, and rsync'ing the flushed out changes
- converting the myisam tables to innodb, then doing a mysqldump or using xtrabackup
- using my existing backups to start replication, then letting the slave catch up
- restoring my existing backups, then syncing the changes with pt-table-checksum and pt-table-sync
- and the list can go on...
Without me providing excessive detail about my testing methods and results, I would like to know how you would approach this situation.
EDIT: In essence, my question is: With the goal of minimal downtime and given my scenario, how would you create a backup of the database server in anticipation of setting up MySQL Replication?
I would appreciate any advice, opinions, services, or resources you may have. Thank you.
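For illustration, a rough sketch of the XtraBackup-based route (credentials, paths, and the timestamp directory are placeholders, and it assumes binary logging has first been enabled on the source, which requires a restart):
```
# Take and prepare a full backup on the source (placeholder credentials/paths):
innobackupex --user=backup_user --password='***' /backups/
innobackupex --apply-log /backups/<timestamp-dir>/

# Copy the prepared directory to the replica host, fix ownership, start mysqld there,
# then read the binlog coordinates recorded at backup time for CHANGE MASTER TO:
cat /backups/<timestamp-dir>/xtrabackup_binlog_info
```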
Hman
(33 rep)
Sep 18, 2017, 07:56 PM
• Last activity: May 28, 2025, 11:06 PM
0
votes
1
answers
831
views
Stuck pt-archiver query during purge
I am purging a 1.2 TB table using pt-archiver. In SHOW PROCESSLIST I can see the query stuck for 650 seconds in the Sending data state. The query is quick when I execute it independently. Any help will be appreciated.
[shell]# pt-archiver --source h=localhost,D=dsm,t=subscriber_event -u XX -pXXX --where="created <= DATE_SUB(now(), interval 1 year)" ...

explain SELECT /*!40001 SQL_NO_CACHE */ id FROM dsm.subscriber_event FORCE INDEX(PRIMARY) WHERE (created <= DATE_SUB(now(), interval 1 year)) AND (id < '3873802696') ORDER BY id LIMIT 10000;

SELECT /*!40001 SQL_NO_CACHE */ id FROM dsm.subscriber_event FORCE INDEX(PRIMARY) WHERE (created <= DATE_SUB(now(), interval 1 year)) AND (id < '3873802696') ORDER BY id LIMIT 10000;
| 3205169561 |
| 3205169562 |
| 3205169563 |
| 3205169564 |
+------------+
10000 rows in set (0.10 sec)
Please help me find out the exact cause.
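For comparison, a dry run reproduces the statements pt-archiver generates without touching any data; a rough sketch (credentials are the masked placeholders from above, and --purge is assumed here in place of whatever destination options the real job uses):
```
pt-archiver --source h=localhost,D=dsm,t=subscriber_event -u XX -pXXX \
  --where "created <= DATE_SUB(NOW(), INTERVAL 1 YEAR)" \
  --purge --limit 10000 --dry-run
```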
dragon
(95 rep)
Jan 16, 2021, 01:35 AM
• Last activity: Apr 26, 2025, 05:06 AM
2
votes
1
answers
506
views
Is it safe to use pt-online-schema-change in a multimaster environment?
I have 2 MySQL servers with row-based replication between them. Both of them are masters and slaves for each other (active-active master-master setup).
If I understand it correctly, pt-osc creates triggers to catch any changes made while it is running. But from what I know, triggers are not fired on the replica in a row-based replication environment. So I guess pt-osc is not able to catch changes made on the second master during the change, is it?
EDIT: While doing some tests, I saw that pt-osc created its triggers on both masters, which would cover changes from both sides. Still, I'm quite unsure whether I can safely do online changes in this environment.
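For reference, a quick way to see what ended up on the second master while pt-osc runs on the first one (db/table names are placeholders):
```
mysql -h master2 -u root -p -e "SHOW TRIGGERS FROM mydb LIKE 'mytable'\G"
```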
Mrskman
(121 rep)
Jan 15, 2019, 02:10 PM
• Last activity: Apr 21, 2025, 05:04 PM
1
votes
2
answers
579
views
Copying MySQL table to another table with no missing new changes
I would like to recreate/copy a MySQL table (an actual new .ibd file), and most suggestions recommend the following:
CREATE TABLE mytbcopy LIKE mytb;
INSERT INTO mytbcopy SELECT * FROM mytb;
The suggestion works well, but from my understanding (correct me if I'm wrong), it does not carry over inserts/updates made to the original table during the period of copying/recreating.
Example: during the copy, if record #45 has already been inserted into the new table and record #45 is then updated on the original table, that update won't be replicated over.
Is there another way to ensure that after the full copy, the data in the new table is in the most up-to-date state? I'm not sure using TRIGGERS is the solution for this.
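One hedged option (not necessarily the answer): a no-op alter with pt-online-schema-change rebuilds the table into a new .ibd file while its triggers capture concurrent inserts, updates and deletes; database/table names below are placeholders:
```
pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=mytb --execute
```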
SleepingHide123
(21 rep)
Dec 13, 2019, 02:37 AM
• Last activity: Apr 21, 2025, 01:03 PM
0
votes
0
answers
35
views
When does xtrabackup take lock exactly and why there's slave lag after backup starts?
MySQL Version -- MySQL Community Server 5.7.44
Percona xtrabackup Version 2.4.29 (x86_64)
I want to understand when exactly xtrabackup takes the FLUSH TABLES WITH READ LOCK (FTWRL).
Following is the simplified command I use for full backup.
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=bkp_user --tmpdir=/pfg_prod/tmp --parallel=4 --use-memory=1024M 2> bkp.log --stream=xbstream | /usr/bin/lz4 stdin /bkp_dir/date.lz4
I use this same command on all my slave servers for daily backups.
When I check the xtrabackup logs, I notice that it does not take the FTWRL immediately, yet the slave lag begins to grow as soon as the backup starts.
Log for the 9th
250309 11:13:18 Executing FLUSH TABLES WITH READ LOCK...
250309 11:13:32 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
250309 11:13:32 Executing UNLOCK TABLES
Log for the 10th
250310 11:13:55 Executing FLUSH TABLES WITH READ LOCK...
250310 11:14:05 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
250310 11:14:05 Executing UNLOCK TABLES
Log for the 15th
250315 11:21:17 Executing FLUSH TABLES WITH READ LOCK...
250315 11:21:18 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
250315 11:21:18 Executing UNLOCK TABLES
***The backup starts at 08:00 AM every morning.***
Thus, if it starts the backup at 08:00 AM and executes the FTWRL only at around 11:00 AM, I want to understand why there's slave lag immediately after the backup starts and why the lag continues till the backup is over.
***When exactly does xtrabackup take the FLUSH TABLES WITH READ LOCK, if not at the start of the backup? And why is there slave lag as soon as the backup starts? If the FTWRL is taken only after approximately 3 hours, I think there shouldn't be lag before that, though I am not sure. The slave lag is only during the backup window.***
I had been assuming so far that the FTWRL is taken when the backup starts, hence the doubt.
root@localhost 09:45:21 [(none)] show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: master_ip
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000982
Read_Master_Log_Pos: 30958
Relay_Log_File: relay.000430
Relay_Log_Pos: 1058083259
Relay_Master_Log_File: binlog.000981
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 10580
Relay_Log_Space: 1383
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 5872
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 103454371
Master_UUID: UUID
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: System lock
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: GTID-SET
Executed_Gtid_Set: GTID-SETS
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.25 sec)
+-------+-------------------+---------------------+-------------------+-------------+--------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------------+---------------------+-------------------+-------------+--------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| 13 | system user | | NULL | Connect | 255661 | Waiting for master to send event | NULL |
| 14 | system user | | NULL | Connect | 0 | Waiting for dependent transaction to commit | NULL |
| 15 | system user | | NULL | Connect | 5931 | System lock | NULL |
| 16 | system user | | NULL | Connect | 5931 | Waiting for an event from Coordinator | NULL |
| 12664 | bkp_user | localhost | NULL | Sleep | 6377 | | NULL |
| 12989 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 12990 | root | 127.0.0.1:43240 | NULL | Query | 0 | Opening tables | SELECT table_schema, table_name, column_name, auto_increment, pow(2, case data_type when ' |
+-------+-------------------+---------------------+-------------------+-------------+--------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
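A rough way to correlate the lag with the backup phases is to sample the SQL-thread state during the backup window (login path as in the backup command; a simple illustrative loop, not a polished script):
```
while true; do
  date
  mysql --login-path=bkp_user -e "SHOW SLAVE STATUS\G" \
    | grep -E 'Seconds_Behind_Master|Slave_SQL_Running_State'
  sleep 60
done
```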
Avinash Pawar
(216 rep)
Mar 16, 2025, 09:37 AM
• Last activity: Mar 17, 2025, 02:24 AM
0
votes
0
answers
50
views
xtrabackup fails with error xb_stream_write_data() failed
I am facing a weird issue while using xtrabackup with xbstream. The weird part is that it was working fine just a few days ago, and apart from a reboot there were no changes on the machine. I have at least 100 machines with this exact same script, same OS, same packages and same xtrabackup binary. Everything is the same on all the machines. It works on all of them except this one.
OS Details
uname -a
5.10.227-219.884.amzn2.x86_64 #1 SMP Tue Oct 22 16:38:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
MySQL Version and xtrabackup version
/usr/bin/xtrabackup --version
/usr/bin/xtrabackup version 2.4.29 based on MySQL server 5.7.44 Linux (x86_64) (revision id: 2e6c0951)
mysql-community-server-5.7.44-1.el7.x86_64
xtrabackup command which causes error
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB --stream=xbstream | /usr/bin/lz4 - /S3-Mountpoint/backup/app_`date +%F`.lz4
The error I receive is
xtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
xb_stream_write_data() failed.
xtrabackup: Error: write to logfile failed
xtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
xtrabackup: Error: xtrabackup_copy_logfile() failed.
So while checking what could be wrong, I tried a local machine path, and it fails with the same error:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB --stream=xbstream | /usr/bin/lz4 - /local_machine_path/app_`date +%F`.lz4
I tried the same with tar and it does work:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --use-memory=1024MB --stream=tar | /usr/bin/gzip > /S3-Mountpoint/backup/app_`date +%F`.tar.gz
Again I tried it without xbstream, writing to a local directory, and it works:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB --target-dir=/local_machine_path/app_`date +%F`
So my conclusion is that something is wrong with xbstream, though I am not sure. Could this be a bug? Or am I missing something? Please help out.
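A rough way to narrow it down further is to split the pipe, so xbstream and lz4 are tested separately (paths are placeholders):
```
# Stream to a plain local file first, with no downstream consumer in the pipe:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info \
  --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB \
  --stream=xbstream > /local_machine_path/test.xbstream
# If that succeeds, test the compressor on its own:
/usr/bin/lz4 /local_machine_path/test.xbstream /local_machine_path/test.xbstream.lz4
```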
Avinash Pawar
(216 rep)
Jan 20, 2025, 11:28 AM
0
votes
1
answers
62
views
Percona xtrabackup and percona toolkit for aarch64 Graviton Amazon Linux 2 machines
In my organisation, it was decided to move all the MySQL infrastructure (Amazon EC2 machines) to Graviton (aarch64) servers from AMD (x86_64).
All are MySQL Community Servers.
I need percona-xtrabackup and percona-toolkit installed on the new aarch64 (Graviton) machines, but unfortunately Percona has not yet released aarch64 packages. We depend heavily on percona-toolkit and xtrabackup: lots of scripts run using the percona-toolkit tools, and backups run using xtrabackup.
I know these packages can be built from source also, but I have never done it before.
I was able to build percona xtrabackup using this link https://docs.percona.com/percona-xtrabackup/8.0/compile-xtrabackup.html and percona-toolkit (version 3.3.1) from this link https://github.com/percona/percona-toolkit/blob/3.x/INSTALL
Because I have never done this before and because Percona has not yet officially provided these packages, I am really not sure if building these packages from source will work on Graviton (aarch64) or not.
The Graviton machines run Amazon Linux 2. I am not even sure which Red Hat version this corresponds to.
cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/ "
SUPPORT_END="2025-06-30"
cat /proc/version
Linux version 5.10.219-208.866.amzn2.aarch64 (mockbuild@ip-10-0-61-241) (gcc10-gcc (GCC) 10.5.0 20230707 (Red Hat 10.5.0-1), GNU ld version 2.35.2-9.amzn2.0.1) #1 SMP Tue Jun 18 14:00:02 UTC 2024
rpm -E %{rhel}
7
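For what it's worth, the sanity checks planned after the from-source builds are roughly the following (the binary locations depend on the install prefix used, so the paths are assumptions):
```
uname -m                                          # expect aarch64
/usr/local/xtrabackup/bin/xtrabackup --version    # path depends on the cmake install prefix
/usr/local/bin/pt-online-schema-change --version
```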
Avinash Pawar
(216 rep)
Jul 20, 2024, 07:04 PM
• Last activity: Dec 21, 2024, 12:52 PM
0
votes
0
answers
29
views
Percona pt-query-digest vs. Performance Insights on AWS Aurora 3.x
Several years ago when I was running MySQL on-prem and then EC2, I found Percona pt-query-digest to be very useful to find poorly performing SQL in prod. It's been a while since I've used it. AWS Performance Insights has similar functionality. Other than the fact that it's free, are there any advantages to pt-query-digest over PI?
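For context, typical usage looked roughly like this (slow log path and report name are placeholders):
```
pt-query-digest /var/lib/mysql/slow.log > slow_report.txt
```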
Swechsler
(153 rep)
Oct 23, 2024, 07:57 PM
0
votes
0
answers
35
views
Percona pg_tde extension base backup failing
I am just asking here so that someone from Percona can check this if possible.
I have followed the documentation below to configure pg_tde on RHEL 8 with PostgreSQL 16:
https://percona.github.io/pg_tde/main/
After enabling pg_tde and configuring the key provider and master key, pg_basebackup fails with the error below:
[postgres@hostname postgres]$ pg_basebackup -D bkp_test -R -X stream -v
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/14000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_1391250"
WARNING: aborting backup due to backend exiting before pg_backup_stop was called
pg_basebackup: error: COPY stream ended before last file was finished
pg_basebackup: removing contents of data directory "bkp_test"
In the logs I can see the error below:
2024-10-18 07:55:21.132 EDT LOG: checkpoint starting: force wait
2024-10-18 07:55:21.136 EDT LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.004 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16384 kB, estimate=27000 kB; lsn=0/14000060, redo lsn=0/14000028
2024-10-18 07:55:21.205 EDT WARNING: aborting backup due to backend exiting before pg_backup_stop was called
2024-10-18 07:55:21.205 EDT ERROR: invalid segment number 0 in file "pg_tde.map"
2024-10-18 07:55:21.205 EDT STATEMENT: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, WAIT 0, MANIFEST 'yes', TARGET 'client')
2024-10-18 07:55:21.221 EDT LOG: unexpected EOF on standby connection
2024-10-18 07:55:21.221 EDT STATEMENT: START_REPLICATION SLOT "pg_basebackup_1391250" 0/14000000 TIMELINE 1
After loading the pg_tde shared library at the server level and configuring the master key for a specific database, I can see the files below in the base directory, which seem to be causing the issue, but I am unable to understand why.
./16505/pg_tde.map
./16505/pg_tde.dat
Below are the databases with their OIDs:
test=# select oid,datname from pg_database;
oid | datname
-------+-----------
5 | postgres
1 | template1
4 | template0
16505 | test
(4 rows)
Thanks for your help and any suggestions.
Adam Mulla
(143 rep)
Oct 18, 2024, 12:06 PM
2
votes
1
answers
1931
views
xb_stream_read_chunk(): wrong chunk magic at offset 0x0
I have taken a backup of a MySQL instance like this:
innobackupex \
--user=$MYUSER \
--password=$MYPASS \
--no-timestamp \
--parallel=$DUMP_THREADS \
--stream=xbstream \
--slave-info \
--extra-lsndir=$LSN_DIR \
--tmpdir=$TMP_DIR \
--no-lock \
--safe-slave-backup
I am trying to restore this backup like this.
zcat backup_file.xbs.gz 2>/dev/null | xbstream -x -C /var/lib/mysql/
I am getting the following error:
xb_stream_read_chunk(): wrong chunk magic at offset 0x0.
I have tried all the Google suggestions and upgraded xtrabackup to the latest version.
Nothing seems to be working.
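One rough check worth doing (not from any official guide): confirm the file really contains a gzip-wrapped xbstream, since a valid stream should start with the chunk magic XBSTCK01 at offset 0:
```
file backup_file.xbs.gz                      # is it actually gzip data?
zcat backup_file.xbs.gz | head -c 8 | xxd    # first 8 bytes of the decompressed stream
```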
vkrishna
(121 rep)
Nov 18, 2017, 09:07 AM
• Last activity: Aug 9, 2024, 10:09 AM
4
votes
3
answers
804
views
failed pt-online-schema-change left behind triggers. How to delete?
We had a failed pt-osc run - the server ran out of disk space - and now the triggers are left behind:
SHOW TRIGGERS;
pt_osc_xxx_production_orders_ins
pt_osc_xxx_production_orders_upd
pt_osc_xxx_production_orders_del
How can these be deleted?
Doing a DROP TRIGGER seems to lock the table (it has 150 million rows) and looks like it will take some hours.
We cannot use pt-osc on this table anymore because it fails with a "Trigger already exists" error.
We are running mysql Ver 8.0.32-24 for Linux on x86_64 (Percona Server (GPL), Release 24, Revision e5c6e9d2) under CentOS 8 (if that makes any difference).
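One hedged approach: drop the triggers with a short metadata-lock timeout, so each DROP gives up quickly instead of queueing behind long-running queries, and retry at a quieter moment if it times out (the schema name below is a placeholder for the real one):
```
mysql mydb -e "
  SET SESSION lock_wait_timeout = 5;
  DROP TRIGGER IF EXISTS pt_osc_xxx_production_orders_ins;
  DROP TRIGGER IF EXISTS pt_osc_xxx_production_orders_upd;
  DROP TRIGGER IF EXISTS pt_osc_xxx_production_orders_del;"
```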
phil
(153 rep)
Jun 22, 2023, 01:07 PM
• Last activity: Apr 12, 2024, 04:51 AM
10
votes
1
answers
3031
views
How to add index to a big table with 60M records without downtime?
We have been struggling with one issue for the past few days. We want to add an index to a huge table with 60M records. At first we tried adding it with basic MySQL syntax, but it clogged our production DB. That table is used very frequently in production queries, so everything suffered.
Our DB is hosted on AWS RDS. It's MySQL 5.7. We are using Laravel as our PHP framework.
The next thing we read about was that we could copy the current table into a new one, then add the index to the new table, then shift the Laravel model to use the new table. We thought it made sense and would be easy enough.
But copying the table data from one table to the new one was taking quite a lot of time. Our calculations showed it would take days. We tried using Laravel as well as SQL commands, but it was too slow either way.
Then we tried exporting the data as CSV and importing it, but again, too slow. The first few million records would insert fast, but then the table would become extremely slow at inserting.
Finally we tried mysqldump, and we realised it also locks the new table while inserting, so maybe that's why it's fast enough. It took around 6 hours to copy the table into the new one. BUT we were missing 2M records with this method. We also checked how many records came into the existing table while exporting/importing; it was only around 100K. So the export/import was missing 1.9M records, and we couldn't figure out why.
After going through all these different ways, we have decided to put the app in downtime and add the index on the huge table.
I wanted to know, do others face this issue as well? Is there a way to add indexes on a huge table without causing downtime in production? Or is there a faster way to copy a big MySQL table without loss of data?
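For reference, a sketch of the native online DDL that MySQL 5.7 InnoDB is supposed to support for secondary indexes (table/column/index names are placeholders; whether it stays non-blocking under this production load is exactly the open question):
```
mysql mydb -e "ALTER TABLE big_table
  ADD INDEX idx_new_col (new_col),
  ALGORITHM=INPLACE, LOCK=NONE;"
```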
Rohan
(53 rep)
Dec 15, 2023, 12:19 PM
• Last activity: Dec 20, 2023, 10:24 AM
0
votes
1
answers
624
views
pt-online-schema-change drop_swap doesn't work, so what to do?
After waiting 24 hours for a pt-osc run, we got this:
2023-05-29T11:29:40 Copied rows OK.
2023-05-29T11:29:40 Max rows for the rebuild_constraints method: 2710
Determining the method to update foreign keys...
2023-05-29T11:29:40 `xxx_production`.`click_tracks`: too many rows: 4325947; must use drop_swap
--alter-foreign-keys-method=drop_swap doesn't work with MySQL 8.0+
See https://bugs.mysql.com/bug.php?id=89441
2023-05-29T11:29:40 Dropping triggers...
2023-05-29T11:29:42 Dropped triggers OK.
Not dropping the new table `xxx_production`.`_orders_new` because --swap-tables failed. To drop the new table, execute:
DROP TABLE IF EXISTS `xxx_production`.`_orders_new`;
`xxx_production`.`orders` was not altered.
orders is a table with 136 million rows, but I think the issue is the click_tracks table that has 4.3 million rows. If drop_swap must be used, but the next line says drop_swap doesn't work on MySQL 8.0+... what are we supposed to do exactly?
EDIT:
*Before*
mysql> describe orders;
+----------------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------------------+--------------+------+-----+---------+----------------+
...
| item_promotion_id | int | YES | | NULL | |
*After*
| item_promotion_id | varchar(25) | YES | | NULL | |
*Command*
pt-online-schema-change --critical-load='Threads_running=600' --alter-foreign-keys-method=auto --execute --alter "MODIFY COLUMN item_promotion_id varchar(25)" D=xxx_production,t=orders
Column (item_promotion_id) is NOT in a FK or used in an INDEX.
Could the issue actually be a FK between orders and click_tracks?
EDIT 2:
Sadly, 'just' running ADD COLUMN c VARCHAR, ALGORITHM=INPLACE; fails on tables this large because we hit ERROR 1062 (23000): Duplicate entry.
This is described in the MySQL documentation.
So it seems we are back to pt-osc, but instead of doing an ALTER, doing an ADD.
EDIT 3:
Trying to do an ADD COLUMN using pt-osc results in the same failure!
pt-online-schema-change --critical-load='Threads_running=600' --alter-foreign-keys-method=auto --execute --alter "ADD COLUMN item_promotion_ref varchar(25)" D=xxx_production,t=orders
2023-06-01T06:11:27 Max rows for the rebuild_constraints method: 3090
Determining the method to update foreign keys...
2023-06-01T06:11:27 `xxx_production`.`click_tracks`: too many rows: 4325947; must use drop_swap
--alter-foreign-keys-method=drop_swap doesn't work with MySQL 8.0+
See https://bugs.mysql.com/bug.php?id=89441
2023-06-01T06:11:27 Dropping triggers...
2023-06-01T06:11:28 Dropped triggers OK.
Not dropping the new table `xxx_production`.`_orders_new` because --swap-tables failed. To drop the new table, execute:
DROP TABLE IF EXISTS `xxx_production`.`_orders_new`;
`xxx_production`.`orders` was not altered.
I think there is nothing I can do with this table.
Time to hire a DBA.
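One hedged alternative for the ADD COLUMN case, since the server is MySQL 8.0: an instant column add (8.0.12+) changes only metadata, so it skips the table copy and the pt-osc foreign-key handling entirely (untested here; schema/table names as in the question):
```
mysql xxx_production -e "ALTER TABLE orders
  ADD COLUMN item_promotion_ref VARCHAR(25),
  ALGORITHM=INSTANT;"
```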
phil
(153 rep)
May 29, 2023, 12:48 PM
• Last activity: Jun 1, 2023, 06:53 AM
1
votes
1
answers
998
views
Run pt-online-schema-change with multiple ALTER queries synchronously
I want to run 3 ALTER queries with the pt-online-schema-change tool:
--alter "ADD INDEX userid_sid_ts_fid (user_id, scorecard_id, timestamp, factor_id), DROP INDEX uidts, RENAME INDEX userid_sid_ts_fid to uidts"
However I face this error:
Error altering new table `*****`.`_scoring_basis_new`: DBD::mysql::db do failed: Key 'userid_sid_ts_fid' doesn't exist in table '_scoring_basis_new'
So it looks like it is trying to run these 3 queries asynchronously rather than one by one. How can I prevent that?
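One hedged workaround: run the changes as separate, sequential pt-osc invocations instead of one combined --alter (the database name below is a placeholder for the masked one):
```
pt-online-schema-change D=mydb,t=scoring_basis --execute \
  --alter "ADD INDEX userid_sid_ts_fid (user_id, scorecard_id, timestamp, factor_id)"
pt-online-schema-change D=mydb,t=scoring_basis --execute \
  --alter "DROP INDEX uidts"
pt-online-schema-change D=mydb,t=scoring_basis --execute \
  --alter "RENAME INDEX userid_sid_ts_fid TO uidts"
```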
Diego
(113 rep)
Dec 15, 2022, 12:30 PM
• Last activity: Dec 15, 2022, 04:11 PM
0
votes
1
answers
73
views
Can one use a view as a source on pt-table-sync?
Any attempt to fill a table from a view results in an error, as the source-table does not exist:
Error getting table structure for TABLE_NAME on SOURCE_DSN_DATA doesnt handle CREATE TABLE without quoting.
at /usr/local/bin/pt-table-sync line 2872. Ensure that the table exists and is accessible.
while doing TABLE_NAME on DESTINATION_NAME
I know that the source table doesn't exist, as I am trying to use a view as the source instead. Access is not a problem either, as tests with a source table instead of a source view confirmed.
In the DSN, I use d= and t= to define the database and table/view.
Replacing the view with a table and running pt-table-sync again works as expected, so there seems to be no issue in the command/parameters themselves:
pt-table-sync --execute DSN_SOURCE DSN_TARGET --verbose --print
The Percona documentation does not indicate any additional parameters for what I want to do.
Is there any way to use a view as a source on pt-table-sync?
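One hedged workaround sketch (all names are placeholders): materialize the view into a real table on the source, ideally with a primary key, and point pt-table-sync at that table instead:
```
mysql -h source_host source_db -e "CREATE TABLE view_snapshot AS SELECT * FROM my_view;"
pt-table-sync --execute --verbose --print \
  h=source_host,D=source_db,t=view_snapshot \
  h=target_host,D=target_db,t=target_table
```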
A-Tech
(217 rep)
Dec 12, 2022, 01:58 PM
• Last activity: Dec 12, 2022, 07:49 PM
1
votes
1
answers
585
views
how to setup percona pmm monitoring of percona 5.7 in docker container
I want to monitor percona running in docker with the percona monitoring and management client.
I have Percona Monitoring and Management installed, running and monitoring other Percona instances (https://www.percona.com/doc/percona-monitoring-and-management/deploy/index.html). I recently added Percona (currently version 5.7) running in a Docker container (pulled from https://github.com/docker-library/percona).
I found an open feature request to add the pmm-admin executable from inside PMM Server at https://jira.percona.com/browse/PMM-627
Is there a way to set up monitoring of my Percona instance running in Docker?
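A hedged sketch of the usual PMM 1.x client flow on the Docker host, pointed at the container's published MySQL port (server address, port and credentials are placeholders):
```
pmm-admin config --server pmm-server.example.com
pmm-admin add mysql --user pmm --password '***' --host 127.0.0.1 --port 3307
```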
washingon
(111 rep)
Nov 24, 2017, 12:55 PM
• Last activity: Oct 3, 2022, 12:04 PM
0
votes
1
answers
255
views
pt-online-schema-change's triggers fail with "DELETE command denied" on insert
We recently tried pt-online-schema-change to add a column to a table. It worked mostly as expected, but one thing puzzles me: when the account we use to do the migration doesn't have the DELETE permission, our application (which keeps running simultaneously) gets errors that say "DELETE command denied to user 'pt'@'localhost' for table '_xxx_new'" when the app performs an **insert** on the *xxx* table.
My understanding is that the triggers are supposed to do **inserts** into the destination *_xxx_new* table when a new record is inserted in the source *xxx* table. How come it can fail for not having a DELETE permission??
Percona docs are pretty generic on this and no googling helped, so will be thankful for any ideas!
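For reference, the generated trigger bodies can be inspected directly, which shows exactly which statements they run against the new table (schema/table names are placeholders):
```
mysql -e "SELECT TRIGGER_NAME, ACTION_STATEMENT
          FROM information_schema.TRIGGERS
          WHERE EVENT_OBJECT_SCHEMA = 'mydb' AND EVENT_OBJECT_TABLE = 'xxx'\G"
```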
Dmitry Novoselov
(11 rep)
Mar 30, 2017, 03:27 PM
• Last activity: Apr 4, 2022, 08:05 PM
3
votes
2
answers
1434
views
Prometheus High Memory and CPU Usage in PMM
We are running PMM v1.17.0 and Prometheus is causing huge CPU and memory usage (200% CPU and 100% RAM), and PMM went down because of this. We are running PMM on a VM with 2 vCPUs and 7.5 GB RAM, and are monitoring about 25 servers. PMM is running with the command below:
docker run -d -it --volumes-from pmm-data --name pmm-server -e QUERIES_RETENTION=1095 -p 80:80 -e METRICS_RESOLUTION=3s --restart always percona/pmm-server:1
The prometheus.log is filled with entries like the ones below:
level=warn ts=2020-01-30T10:27:12.8156514Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
level=warn ts=2020-01-30T10:27:26.464361371Z caller=scrape.go:945 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.223:42002/metrics-mr msg="Error on ingesting samples with different value but same timestamp" num_dropped=1
level=warn ts=2020-01-30T10:27:27.81316996Z caller=scrape.go:942 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="Error on ingesting out-of-order samples" num_dropped=2
level=warn ts=2020-01-30T10:27:27.813257165Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
level=warn ts=2020-01-30T10:27:41.462420708Z caller=scrape.go:945 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.223:42002/metrics-mr msg="Error on ingesting samples with different value but same timestamp" num_dropped=1
level=warn ts=2020-01-30T10:27:42.813356387Z caller=scrape.go:942 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="Error on ingesting out-of-order samples" num_dropped=2
level=warn ts=2020-01-30T10:27:42.813441108Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
level=warn ts=2020-01-30T10:27:56.463798729Z caller=scrape.go:945 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.223:42002/metrics-mr msg="Error on ingesting samples with different value but same timestamp" num_dropped=1
level=warn ts=2020-01-30T10:27:57.82083775Z caller=scrape.go:942 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="Error on ingesting out-of-order samples" num_dropped=2
level=warn ts=2020-01-30T10:27:57.820912309Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
Can someone please let me know why Prometheus is causing this issue? Are there any parameters we need to add or change?
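The only knob already visible in that run command is the metrics resolution; a coarser value should reduce the Prometheus scrape load (a sketch, otherwise identical to the command above):
```
docker run -d -it --volumes-from pmm-data --name pmm-server \
  -e QUERIES_RETENTION=1095 -e METRICS_RESOLUTION=5s \
  -p 80:80 --restart always percona/pmm-server:1
```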
user5594148
(43 rep)
Jan 30, 2020, 10:56 AM
• Last activity: Nov 14, 2021, 01:01 AM
0
votes
1
answers
420
views
Drop column with percona does not shrink table size
I have a table in Aurora MySQL 5.7. The table has a few partitions, 800M rows, and weighs 2 TB.
Recently I dropped a few columns using Percona's pt-online-schema-change. Surprisingly, the table size did not change (looking in information_schema.tables).
The way Percona does a change is by using a new table (_table_data_new) with triggers on the original table: it creates an empty new table with the same DDL, executes the changes we wish, and copies everything over, with the triggers keeping it up to date. Once the data is synced, Percona renames the tables and drops the old one. So the table is rebuilt from scratch (without locking).
However, after running ALTER TABLE ... OPTIMIZE PARTITION I saw the size shrink to 250 GB. Does anyone have an explanation, or know what I did wrong?
pt command:
pt-online-schema-change --user $MYSQL_DBA_USER --password $MYSQL_DBA_PASS --host $MYSQL_WRITER D=db,t=table_data --alter "drop column a1, drop column a2" --execute --max-load Threads_running=18446744073709551606 --critical-load Threads_running=18446744073709551606 --recursion-method=none
optimize command:
MySQL [(db)]> select table_rows,data_length/power(1024,3), index_length/power(1024,3),DATA_FREE/power(1024,3),AVG_ROW_LENGTH from information_schema.tables where table_name='table_data';
+------------+---------------------------+----------------------------+-------------------------+----------------+
| table_rows | data_length/power(1024,3) | index_length/power(1024,3) | DATA_FREE/power(1024,3) | AVG_ROW_LENGTH |
+------------+---------------------------+----------------------------+-------------------------+----------------+
| 610884663 | 1847.7273712158203 | 202.40484619140625 | 0.0322265625 | 3247 |
+------------+---------------------------+----------------------------+-------------------------+----------------+
1 row in set (0.00 sec)
MySQL [db]> ALTER TABLE table_data OPTIMIZE PARTITION p20210601;
+---------------+----------+----------+---------------------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+---------------+----------+----------+---------------------------------------------------------------------------------------------+
| db.table_data | optimize | note | Table does not support optimize on partitions. All partitions will be rebuilt and analyzed. |
| db.table_data | optimize | status | OK |
+------------------------+----------+----------+---------------------------------------------------------------------------------------------+
2 rows in set (5 hours 39 min 40.95 sec)
MySQL [db]>
MySQL [db]> select table_rows,data_length/power(1024,3), index_length/power(1024,3),DATA_FREE/power(1024,3),AVG_ROW_LENGTH from information_schema.tables where table_name='table_data';
+------------+---------------------------+----------------------------+-------------------------+----------------+
| table_rows | data_length/power(1024,3) | index_length/power(1024,3) | DATA_FREE/power(1024,3) | AVG_ROW_LENGTH |
+------------+---------------------------+----------------------------+-------------------------+----------------+
| 736965899 | 104.25639343261719 | 155.98052978515625 | 0.0244140625 | 151 |
+------------+---------------------------+----------------------------+-------------------------+----------------+
Nir
(529 rep)
Nov 1, 2021, 01:14 PM
• Last activity: Nov 2, 2021, 12:57 PM
-1
votes
1
answers
994
views
InnoDB: Assertion failure on executing select Query - MySQL 5.7.31
I am using pt-archiver for daily archiving of tables, but while selecting data from one of the tables I am getting the following error, and it restarts the MySQL instance:
2021-07-07 13:21:17 0x7fe0dffdc700 InnoDB: Assertion failure in thread 140603807352576 in file btr0pcur.cc line 46
I ran pt-archiver with --dry-run and the following is my SELECT query:
SELECT /*!40001 SQL_NO_CACHE */
irig_time
,device_id
,message_id
,mode
,protection_1
,protection_2
,protection_3
,protection_4
,alarm_1
,alarm_2
,alarm_3
,alarm_4
,grid_switch_control
,dc_switch_1_on
,dc_switch_2_on
,additional_feedback_external_sensor
,module_communication_fault_position
FROM acbm_status_v2_0_0 FORCE INDEX(PRIMARY) WHERE (DATE(irig_time)=DATE_SUB(CURDATE(), INTERVAL 1 DAY)) ORDER BY irig_time, device_id LIMIT 200
If I run this query manually, I still get the assertion error and it restarts the MySQL instance.
Following is the table structure:
Table: acbm_status_v2_0_0
Columns:
irig_time datetime(6) PK
device_id int(11) PK
message_id bigint(20) UN
mode varchar(64)
protection_1 int(10) UN
protection_2 int(10) UN
protection_3 int(10) UN
protection_4 int(10) UN
alarm_1 int(10) UN
alarm_2 int(10) UN
alarm_3 int(10) UN
alarm_4 int(10) UN
grid_switch_control tinyint(1)
dc_switch_1_on tinyint(1)
dc_switch_2_on tinyint(1)
additional_feedback_external_sensor tinyint(1)
module_communication_fault_position int(10) UN
Below is the complete trace:
2021-07-07 13:21:17 0x7fe0dffdc700 InnoDB: Assertion failure in thread 140603807352576 in file btr0pcur.cc line 461
InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com .
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
13:21:17 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=18
max_threads=500
thread_count=18
connection_count=17
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 206883 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fe01c000d40
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fe0dffdbe60 thread_stack 0x40000
mysqld(my_print_stacktrace+0x2c)[0x556a3c9cab7c]
mysqld(handle_fatal_signal+0x501)[0x556a3c2e1f01]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7fe1fffaa730]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7fe1ffa857bb]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x121)[0x7fe1ffa70535]
mysqld(+0x6c1083)[0x556a3c2a9083]
mysqld(+0x6c30da)[0x556a3c2ab0da]
mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0xd03)[0x556a3cc699a3]
mysqld(_ZN11ha_innobase13general_fetchEPhjj+0xdf)[0x556a3cb6d4af]
mysqld(_ZThn760_N11ha_innopart18index_next_in_partEjPh+0x2d)[0x556a3cb8351d]
mysqld(_ZN16Partition_helper19handle_ordered_nextEPhb+0x299)[0x556a3c714199]
mysqld(_ZN7handler13ha_index_nextEPh+0x1c5)[0x556a3c3358d5]
mysqld(+0xb932dc)[0x556a3c77b2dc]
mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x18f)[0x556a3c7817cf]
mysqld(_ZN4JOIN4execEv+0x20b)[0x556a3c77aacb]
mysqld(_Z12handle_queryP3THDP3LEXP12Query_resultyy+0x2e0)[0x556a3c7e2d50]
mysqld(+0xbbd45b)[0x556a3c7a545b]
mysqld(_Z21mysql_execute_commandP3THDb+0x4924)[0x556a3c7ac564]
mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3dd)[0x556a3c7ae94d]
mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x1062)[0x556a3c7afa22]
mysqld(_Z10do_commandP3THD+0x207)[0x556a3c7b0d67]
mysqld(handle_connection+0x298)[0x556a3c8690c8]
mysqld(pfs_spawn_thread+0x157)[0x556a3ce77cd7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7fe1fff9ffa3]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fe1ffb474cf]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fe01c004860): SELECT /*!40001 SQL_NO_CACHE */ irig_time
,device_id
,message_id
,mode
,protection_1
,protection_2
,protection_3
,protection_4
,alarm_1
,alarm_2
,alarm_3
,alarm_4
,grid_switch_control
,dc_switch_1_on
,dc_switch_2_on
,additional_feedback_external_sensor
,module_communication_fault_position
FROM ycube2.acbm_status_v2_0_0 FORCE INDEX(PRIMARY) WHERE (DATE(irig_time)=DATE_SUB(CURDATE(), INTERVAL 1 DAY)) ORDER BY irig_time, device_id LIMIT 1000
Connection ID (thread ID): 41
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
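A hedged first diagnostic, before resorting to the innodb_force_recovery steps the message points at (schema/table names as in the trace):
```
mysql ycube2 -e "CHECK TABLE acbm_status_v2_0_0;"
```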
ImranRazaKhan
(149 rep)
Jul 9, 2021, 11:39 AM
• Last activity: Jul 9, 2021, 02:57 PM
Showing page 1 of 20 total questions