Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
1
answers
237
views
Help preparing for MySQL Master Slave Replication
I am not in the best situation. I inherited an Ubuntu 14.04 8 GB RAM, 8 CPU MySQL 5.5 database server with almost 400 GB of business-critical data (stored on external SSD) contained within several thousand different databases. My database administration skills and experience are nascent. I want to create a backup of this data to set up MySQL Replication, but I need to create the backup with minimal impact and downtime.
These databases are individually backed up with mysqldump about every four hours. This unfortunately means that I have no single, point-in-time, logical or raw backup of the entire database server and to top it off, binary logging is not enabled on that server. But I do have the capability to individually restore these backups.
In total, there are about 250,000 tables in the database server. Of those tables, about 90,000 use the MyISAM engine and about 160,000 use the InnoDB engine.
I know there will be some downtime, but I would really like to avoid downtime of an unknown duration during which I am obliged to fully back up the data and deploy replication at the same time.
In testing, I've given thought to or tried various approaches:
- using Percona Xtrabackup
- using mysqldump with a single transaction (for innodb) and no locks for the myisam tables
- rsync'ing the mysql data directory, then gracefully shutting down the MySQL server, and rsync'ing the flushed out changes
- converting the myisam tables to innodb, then doing a mysqldump or using xtrabackup
- using my existing backups to start replication, then letting the slave catch up
- restoring my existing backups, then syncing the changes with pt-table-checksum and pt-table-sync
- and the list can go on...
Without me providing excessive detail about my testing methods and results, I would like to know how you would approach this situation.
EDIT: In essence, my question is: With the goal of minimal downtime and given my scenario, how would you create a backup of the database server in anticipation of setting up MySQL Replication?
I would appreciate any advice, opinions, services, or resources you may have. Thank you.
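For illustration, a rough sketch of the XtraBackup-based route (credentials, paths, and the timestamp directory are placeholders, and it assumes binary logging has first been enabled on the source, which requires a restart):
```
# Take and prepare a full backup on the source (placeholder credentials/paths):
innobackupex --user=backup_user --password='***' /backups/
innobackupex --apply-log /backups/<timestamp-dir>/

# Copy the prepared directory to the replica host, fix ownership, start mysqld there,
# then read the binlog coordinates recorded at backup time for CHANGE MASTER TO:
cat /backups/<timestamp-dir>/xtrabackup_binlog_info
```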
Hman
(33 rep)
Sep 18, 2017, 07:56 PM
• Last activity: May 28, 2025, 11:06 PM
0
votes
1
answers
831
views
Stuck pt-archiver query during purge
I am purging a 1.2 TB table using pt-archiver. In SHOW PROCESSLIST I can see the query stuck for 650 seconds in the Sending data state. The query is quick when I execute it independently. Any help will be appreciated.
[shell]# pt-archiver --source h=localhost,D=dsm,t=subscriber_event -u XX -pXXX --where="created <= DATE_SUB(now(), interval 1 year)" ...

explain SELECT /*!40001 SQL_NO_CACHE */ id FROM dsm.subscriber_event FORCE INDEX(PRIMARY) WHERE (created <= DATE_SUB(now(), interval 1 year)) AND (id < '3873802696') ORDER BY id LIMIT 10000;

SELECT /*!40001 SQL_NO_CACHE */ id FROM dsm.subscriber_event FORCE INDEX(PRIMARY) WHERE (created <= DATE_SUB(now(), interval 1 year)) AND (id < '3873802696') ORDER BY id LIMIT 10000;
| 3205169561 |
| 3205169562 |
| 3205169563 |
| 3205169564 |
+------------+
10000 rows in set (0.10 sec)
Please help me find out the exact cause.
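For comparison, a dry run reproduces the statements pt-archiver generates without touching any data; a rough sketch (credentials are the masked placeholders from above, and --purge is assumed here in place of whatever destination options the real job uses):
```
pt-archiver --source h=localhost,D=dsm,t=subscriber_event -u XX -pXXX \
  --where "created <= DATE_SUB(NOW(), INTERVAL 1 YEAR)" \
  --purge --limit 10000 --dry-run
```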
dragon
(95 rep)
Jan 16, 2021, 01:35 AM
• Last activity: Apr 26, 2025, 05:06 AM
2
votes
1
answers
506
views
Is it safe to use pt-online-schema-change in a multimaster environment?
I have 2 MySQL servers with row-based replication between them. Both of them are masters and slaves for each other (active-active master-master setup).
If I understand it correctly, pt-osc creates triggers to catch any changes made while it is running. But from what I know, triggers are not fired on the replica in a row-based replication environment. So I guess pt-osc is not able to catch changes made on the second master during the change, is it?
EDIT: While doing some tests, I saw that pt-osc created its triggers on both masters, which would cover changes from both sides. Still, I'm quite unsure whether I can safely do online changes in this environment.
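For reference, a quick way to see what ended up on the second master while pt-osc runs on the first one (db/table names are placeholders):
```
mysql -h master2 -u root -p -e "SHOW TRIGGERS FROM mydb LIKE 'mytable'\G"
```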
Mrskman
(121 rep)
Jan 15, 2019, 02:10 PM
• Last activity: Apr 21, 2025, 05:04 PM
1
votes
2
answers
579
views
Copying MySQL table to another table with no missing new changes
I would like to recreate/copy a MySQL table (an actual new .ibd file), and most suggestions recommend the following:
CREATE TABLE mytbcopy LIKE mytb;
INSERT INTO mytbcopy SELECT * FROM mytb;
The suggestion works well, but from my understanding (correct me if I'm wrong), it does not carry over inserts/updates made to the original table during the period of copying/recreating.
Example: during the copy, if record #45 has already been inserted into the new table and record #45 is then updated on the original table, that update won't be replicated over.
Is there another way to ensure that after the full copy, the data in the new table is in the most up-to-date state? I'm not sure using TRIGGERS is the solution for this.
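One hedged option (not necessarily the answer): a no-op alter with pt-online-schema-change rebuilds the table into a new .ibd file while its triggers capture concurrent inserts, updates and deletes; database/table names below are placeholders:
```
pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=mytb --execute
```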
SleepingHide123
(21 rep)
Dec 13, 2019, 02:37 AM
• Last activity: Apr 21, 2025, 01:03 PM
0
votes
0
answers
35
views
When does xtrabackup take lock exactly and why there's slave lag after backup starts?
MySQL Version -- MySQL Community Server 5.7.44
Percona xtrabackup Version 2.4.29 (x86_64)
I want to understand when exactly xtrabackup takes the FLUSH TABLES WITH READ LOCK (FTWRL).
Following is the simplified command I use for full backup.
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=bkp_user --tmpdir=/pfg_prod/tmp --parallel=4 --use-memory=1024M 2> bkp.log --stream=xbstream | /usr/bin/lz4 stdin /bkp_dir/date.lz4
I use this same command on all my slave servers for daily backups.
When I check the xtrabackup logs, I notice that it does not take the FTWRL immediately, yet the slave lag begins to grow as soon as the backup starts.
Log for the 9th
250309 11:13:18 Executing FLUSH TABLES WITH READ LOCK...
250309 11:13:32 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
250309 11:13:32 Executing UNLOCK TABLES
Log for the 10th
250310 11:13:55 Executing FLUSH TABLES WITH READ LOCK...
250310 11:14:05 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
250310 11:14:05 Executing UNLOCK TABLES
Log for the 15th
250315 11:21:17 Executing FLUSH TABLES WITH READ LOCK...
250315 11:21:18 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
250315 11:21:18 Executing UNLOCK TABLES
***The backup starts at 08:00 AM every morning.***
Thus, if it starts the backup at 08:00 AM and executes the FTWRL only at around 11:00 AM, I want to understand why there's slave lag immediately after the backup starts and why the lag continues till the backup is over.
***When exactly does xtrabackup take the FLUSH TABLES WITH READ LOCK, if not at the start of the backup? And why is there slave lag as soon as the backup starts? If the FTWRL is taken only after approximately 3 hours, I think there shouldn't be lag before that, though I am not sure. The slave lag is only during the backup window.***
I had been assuming so far that the FTWRL is taken when the backup starts, hence the doubt.
root@localhost 09:45:21 [(none)] show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: master_ip
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000982
Read_Master_Log_Pos: 30958
Relay_Log_File: relay.000430
Relay_Log_Pos: 1058083259
Relay_Master_Log_File: binlog.000981
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 10580
Relay_Log_Space: 1383
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 5872
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 103454371
Master_UUID: UUID
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: System lock
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: GTID-SET
Executed_Gtid_Set: GTID-SETS
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.25 sec)
+-------+-------------------+---------------------+-------------------+-------------+--------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------------+---------------------+-------------------+-------------+--------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| 13 | system user | | NULL | Connect | 255661 | Waiting for master to send event | NULL |
| 14 | system user | | NULL | Connect | 0 | Waiting for dependent transaction to commit | NULL |
| 15 | system user | | NULL | Connect | 5931 | System lock | NULL |
| 16 | system user | | NULL | Connect | 5931 | Waiting for an event from Coordinator | NULL |
| 12664 | bkp_user | localhost | NULL | Sleep | 6377 | | NULL |
| 12989 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 12990 | root | 127.0.0.1:43240 | NULL | Query | 0 | Opening tables | SELECT table_schema, table_name, column_name, auto_increment, pow(2, case data_type when ' |
+-------+-------------------+---------------------+-------------------+-------------+--------+---------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
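A rough way to correlate the lag with the backup phases is to sample the SQL-thread state during the backup window (login path as in the backup command; a simple illustrative loop, not a polished script):
```
while true; do
  date
  mysql --login-path=bkp_user -e "SHOW SLAVE STATUS\G" \
    | grep -E 'Seconds_Behind_Master|Slave_SQL_Running_State'
  sleep 60
done
```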
Avinash Pawar
(216 rep)
Mar 16, 2025, 09:37 AM
• Last activity: Mar 17, 2025, 02:24 AM
0
votes
0
answers
50
views
xtrabackup fails with error xb_stream_write_data() failed
I am facing a weird issue while using xtrabackup with xbstream. The weird part is that it was working fine just a few days ago, and apart from a reboot there were no changes on the machine. I have at least 100 machines with this exact same script, same OS, same packages and same xtrabackup binary. Everything is the same on all the machines. It works on all of them except this one.
OS Details
uname -a
5.10.227-219.884.amzn2.x86_64 #1 SMP Tue Oct 22 16:38:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
MySQL Version and xtrabackup version
/usr/bin/xtrabackup --version
/usr/bin/xtrabackup version 2.4.29 based on MySQL server 5.7.44 Linux (x86_64) (revision id: 2e6c0951)
mysql-community-server-5.7.44-1.el7.x86_64
xtrabackup command which causes error
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB --stream=xbstream | /usr/bin/lz4 - /S3-Mountpoint/backup/app_`date +%F`.lz4
The error I receive is
xtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
xb_stream_write_data() failed.
xtrabackup: Error: write to logfile failed
xtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
xtrabackup: Error: xtrabackup_copy_logfile() failed.
So while checking what could be wrong, I tried a local machine path, and it fails with the same error:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB --stream=xbstream | /usr/bin/lz4 - /local_machine_path/app_`date +%F`.lz4
I tried the same with tar and it does work:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --use-memory=1024MB --stream=tar | /usr/bin/gzip > /S3-Mountpoint/backup/app_`date +%F`.tar.gz
Again I tried it without xbstream, writing to a local directory, and it works:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB --target-dir=/local_machine_path/app_`date +%F`
So my conclusion is that something is wrong with xbstream, though I am not sure. Could this be a bug? Or am I missing something? Please help out.
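A rough way to narrow it down further is to split the pipe, so xbstream and lz4 are tested separately (paths are placeholders):
```
# Stream to a plain local file first, with no downstream consumer in the pipe:
/usr/bin/xtrabackup --defaults-file=/etc/my.cnf --backup --slave-info \
  --login-path=backup_user --tmpdir=/data/tmp --parallel=4 --use-memory=1024MB \
  --stream=xbstream > /local_machine_path/test.xbstream
# If that succeeds, test the compressor on its own:
/usr/bin/lz4 /local_machine_path/test.xbstream /local_machine_path/test.xbstream.lz4
```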
Avinash Pawar
(216 rep)
Jan 20, 2025, 11:28 AM
0
votes
1
answers
62
views
Percona xtrabackup and percona toolkit for aarch64 Graviton Amazon Linux 2 machines
In my organisation, it was decided to move all the MySQL infrastructure (Amazon EC2 machines) to Graviton (aarch64) servers from AMD (x86_64).
All are MySQL Community Servers.
I need percona-xtrabackup and percona-toolkit installed on the new aarch64 (Graviton) machines, but unfortunately Percona has not yet released aarch64 packages. We depend heavily on percona-toolkit and xtrabackup: lots of scripts run using the percona-toolkit tools, and backups run using xtrabackup.
I know these packages can be built from source also, but I have never done it before.
I was able to build percona xtrabackup using this link https://docs.percona.com/percona-xtrabackup/8.0/compile-xtrabackup.html and percona-toolkit (version 3.3.1) from this link https://github.com/percona/percona-toolkit/blob/3.x/INSTALL
Because I have never done this before and because Percona has not yet officially provided these packages, I am really not sure if building these packages from source will work on Graviton (aarch64) or not.
The Graviton machines run Amazon Linux 2. I am not even sure which Red Hat version this corresponds to.
cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/ "
SUPPORT_END="2025-06-30"
cat /proc/version
Linux version 5.10.219-208.866.amzn2.aarch64 (mockbuild@ip-10-0-61-241) (gcc10-gcc (GCC) 10.5.0 20230707 (Red Hat 10.5.0-1), GNU ld version 2.35.2-9.amzn2.0.1) #1 SMP Tue Jun 18 14:00:02 UTC 2024
rpm -E %{rhel}
7
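For what it's worth, the sanity checks planned after the from-source builds are roughly the following (the binary locations depend on the install prefix used, so the paths are assumptions):
```
uname -m                                          # expect aarch64
/usr/local/xtrabackup/bin/xtrabackup --version    # path depends on the cmake install prefix
/usr/local/bin/pt-online-schema-change --version
```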
Avinash Pawar
(216 rep)
Jul 20, 2024, 07:04 PM
• Last activity: Dec 21, 2024, 12:52 PM
0
votes
0
answers
29
views
Percona pt-query-digest vs. Performance Insights on AWS Aurora 3.x
Several years ago when I was running MySQL on-prem and then EC2, I found Percona pt-query-digest to be very useful to find poorly performing SQL in prod. It's been a while since I've used it. AWS Performance Insights has similar functionality. Other than the fact that it's free, are there any advantages to pt-query-digest over PI?
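For context, typical usage looked roughly like this (slow log path and report name are placeholders):
```
pt-query-digest /var/lib/mysql/slow.log > slow_report.txt
```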
Swechsler
(153 rep)
Oct 23, 2024, 07:57 PM
0
votes
0
answers
35
views
Percona pg_tde extension base backup failing
I am just asking here so that someone from Percona can check this if possible.
I have followed the documentation below to configure pg_tde on RHEL 8 with PostgreSQL 16:
https://percona.github.io/pg_tde/main/
After enabling pg_tde and configuring the key provider and master key, pg_basebackup fails with the error below:
[postgres@hostname postgres]$ pg_basebackup -D bkp_test -R -X stream -v
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/14000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_1391250"
WARNING: aborting backup due to backend exiting before pg_backup_stop was called
pg_basebackup: error: COPY stream ended before last file was finished
pg_basebackup: removing contents of data directory "bkp_test"
In the logs I can see the error below:
2024-10-18 07:55:21.132 EDT LOG: checkpoint starting: force wait
2024-10-18 07:55:21.136 EDT LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.004 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16384 kB, estimate=27000 kB; lsn=0/14000060, redo lsn=0/14000028
2024-10-18 07:55:21.205 EDT WARNING: aborting backup due to backend exiting before pg_backup_stop was called
2024-10-18 07:55:21.205 EDT ERROR: invalid segment number 0 in file "pg_tde.map"
2024-10-18 07:55:21.205 EDT STATEMENT: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, WAIT 0, MANIFEST 'yes', TARGET 'client')
2024-10-18 07:55:21.221 EDT LOG: unexpected EOF on standby connection
2024-10-18 07:55:21.221 EDT STATEMENT: START_REPLICATION SLOT "pg_basebackup_1391250" 0/14000000 TIMELINE 1
After loading the pg_tde shared library at the server level and configuring the master key for a specific database, I can see the files below in the base directory, which seem to be causing the issue, but I am unable to understand why.
./16505/pg_tde.map
./16505/pg_tde.dat
Below are the databases with their OIDs:
test=# select oid,datname from pg_database;
oid | datname
-------+-----------
5 | postgres
1 | template1
4 | template0
16505 | test
(4 rows)
Thanks for your help and any suggestions.
Adam Mulla
(143 rep)
Oct 18, 2024, 12:06 PM
2
votes
1
answers
1931
views
xb_stream_read_chunk(): wrong chunk magic at offset 0x0
I have taken a backup of a MySQL instance like this:
innobackupex \
--user=$MYUSER \
--password=$MYPASS \
--no-timestamp \
--parallel=$DUMP_THREADS \
--stream=xbstream \
--slave-info \
--extra-lsndir=$LSN_DIR \
--tmpdir=$TMP_DIR \
--no-lock \
--safe-slave-backup
I am trying to restore this backup like this.
zcat backup_file.xbs.gz 2>/dev/null | xbstream -x -C /var/lib/mysql/
I am getting the following error:
xb_stream_read_chunk(): wrong chunk magic at offset 0x0.
I have tried all the Google suggestions and upgraded xtrabackup to the latest version.
Nothing seems to be working.
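One rough check worth doing (not from any official guide): confirm the file really contains a gzip-wrapped xbstream, since a valid stream should start with the chunk magic XBSTCK01 at offset 0:
```
file backup_file.xbs.gz                      # is it actually gzip data?
zcat backup_file.xbs.gz | head -c 8 | xxd    # first 8 bytes of the decompressed stream
```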
vkrishna
(121 rep)
Nov 18, 2017, 09:07 AM
• Last activity: Aug 9, 2024, 10:09 AM
4
votes
3
answers
804
views
failed pt-online-schema-change left behind triggers. How to delete?
We had a failed pt-osc run - the server ran out of disk space - and now the triggers are left behind:
SHOW TRIGGERS;
pt_osc_xxx_production_orders_ins
pt_osc_xxx_production_orders_upd
pt_osc_xxx_production_orders_del
How can these be deleted?
Doing a DROP TRIGGER seems to lock the table (it has 150 million rows) and looks like it will take some hours.
We cannot use pt-osc on this table anymore because it fails with a "Trigger already exists" error.
We are running mysql Ver 8.0.32-24 for Linux on x86_64 (Percona Server (GPL), Release 24, Revision e5c6e9d2) under CentOS 8 (if that makes any difference).
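One hedged approach: drop the triggers with a short metadata-lock timeout, so each DROP gives up quickly instead of queueing behind long-running queries, and retry at a quieter moment if it times out (the schema name below is a placeholder for the real one):
```
mysql mydb -e "
  SET SESSION lock_wait_timeout = 5;
  DROP TRIGGER IF EXISTS pt_osc_xxx_production_orders_ins;
  DROP TRIGGER IF EXISTS pt_osc_xxx_production_orders_upd;
  DROP TRIGGER IF EXISTS pt_osc_xxx_production_orders_del;"
```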
phil
(153 rep)
Jun 22, 2023, 01:07 PM
• Last activity: Apr 12, 2024, 04:51 AM
10
votes
1
answers
3031
views
How to add index to a big table with 60M records without downtime?
We have been struggling with one issue for the past few days. We want to add an index to a huge table with 60M records. At first we tried adding it with basic MySQL syntax, but it clogged our production DB. That table is used very frequently in production queries, so everything suffered.
Our DB is hosted on AWS RDS. It's MySQL 5.7. We are using Laravel as our PHP framework.
The next thing we read about was that we could copy the current table into a new one, then add the index to the new table, then shift the Laravel model to use the new table. We thought it made sense and would be easy enough.
But copying the table data from one table to the new one was taking quite a lot of time. Our calculations showed it would take days. We tried using Laravel as well as SQL commands, but it was too slow either way.
Then we tried exporting the data as CSV and importing it, but again, too slow. The first few million records would insert fast, but then the table would become extremely slow at inserting.
Finally we tried mysqldump, and we realised it also locks the new table while inserting, so maybe that's why it's fast enough. It took around 6 hours to copy the table into the new one. BUT we were missing 2M records with this method. We also checked how many records came into the existing table while exporting/importing; it was only around 100K. So the export/import was missing 1.9M records, and we couldn't figure out why.
After going through all these different ways, we have decided to put the app in downtime and add the index on the huge table.
I wanted to know, do others face this issue as well? Is there a way to add indexes on a huge table without causing downtime in production? Or is there a faster way to copy a big MySQL table without loss of data?
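For reference, a sketch of the native online DDL that MySQL 5.7 InnoDB is supposed to support for secondary indexes (table/column/index names are placeholders; whether it stays non-blocking under this production load is exactly the open question):
```
mysql mydb -e "ALTER TABLE big_table
  ADD INDEX idx_new_col (new_col),
  ALGORITHM=INPLACE, LOCK=NONE;"
```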
Rohan
(53 rep)
Dec 15, 2023, 12:19 PM
• Last activity: Dec 20, 2023, 10:24 AM
0
votes
1
answers
624
views
pt-online-schema-change drop_swap doesn't work, so what to do?
After waiting 24 hours for a pt-osc run, we got this:
2023-05-29T11:29:40 Copied rows OK.
2023-05-29T11:29:40 Max rows for the rebuild_constraints method: 2710
Determining the method to update foreign keys...
2023-05-29T11:29:40 `xxx_production`.`click_tracks`: too many rows: 4325947; must use drop_swap
--alter-foreign-keys-method=drop_swap doesn't work with MySQL 8.0+
See https://bugs.mysql.com/bug.php?id=89441
2023-05-29T11:29:40 Dropping triggers...
2023-05-29T11:29:42 Dropped triggers OK.
Not dropping the new table `xxx_production`.`_orders_new` because --swap-tables failed. To drop the new table, execute:
DROP TABLE IF EXISTS `xxx_production`.`_orders_new`;
`xxx_production`.`orders` was not altered.
orders is a table with 136 million rows, but I think the issue is the click_tracks table that has 4.3 million rows. If drop_swap must be used, but the next line says drop_swap doesn't work on MySQL 8.0+... what are we supposed to do exactly?
EDIT:
*Before*
mysql> describe orders;
+----------------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------------------+--------------+------+-----+---------+----------------+
...
| item_promotion_id | int | YES | | NULL | |
*After*
| item_promotion_id | varchar(25) | YES | | NULL | |
*Command*
pt-online-schema-change --critical-load='Threads_running=600' --alter-foreign-keys-method=auto --execute --alter "MODIFY COLUMN item_promotion_id varchar(25)" D=xxx_production,t=orders
Column (item_promotion_id) is NOT in a FK or used in an INDEX.
Could the issue actually be a FK between orders and click_tracks?
EDIT 2:
Sadly, 'just' running ADD COLUMN c VARCHAR, ALGORITHM=INPLACE; fails on tables this large because we hit ERROR 1062 (23000): Duplicate entry.
This is described in the MySQL documentation.
So it seems we are back to pt-osc, but instead of doing an ALTER, doing an ADD.
EDIT 3:
Trying to do an ADD COLUMN using pt-osc results in the same failure!
pt-online-schema-change --critical-load='Threads_running=600' --alter-foreign-keys-method=auto --execute --alter "ADD COLUMN item_promotion_ref varchar(25)" D=xxx_production,t=orders
2023-06-01T06:11:27 Max rows for the rebuild_constraints method: 3090
Determining the method to update foreign keys...
2023-06-01T06:11:27 `xxx_production`.`click_tracks`: too many rows: 4325947; must use drop_swap
--alter-foreign-keys-method=drop_swap doesn't work with MySQL 8.0+
See https://bugs.mysql.com/bug.php?id=89441
2023-06-01T06:11:27 Dropping triggers...
2023-06-01T06:11:28 Dropped triggers OK.
Not dropping the new table `xxx_production`.`_orders_new` because --swap-tables failed. To drop the new table, execute:
DROP TABLE IF EXISTS `xxx_production`.`_orders_new`;
`xxx_production`.`orders` was not altered.
I think there is nothing I can do with this table.
Time to hire a DBA.
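One hedged alternative for the ADD COLUMN case, since the server is MySQL 8.0: an instant column add (8.0.12+) changes only metadata, so it skips the table copy and the pt-osc foreign-key handling entirely (untested here; schema/table names as in the question):
```
mysql xxx_production -e "ALTER TABLE orders
  ADD COLUMN item_promotion_ref VARCHAR(25),
  ALGORITHM=INSTANT;"
```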
phil
(153 rep)
May 29, 2023, 12:48 PM
• Last activity: Jun 1, 2023, 06:53 AM
1
votes
1
answers
998
views
Run pt-online-schema-change with multiple ALTER queries synchronously
I want to run 3 ALTER queries with the pt-online-schema-change tool:
--alter "ADD INDEX userid_sid_ts_fid (user_id, scorecard_id, timestamp, factor_id), DROP INDEX uidts, RENAME INDEX userid_sid_ts_fid to uidts"
However I face this error:
Error altering new table `*****`.`_scoring_basis_new`: DBD::mysql::db do failed: Key 'userid_sid_ts_fid' doesn't exist in table '_scoring_basis_new'
So it looks like it is trying to run these 3 queries asynchronously rather than one by one. How can I prevent that?
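One hedged workaround: run the changes as separate, sequential pt-osc invocations instead of one combined --alter (the database name below is a placeholder for the masked one):
```
pt-online-schema-change D=mydb,t=scoring_basis --execute \
  --alter "ADD INDEX userid_sid_ts_fid (user_id, scorecard_id, timestamp, factor_id)"
pt-online-schema-change D=mydb,t=scoring_basis --execute \
  --alter "DROP INDEX uidts"
pt-online-schema-change D=mydb,t=scoring_basis --execute \
  --alter "RENAME INDEX userid_sid_ts_fid TO uidts"
```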
Diego
(113 rep)
Dec 15, 2022, 12:30 PM
• Last activity: Dec 15, 2022, 04:11 PM
0
votes
1
answers
73
views
Can one use a view as a source on pt-table-sync?
Any attempt to fill a table from a view results in an error, as the source-table does not exist:
Error getting table structure for TABLE_NAME on SOURCE_DSN_DATA doesnt handle CREATE TABLE without quoting.
at /usr/local/bin/pt-table-sync line 2872. Ensure that the table exists and is accessible.
while doing TABLE_NAME on DESTINATION_NAME
I know that the source table doesn't exist, as I am trying to use a view as the source instead. Access is not a problem either, as tests with a source table instead of a source view confirmed.
In the DSN, I use d= and t= to define the database and table/view.
Replacing the view with a table and running pt-table-sync again works as expected, so there seems to be no issue in the command/parameters themselves:
pt-table-sync --execute DSN_SOURCE DSN_TARGET --verbose --print
The Percona documentation does not indicate any additional parameters for what I want to do.
Is there any way to use a view as a source on pt-table-sync?
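One hedged workaround sketch (all names are placeholders): materialize the view into a real table on the source, ideally with a primary key, and point pt-table-sync at that table instead:
```
mysql -h source_host source_db -e "CREATE TABLE view_snapshot AS SELECT * FROM my_view;"
pt-table-sync --execute --verbose --print \
  h=source_host,D=source_db,t=view_snapshot \
  h=target_host,D=target_db,t=target_table
```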
A-Tech
(217 rep)
Dec 12, 2022, 01:58 PM
• Last activity: Dec 12, 2022, 07:49 PM
1
votes
1
answers
585
views
how to setup percona pmm monitoring of percona 5.7 in docker container
I want to monitor percona running in docker with the percona monitoring and management client.
I have Percona Monitoring and Management installed, running and monitoring other Percona instances (https://www.percona.com/doc/percona-monitoring-and-management/deploy/index.html). I recently added Percona (currently version 5.7) running in a Docker container (pulled from https://github.com/docker-library/percona).
I found an open feature request to add the pmm-admin executable from inside PMM Server at https://jira.percona.com/browse/PMM-627
Is there a way to set up monitoring of my Percona instance running in Docker?
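A hedged sketch of the usual PMM 1.x client flow on the Docker host, pointed at the container's published MySQL port (server address, port and credentials are placeholders):
```
pmm-admin config --server pmm-server.example.com
pmm-admin add mysql --user pmm --password '***' --host 127.0.0.1 --port 3307
```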
washingon
(111 rep)
Nov 24, 2017, 12:55 PM
• Last activity: Oct 3, 2022, 12:04 PM
0
votes
1
answers
255
views
pt-online-schema-change's triggers fail with "DELETE command denied" on insert
We recently tried pt-online-schema-change to add a column to a table. It worked mostly as expected, but one thing puzzles me: when the account we use to do the migration doesn't have the DELETE permission, our application (which keeps running simultaneously) gets errors that say "DELETE command denied to user 'pt'@'localhost' for table '_xxx_new'" when the app performs an **insert** on the *xxx* table.
My understanding is that the triggers are supposed to do **inserts** into the destination *_xxx_new* table when a new record is inserted in the source *xxx* table. How come it can fail for not having a DELETE permission??
Percona docs are pretty generic on this and no googling helped, so will be thankful for any ideas!
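For reference, the generated trigger bodies can be inspected directly, which shows exactly which statements they run against the new table (schema/table names are placeholders):
```
mysql -e "SELECT TRIGGER_NAME, ACTION_STATEMENT
          FROM information_schema.TRIGGERS
          WHERE EVENT_OBJECT_SCHEMA = 'mydb' AND EVENT_OBJECT_TABLE = 'xxx'\G"
```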
Dmitry Novoselov
(11 rep)
Mar 30, 2017, 03:27 PM
• Last activity: Apr 4, 2022, 08:05 PM
3
votes
2
answers
1434
views
Prometheus High Memory and CPU Usage in PMM
We are running PMM v1.17.0 and Prometheus is causing huge CPU and memory usage (200% CPU and 100% RAM), and PMM went down because of this. We are running PMM on a VM with 2 vCPUs and 7.5 GB RAM, and are monitoring about 25 servers. PMM is running with the command below:
docker run -d -it --volumes-from pmm-data --name pmm-server -e QUERIES_RETENTION=1095 -p 80:80 -e METRICS_RESOLUTION=3s --restart always percona/pmm-server:1
The prometheus.log is filled with entries like the ones below:
level=warn ts=2020-01-30T10:27:12.8156514Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
level=warn ts=2020-01-30T10:27:26.464361371Z caller=scrape.go:945 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.223:42002/metrics-mr msg="Error on ingesting samples with different value but same timestamp" num_dropped=1
level=warn ts=2020-01-30T10:27:27.81316996Z caller=scrape.go:942 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="Error on ingesting out-of-order samples" num_dropped=2
level=warn ts=2020-01-30T10:27:27.813257165Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
level=warn ts=2020-01-30T10:27:41.462420708Z caller=scrape.go:945 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.223:42002/metrics-mr msg="Error on ingesting samples with different value but same timestamp" num_dropped=1
level=warn ts=2020-01-30T10:27:42.813356387Z caller=scrape.go:942 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="Error on ingesting out-of-order samples" num_dropped=2
level=warn ts=2020-01-30T10:27:42.813441108Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
level=warn ts=2020-01-30T10:27:56.463798729Z caller=scrape.go:945 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.223:42002/metrics-mr msg="Error on ingesting samples with different value but same timestamp" num_dropped=1
level=warn ts=2020-01-30T10:27:57.82083775Z caller=scrape.go:942 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="Error on ingesting out-of-order samples" num_dropped=2
level=warn ts=2020-01-30T10:27:57.820912309Z caller=scrape.go:713 component="scrape manager" scrape_pool=mysql-mr target=https://10.40.4.21:42002/metrics-mr msg="append failed" err="out of order sample"
Can someone please let me know why Prometheus is causing this issue? Are there any parameters we need to add or change?
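The only knob already visible in that run command is the metrics resolution; a coarser value should reduce the Prometheus scrape load (a sketch, otherwise identical to the command above):
```
docker run -d -it --volumes-from pmm-data --name pmm-server \
  -e QUERIES_RETENTION=1095 -e METRICS_RESOLUTION=5s \
  -p 80:80 --restart always percona/pmm-server:1
```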
user5594148
(43 rep)
Jan 30, 2020, 10:56 AM
• Last activity: Nov 14, 2021, 01:01 AM
0
votes
1
answers
420
views
Drop column with percona does not shrink table size
I have a table in Aurora MySQL 5.7. The table has a few partitions, 800M rows, and weighs 2 TB.
Recently I dropped a few columns using Percona's pt-online-schema-change. Surprisingly, the table size did not change (looking in information_schema.tables).
The way Percona does a change is by using a new table (_table_data_new) with triggers on the original table: it creates an empty new table with the same DDL, executes the changes we wish, and copies everything over, with the triggers keeping it up to date. Once the data is synced, Percona renames the tables and drops the old one. So the table is rebuilt from scratch (without locking).
However, after running ALTER TABLE ... OPTIMIZE PARTITION I saw the size shrink to 250 GB. Does anyone have an explanation, or know what I did wrong?
pt command:
pt-online-schema-change --user $MYSQL_DBA_USER --password $MYSQL_DBA_PASS --host $MYSQL_WRITER D=db,t=table_data --alter "drop column a1, drop column a2" --execute --max-load Threads_running=18446744073709551606 --critical-load Threads_running=18446744073709551606 --recursion-method=none
optimize command:
MySQL [(db)]> select table_rows,data_length/power(1024,3), index_length/power(1024,3),DATA_FREE/power(1024,3),AVG_ROW_LENGTH from information_schema.tables where table_name='table_data';
+------------+---------------------------+----------------------------+-------------------------+----------------+
| table_rows | data_length/power(1024,3) | index_length/power(1024,3) | DATA_FREE/power(1024,3) | AVG_ROW_LENGTH |
+------------+---------------------------+----------------------------+-------------------------+----------------+
| 610884663 | 1847.7273712158203 | 202.40484619140625 | 0.0322265625 | 3247 |
+------------+---------------------------+----------------------------+-------------------------+----------------+
1 row in set (0.00 sec)
MySQL [db]> ALTER TABLE table_data OPTIMIZE PARTITION p20210601;
+---------------+----------+----------+---------------------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+---------------+----------+----------+---------------------------------------------------------------------------------------------+
| db.table_data | optimize | note | Table does not support optimize on partitions. All partitions will be rebuilt and analyzed. |
| db.table_data | optimize | status | OK |
+------------------------+----------+----------+---------------------------------------------------------------------------------------------+
2 rows in set (5 hours 39 min 40.95 sec)
MySQL [db]>
MySQL [db]> select table_rows,data_length/power(1024,3), index_length/power(1024,3),DATA_FREE/power(1024,3),AVG_ROW_LENGTH from information_schema.tables where table_name='table_data';
+------------+---------------------------+----------------------------+-------------------------+----------------+
| table_rows | data_length/power(1024,3) | index_length/power(1024,3) | DATA_FREE/power(1024,3) | AVG_ROW_LENGTH |
+------------+---------------------------+----------------------------+-------------------------+----------------+
| 736965899 | 104.25639343261719 | 155.98052978515625 | 0.0244140625 | 151 |
+------------+---------------------------+----------------------------+-------------------------+----------------+
Nir
(529 rep)
Nov 1, 2021, 01:14 PM
• Last activity: Nov 2, 2021, 12:57 PM
-1
votes
1
answers
994
views
InnoDB: Assertion failure on executing select Query - MySQL 5.7.31
I am using pt-archiver for daily archiving of tables, but while selecting data from one of the tables I am getting the following error, and it restarts the MySQL instance:
2021-07-07 13:21:17 0x7fe0dffdc700 InnoDB: Assertion failure in thread 140603807352576 in file btr0pcur.cc line 46
I ran pt-archiver with --dry-run and the following is my SELECT query:
SELECT /*!40001 SQL_NO_CACHE */
irig_time
,device_id
,message_id
,mode
,protection_1
,protection_2
,protection_3
,protection_4
,alarm_1
,alarm_2
,alarm_3
,alarm_4
,grid_switch_control
,dc_switch_1_on
,dc_switch_2_on
,additional_feedback_external_sensor
,module_communication_fault_position
FROM acbm_status_v2_0_0 FORCE INDEX(PRIMARY) WHERE (DATE(irig_time)=DATE_SUB(CURDATE(), INTERVAL 1 DAY)) ORDER BY irig_time, device_id LIMIT 200
If I run this query manually, I still get the assertion error and it restarts the MySQL instance.
Following is the table structure:
Table: acbm_status_v2_0_0
Columns:
irig_time datetime(6) PK
device_id int(11) PK
message_id bigint(20) UN
mode varchar(64)
protection_1 int(10) UN
protection_2 int(10) UN
protection_3 int(10) UN
protection_4 int(10) UN
alarm_1 int(10) UN
alarm_2 int(10) UN
alarm_3 int(10) UN
alarm_4 int(10) UN
grid_switch_control tinyint(1)
dc_switch_1_on tinyint(1)
dc_switch_2_on tinyint(1)
additional_feedback_external_sensor tinyint(1)
module_communication_fault_position int(10) UN
Below is the complete trace:
2021-07-07 13:21:17 0x7fe0dffdc700 InnoDB: Assertion failure in thread 140603807352576 in file btr0pcur.cc line 461
InnoDB: Failing assertion: page_is_comp(next_page) == page_is_comp(page)
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com .
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
13:21:17 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=18
max_threads=500
thread_count=18
connection_count=17
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 206883 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fe01c000d40
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fe0dffdbe60 thread_stack 0x40000
mysqld(my_print_stacktrace+0x2c)[0x556a3c9cab7c]
mysqld(handle_fatal_signal+0x501)[0x556a3c2e1f01]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7fe1fffaa730]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7fe1ffa857bb]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x121)[0x7fe1ffa70535]
mysqld(+0x6c1083)[0x556a3c2a9083]
mysqld(+0x6c30da)[0x556a3c2ab0da]
mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0xd03)[0x556a3cc699a3]
mysqld(_ZN11ha_innobase13general_fetchEPhjj+0xdf)[0x556a3cb6d4af]
mysqld(_ZThn760_N11ha_innopart18index_next_in_partEjPh+0x2d)[0x556a3cb8351d]
mysqld(_ZN16Partition_helper19handle_ordered_nextEPhb+0x299)[0x556a3c714199]
mysqld(_ZN7handler13ha_index_nextEPh+0x1c5)[0x556a3c3358d5]
mysqld(+0xb932dc)[0x556a3c77b2dc]
mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x18f)[0x556a3c7817cf]
mysqld(_ZN4JOIN4execEv+0x20b)[0x556a3c77aacb]
mysqld(_Z12handle_queryP3THDP3LEXP12Query_resultyy+0x2e0)[0x556a3c7e2d50]
mysqld(+0xbbd45b)[0x556a3c7a545b]
mysqld(_Z21mysql_execute_commandP3THDb+0x4924)[0x556a3c7ac564]
mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3dd)[0x556a3c7ae94d]
mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x1062)[0x556a3c7afa22]
mysqld(_Z10do_commandP3THD+0x207)[0x556a3c7b0d67]
mysqld(handle_connection+0x298)[0x556a3c8690c8]
mysqld(pfs_spawn_thread+0x157)[0x556a3ce77cd7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7fe1fff9ffa3]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fe1ffb474cf]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fe01c004860): SELECT /*!40001 SQL_NO_CACHE */ irig_time
,device_id
,message_id
,mode
,protection_1
,protection_2
,protection_3
,protection_4
,alarm_1
,alarm_2
,alarm_3
,alarm_4
,grid_switch_control
,dc_switch_1_on
,dc_switch_2_on
,additional_feedback_external_sensor
,module_communication_fault_position
FROM ycube2.acbm_status_v2_0_0 FORCE INDEX(PRIMARY) WHERE (DATE(irig_time)=DATE_SUB(CURDATE(), INTERVAL 1 DAY)) ORDER BY irig_time, device_id LIMIT 1000
Connection ID (thread ID): 41
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
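A hedged first diagnostic, before resorting to the innodb_force_recovery steps the message points at (schema/table names as in the trace):
```
mysql ycube2 -e "CHECK TABLE acbm_status_v2_0_0;"
```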
ImranRazaKhan
(149 rep)
Jul 9, 2021, 11:39 AM
• Last activity: Jul 9, 2021, 02:57 PM
Showing page 1 of 20 total questions