Using pt-osc (online schema change) in a Multi-Master (Active/Passive) environment
MySQL version: Percona Server for MySQL 5.7.40-43-log
We have a Master-Master (Active-Passive) replication setup in our cluster. Only one master (the Active one) receives write requests, while the other one is Passive. Each master has 4 replicas attached to it, and there is bi-directional replication between the Active and Passive masters.
Now I need to run an ALTER on a huge table that has ~600M rows and is around 250G in size. All the replicas, including the masters (since they are replicas of each other anyway), have the following replication filter: **Replicate_Wild_Do_Table as data%.%**, so I am not concerned about the creation of the _new and _old tables during the actual pt-osc execution, because **all the tables (%)** are included in replication.
Before running pt-osc in production, I would like to test it on a staging cluster, so I have created a chain A --> B --> C, where A acts as my Active Master, B as my Passive Master, and C is a replica pointing to B. Please note that I have not set up Master-Master replication between A and B. My main concern is the replicas in the production cluster: while the change runs on the master, replication should not run into any issues, and the replicas should also end up with the altered table.
I have used the following flags in the pt-osc command, but I am getting an error related to --recurse and --recursion-method:
pt-online-schema-change --alter "ADD COLUMN Cost decimal(18,9) NOT NULL DEFAULT '0.000000000'" \
--alter-foreign-keys-method=auto \
--check-slave-lag=50 \
--max-lag=50 \
--recurse \
--recursion-method=processlist \
--host=replica-015 \
--max-load=Threads_running=30 \
--critical-load=Threads_running=50 \
--chunk-time=0.5 \
--slave-user=repl \
--slave-password='xxxxxxx' \
--chunk-size=10000 \
--progress percentage,1 \
--dry-run \
--user=test \
--password='xxxxxxx' \
D=avon,t=excel
I tried both processlist and hosts as the recursion method, but neither worked and I got the following error:
I then updated the pt-osc command to use --recurse=1 and --recursion-method=processlist, and the message is gone. However, I don't fully understand what --recurse or --recursion-method did here, since my output does not mention any detected slaves, whereas some other output I saw online explicitly says that no slaves were detected.
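For context, my current reading of the documentation (which may well be incomplete) is that --recurse takes an integer depth rather than being a bare switch, and that --recursion-method picks how replicas are discovered: processlist looks for replication dump threads in SHOW PROCESSLIST, while hosts relies on SHOW SLAVE HOSTS, which only reports useful host names when report_host is set on the replicas. Below is a minimal sketch for checking by hand what each method would have to work with on a given server. The function name and the host name master-a are placeholders of mine, and it assumes the mysql client can log in without prompting (for example via ~/.my.cnf).

```shell
#!/usr/bin/env bash
# Sketch only: print what each replica-discovery method could see on a host.
# show_replica_discovery and the host names are hypothetical placeholders.
show_replica_discovery() {
  local host="$1"

  echo "== processlist method: replication dump threads on ${host} =="
  # Replicas connected to this server show up as "Binlog Dump" (or
  # "Binlog Dump GTID") threads; seeing other users' threads needs the
  # PROCESS privilege.
  mysql --host="${host}" --execute="SHOW PROCESSLIST" \
    | grep -i "Binlog Dump" \
    || echo "(no Binlog Dump threads visible)"

  echo "== hosts method: SHOW SLAVE HOSTS on ${host} =="
  # The Host column is only filled in when report_host is set on the replica.
  mysql --host="${host}" --execute="SHOW SLAVE HOSTS"
}

# Example call against the staging master A (placeholder name):
#   show_replica_discovery master-a
```

If neither query shows the replicas when run against the host that pt-osc connects to, I would expect the tool to have nothing to discover with those two methods either.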
I am also not completely sure about the DSN table that is mentioned in the Percona documentation: https://docs.percona.com/percona-toolkit/pt-online-schema-change.html
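As far as I can tell from that page, the DSN table is an ordinary table listing one replica DSN per row, which the tool reads when --recursion-method is given as dsn=... instead of processlist or hosts. Here is a sketch of what I think it would look like for my staging chain. The schema name percona, the table name dsns and the host names master-a, replica-b and replica-c are placeholders I made up, and the column layout follows my reading of the documentation.

```shell
#!/usr/bin/env bash
# Sketch only: emit the SQL for a DSN table listing the replicas to monitor.
# Schema "percona", table "dsns" and all host names are hypothetical.
dsn_table_sql() {
  cat <<'SQL'
CREATE DATABASE IF NOT EXISTS percona;
CREATE TABLE IF NOT EXISTS percona.dsns (
  id        INT NOT NULL AUTO_INCREMENT,
  parent_id INT DEFAULT NULL,
  dsn       VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
);
-- One row per replica whose lag should be checked.
INSERT INTO percona.dsns (dsn) VALUES
  ('h=replica-b,P=3306'),
  ('h=replica-c,P=3306');
SQL
}

# Usage (not executed here): load the table on the host pt-osc connects to,
# then point the tool at it for a dry run:
#   dsn_table_sql | mysql --host=master-a
#   pt-online-schema-change --alter "..." \
#     --recursion-method=dsn=h=master-a,D=percona,t=dsns \
#     --dry-run D=avon,t=excel
```

My understanding is that with this method the tool only monitors the replicas listed in the table, so B and C would each need their own row.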
Can someone help me understand the replication-related flags in the pt-osc command, and how I should proceed with --dry-run or --execute in the staging cluster that I have created?
Thank you in advance.



Asked by msbeast
(21 rep)
Nov 15, 2024, 09:36 PM