Phenomenon when adding a column via ALTER TABLE on a table with 300M records
0 votes · 1 answer · 68 views
In the context of an existing question I had asked (https://dba.stackexchange.com/questions/345282/performant-way-to-perform-update-over-300m-records-in-mysql), I assumed that altering the table schema would be relatively trivial, but I observed the following:
As soon as the command was triggered, I expected CPU usage to stay consistently high, but that was evident only for around 20 minutes; then there was a sudden dip to 10% (**Q1**: What underlying event could cause this to occur?), which fell further below 5% and remained there until the execution of the command completed.
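To narrow down what the low-CPU phase is actually doing, I believe one can check whether the server has become I/O-bound instead; a minimal sketch, run from a second session while the ALTER is in flight (sample twice, a minute or so apart, and compare the deltas):

-- InnoDB write activity: large deltas with low CPU suggest the phase is I/O-bound
SHOW GLOBAL STATUS LIKE 'Innodb_data_writes';
SHOW GLOBAL STATUS LIKE 'Innodb_data_written';
-- Pages still waiting to be flushed to disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';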
**Q2.** The command execution took **~35 hrs**; is there scope for improving this further?
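From what I have read, MySQL 5.7 always rebuilds the table for ADD COLUMN even with ALGORITHM=INPLACE (instant ADD COLUMN only arrived in 8.0), so the main levers appear to be memory and I/O. A sketch of settings one could experiment with before rerunning; the values below are illustrative assumptions, not recommendations:

-- Resizable online as of MySQL 5.7.5; a larger pool can speed the rebuild's read phase
SET GLOBAL innodb_buffer_pool_size = 8589934592;  -- 8 GB, example value only

-- Headroom for concurrent DML that is logged while the online ALTER runs
SET GLOBAL innodb_online_alter_log_max_size = 1073741824;  -- 1 GB, example value only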
**Update**: The MySQL version in use is 5.7.0.
CREATE TABLE `interaction` (
  `id` varchar(36) NOT NULL,
  `entity_id` varchar(36) DEFAULT NULL,
  `request_date` datetime DEFAULT NULL,
  `user_id` varchar(255) DEFAULT NULL,
  `sign` varchar(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `entity_id_idx` (`entity_id`),
  KEY `user_id_idx` (`user_id`),
  KEY `req_date_idx` (`request_date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Apparently the recommended approach is to not add any default value to the new column and to proceed with something like:
ALTER TABLE interaction
ALGORITHM=INPLACE,
LOCK=NONE,
ADD COLUMN type VARCHAR(32);
I attempted this on a test machine, but I would like to understand the underlying behaviour of the MySQL server while processing this command.
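One way to observe the phases (and perhaps attribute the CPU dip to a specific stage) is MySQL 5.7's ALTER TABLE progress reporting through the Performance Schema; a minimal sketch, assuming the instrumentation is allowed to be enabled:

-- Enable the InnoDB alter-table stage instruments and the stage consumers
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/innodb/alter%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_stages_%';

-- From another session while the ALTER runs: which stage it is in, and how far along
SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
FROM performance_schema.events_stages_current;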
Here is a screenshot for reference, showing GCP observability for the VM this server is hosted on:

Query OK, 0 rows affected (1 day 10 hours 46 min 13.20 sec)
Records: 0 Duplicates: 0 Warnings: 0
**Q3.** Here is the observability reference for this duration. Is this behaviour guaranteed to be consistent, such that if I run the same command on a production environment that serves reads/writes apart from this query, the resource consumption would remain similar and we could sustain through it? [My test VM disk is 300 GB; do I really need to consider this much additional disk, or more, for executing the mentioned query on production too?]
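Regarding the disk sizing in particular: as I understand it, an in-place rebuild writes a full intermediate copy of the table in the same data directory, so the table's current footprint gives a lower bound on the headroom needed; a minimal sketch, assuming the table lives in the current schema:

-- Approximate on-disk size of data plus indexes, in GB
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS total_gb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'interaction';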

Asked by Naman (123 rep)
Feb 16, 2025, 05:45 PM
Last activity: Feb 18, 2025, 01:51 AM