
Database crash just before appending the checkpoint entry to the write-ahead log

1 vote
1 answer
53 views
- From what I read about WAL, it's an append-only file to which all operations on the DB are written before they are actually applied to the data.
- There is also the concept of a "checkpoint", which is when the DB actually writes the data from memory to disk and appends a special checkpoint entry at the end of the WAL.
- If the DB crashes at any point, it can read the WAL starting from the latest checkpoint entry and redo all the subsequent operations.
- But how does the DB ensure that writing the checkpoint WAL entry and actually flushing the data to disk happen in a transactional way?
- What if the data is flushed but the DB crashes before the checkpoint entry is made in the WAL?
- Conversely, if the WAL is modified first, what happens if the DB crashes after the checkpoint entry is written but before the data is actually flushed?

For example, consider the following case:

- We have a dummy table Person(name, age, salary).
- It has an entry John, 25, 100.
- At time T1, a new transaction arrives: UPDATE Person SET salary += 100 WHERE name='John'.
- Assume that before T1, all the data had been flushed and the checkpoint entry had been appended to the WAL.
- After this transaction arrives, the DB first appends the exact transaction statement to the log: UPDATE Person SET salary += 100 WHERE name='John'.
- The data now becomes John, 25, 200.
- After some time, let's say the DB decides to flush the data to disk at time T2.
- At time T3 (just after T2), the DB attempts to write the checkpoint entry to the WAL.
- However, a power failure occurs between T2 and T3, before the checkpoint entry is written.
- When the DB restarts and tries to recover, it will notice that there is one transaction after the latest checkpoint and will try to execute it: UPDATE Person SET salary += 100 WHERE name='John'.
- But since the transaction was already applied before the crash, the salary will now become 300, although it should have been 200.

How does the DB prevent these redundant updates during the recovery?
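
To make the scenario concrete, here is a minimal Python sketch of it. Everything in it is hypothetical and only for illustration: `Page`, `LogRecord`, `naive_redo`, and `lsn_aware_redo` are made-up names, not any real database's API. `naive_redo` reproduces the double update described above. `lsn_aware_redo` shows one commonly described way around it (used in ARIES-style recovery): each log record carries a log sequence number (LSN), each page stores the LSN of the last record applied to it, and redo skips any record the page already reflects. I am assuming here that the page's LSN reaches disk together with the page contents.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical on-disk page holding John's row."""
    salary: int
    page_lsn: int = 0  # LSN of the last log record applied to this page


@dataclass
class LogRecord:
    """Hypothetical WAL record for the logical operation: salary += delta."""
    lsn: int
    delta: int


def naive_redo(page: Page, log: list, checkpoint_lsn: int) -> None:
    # Replays every record after the last checkpoint, whether or not
    # the page on disk already contains its effect.
    for rec in log:
        if rec.lsn > checkpoint_lsn:
            page.salary += rec.delta


def lsn_aware_redo(page: Page, log: list, checkpoint_lsn: int) -> None:
    # Replays a record only if the page has not seen it yet,
    # which makes running recovery again harmless.
    for rec in log:
        if rec.lsn > checkpoint_lsn and rec.lsn > page.page_lsn:
            page.salary += rec.delta
            page.page_lsn = rec.lsn


# WAL after T1: one update, logged after the last checkpoint (LSN 0).
log = [LogRecord(lsn=1, delta=100)]

# Disk state after the flush at T2, with the crash before T3:
# the page already holds salary 200 and is stamped with LSN 1.
naive_page = Page(salary=200, page_lsn=1)
naive_redo(naive_page, log, checkpoint_lsn=0)

aware_page = Page(salary=200, page_lsn=1)
lsn_aware_redo(aware_page, log, checkpoint_lsn=0)

# Alternative crash point: the page was never flushed, so redo must apply.
unflushed_page = Page(salary=100, page_lsn=0)
lsn_aware_redo(unflushed_page, log, checkpoint_lsn=0)

print(naive_page.salary)      # 300 (update applied twice)
print(aware_page.salary)      # 200 (record skipped, already on the page)
print(unflushed_page.salary)  # 200 (record applied once)
```

The sketch models a single page and a single record; it says nothing about how a particular DB lays out its log or decides when to checkpoint.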
Asked by Anmol Singh Jaggi (113 rep)
Oct 2, 2024, 05:25 AM
Last activity: Oct 2, 2024, 11:38 AM