Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
0
answers
16
views
Dimensional Modelling Standard Practices for Periodic Snapshot Table
Our company is relatively new to using dimensional models but we have a need for viewing account balances at certain points in time. Our company has billions of customer accounts so to take daily snapshots of these balances would be millions per day (excluding 0 dollar balances because our business model closes accounts once reaching 0).
What I've imagined was creating a periodic snapshot fact table where the balance for each account would utilize the snapshot from the end of the day but only include rows for end of week, end of month, and yesterday (to save memory and processing for days we are not interested in); then utilize a flag in the date dimension table to filter to monthly dates, weekly dates, or current data. I know standard periodic snapshot tables have predefined intervals; to me this sounds like a daily snapshot table that utilizes the dimension table to filter to the dates you're interested in.
My leadership seems to feel that this should be broken out into three different fact tables (current, weekly, monthly). I feel that this is excessive because it's the same calculation (all-time balance at end of day) and could have overlap (i.e. yesterday could be end of week and end of month). Since this is balances at a point in time at end of day and there are no aggregations needed to achieve "weekly" or "monthly" data, what is standard practice here? Should we take leadership's advice, or does it make more sense the way I envisioned it? Either way, can someone give me some educational texts to support your opinions for this scenario?
I should also specify that there is already a traditional snapshot source table (not dimensionally modelled) that captures balances with start and end dates for each balance change.
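A minimal sketch of the single-fact-table design described in the question, using SQLite via Python purely for illustration (all table and column names are hypothetical): one daily periodic snapshot fact, with flags on the date dimension selecting the month-end, week-end, or current grain instead of three separate fact tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical schema: one balance snapshot fact plus a date dimension
# carrying grain flags, instead of three separate fact tables.
cur.execute("""CREATE TABLE dim_date (
    date_key TEXT PRIMARY KEY,
    is_week_end INTEGER,
    is_month_end INTEGER,
    is_yesterday INTEGER)""")
cur.execute("""CREATE TABLE fact_balance_snapshot (
    date_key TEXT,
    account_key INTEGER,
    balance REAL)""")
cur.executemany("INSERT INTO dim_date VALUES (?,?,?,?)", [
    ("2025-06-30", 0, 1, 0),   # month end
    ("2025-07-06", 1, 0, 0),   # week end
    ("2025-07-27", 0, 0, 1)])  # yesterday
cur.executemany("INSERT INTO fact_balance_snapshot VALUES (?,?,?)", [
    ("2025-06-30", 1, 120.0),
    ("2025-07-06", 1, 80.0),
    ("2025-07-27", 1, 40.0)])
# One fact table serves all three grains: filter through the dimension.
rows = cur.execute("""SELECT f.date_key, f.balance
    FROM fact_balance_snapshot f
    JOIN dim_date d ON d.date_key = f.date_key
    WHERE d.is_month_end = 1""").fetchall()
print(rows)  # [('2025-06-30', 120.0)]
```

Switching the filter to `is_week_end` or `is_yesterday` reuses the same fact rows, which is the overlap argument made above.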
Bolt
(1 rep)
Jul 27, 2025, 04:56 PM
• Last activity: Jul 28, 2025, 07:24 AM
3
votes
1
answers
30
views
Can Transact-SQL snapshot backups interface with Amazon EBS?
I have recently discovered [Transact-SQL snapshot backups](https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/create-a-transact-sql-snapshot-backup?view=sql-server-ver17). They let you interact with snapshot backups, as this flowchart explains.
It is obvious that this is compatible with a traditional SAN. I have found that EBS has [something similar](https://aws.amazon.com/blogs/modernizing-with-aws/automating-sql-server-point-in-time-recovery-using-ebs-snapshots/) , but are the EBS-style snapshots compatible with Transact-SQL snapshot backups? That is, can EBS generate snapshots that would agree with syntax like this?

BACKUP DATABASE testdb1
TO DISK = 'D:\Temp\db.bkm'
WITH METADATA_ONLY, FORMAT;
J. Mini
(1225 rep)
Jul 19, 2025, 08:46 PM
• Last activity: Jul 20, 2025, 07:53 AM
-2
votes
0
answers
63
views
Can Transact-SQL snapshot backups be used to recover from database corruption?
The most typical corruption recovery scenario involves using a [full database backup](https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/create-a-full-database-backup-sql-server?view=sql-server-ver17) from before corruption struck and every [transaction log backup](https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/back-up-a-transaction-log-sql-server?view=sql-server-ver17) between then and now.
I have recently discovered [Transact-SQL snapshot backups](https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/create-a-transact-sql-snapshot-backup?view=sql-server-ver17). Like full backups, you can replay transaction logs on top of them, as this Microsoft diagram shows.
Does this suggest that Transact-SQL snapshot backups can be used to recover from database corruption, just like full database backups can?

J. Mini
(1225 rep)
Jul 11, 2025, 08:13 PM
2
votes
1
answers
182
views
Snapshot Folder Location for Replication with Clustered Instance
I am assigned to create a transactional replication with a SQL Server clustered instance for reporting purposes. I'm planning to use the Distributor and the Subscriber on the same box.
Regarding the snapshot folder, is it OK to place it on the same box where the Distributor/Subscriber resides? Will there be any problems?
SQL_NoExpert
(1117 rep)
Sep 11, 2018, 02:24 PM
• Last activity: Jul 6, 2025, 01:06 AM
0
votes
3
answers
194
views
How to restore snapshots to another table?
Someone accidentally deleted several records from Cassandra's table in the cluster.
I want to restore the last snapshot to another copy table with the same structure within the cluster.
What is the best way to do that?
Cassandra v4.0.4
EdiM
(51 rep)
May 13, 2024, 07:29 AM
• Last activity: Jul 5, 2025, 02:02 PM
1
votes
1
answers
70
views
How can I version my knowledge database?
I have a large number of documents (1 million or more).
I need to implement snapshot-style versioning, but I can't figure it out.
It should be able to take snapshots from anywhere (like git?).
For example, I might have Snap1, add data, and take Snap2,
then roll back to Snap1, add data, and take Snap3.
And it should be possible to query each snapshot on its own.
Snap1 ---- Snap2
      \--- Snap3 ---- Snap4
How can I design such a structure?
I was thinking of creating 1 million snapshot id - document id pairs every time a snapshot is created and then joining them, but that doesn't seem like a good idea.
In more detail, this should be possible to implement in Milvus (of course, snapshot metadata would probably have to be stored in an RDBMS/NoSQL store).
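One way to sketch the git-like branching described above is a parent-pointer scheme: each snapshot stores only the document IDs added relative to its parent, rather than a full million-row membership list per snapshot. This is a hedged illustration, not Milvus-specific, and all names are hypothetical:

```python
# Minimal sketch of git-style snapshot versioning: each snapshot records a
# parent pointer plus only the documents added relative to that parent, so
# taking Snap3 from Snap1 branches naturally. All names are hypothetical.
class SnapshotStore:
    def __init__(self):
        self.snapshots = {}  # snap_id -> (parent_id, frozenset(added_doc_ids))
        self.next_id = 1

    def take(self, parent, added_docs):
        sid = f"Snap{self.next_id}"
        self.next_id += 1
        self.snapshots[sid] = (parent, frozenset(added_docs))
        return sid

    def members(self, sid):
        # Walk the parent chain to materialize the snapshot's document set.
        docs = set()
        while sid is not None:
            parent, added = self.snapshots[sid]
            docs |= added
            sid = parent
        return docs

store = SnapshotStore()
s1 = store.take(None, {"doc1"})
s2 = store.take(s1, {"doc2"})   # Snap1 ---- Snap2
s3 = store.take(s1, {"doc3"})   #       \--- Snap3 (branch from Snap1)
print(sorted(store.members(s2)))  # ['doc1', 'doc2']
print(sorted(store.members(s3)))  # ['doc1', 'doc3']
```

The trade-off is that querying a snapshot requires walking the chain (or caching materialized sets), in exchange for writes proportional to the delta instead of the corpus size.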
user212942
(61 rep)
Jun 2, 2025, 05:25 PM
• Last activity: Jun 7, 2025, 07:00 AM
0
votes
1
answers
296
views
Can't bring up slave from ec2-consistent-snapshot due to uncommitted prepared transaction
I'm struggling with bringing up a slave instance using a snapshot created by ec2-consistent-snapshot. My log describes the fact that an unprocessed transaction exists, but isn't that what ec2-consistent-snapshot is supposed to prevent?
My execution statement for creating snapshots is as follows... _(forgive the ansible variable placeholders)_
/usr/local/bin/ec2-consistent-snapshot-master/ec2-consistent-snapshot -q --aws-access-key-id {{ aws.access_key }} --aws-secret-access-key {{ aws.secret_key }} --region {{ aws.region }} --tag "Name={{ inventory_hostname }};Role={{ mysql_repl_role }}" --description "Database backup snapshot - {{ inventory_hostname_short }}" --freeze-filesystem /mnt/perconadata --percona --mysql-host localhost --mysql-socket /mnt/perconadata/mysql.sock --mysql-username root --mysql-password {{ mysql_root_password }} $VOLUME_ID
And the log resulting from the failed attempt to bring it up on the slave is as follows...
InnoDB: Doing recovery: scanned up to log sequence number 64107621643
InnoDB: Transaction 1057322289 was in the XA prepared state.
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 0 row operations to undo
InnoDB: Trx id counter is 1057322752
2017-01-27 14:33:44 11313 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 33422772, file name mysql-bin.000011
2017-01-27 14:33:46 11313 [Note] InnoDB: 128 rollback segment(s) are active.
InnoDB: Starting in background the rollback of uncommitted transactions
2017-01-27 14:33:46 7f3a90c75700 InnoDB: Rollback of non-prepared transactions completed
2017-01-27 14:33:46 11313 [Note] InnoDB: Waiting for purge to start
2017-01-27 14:33:46 11313 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.34-79.1 started; log sequence number 64107621643
CONFIG: num_threads=8
CONFIG: nonblocking=1(default)
CONFIG: use_epoll=1
CONFIG: readsize=0
CONFIG: conn_per_thread=1024(default)
CONFIG: for_write=0(default)
CONFIG: plain_secret=(default)
CONFIG: timeout=300
CONFIG: listen_backlog=32768
CONFIG: host=(default)
CONFIG: port=9998
CONFIG: sndbuf=0
CONFIG: rcvbuf=0
CONFIG: stack_size=1048576(default)
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
CONFIG: num_threads=1
CONFIG: nonblocking=1(default)
CONFIG: use_epoll=1
CONFIG: readsize=0
CONFIG: conn_per_thread=1024(default)
CONFIG: for_write=1
CONFIG: plain_secret=
CONFIG: timeout=300
CONFIG: listen_backlog=32768
CONFIG: host=(default)
CONFIG: port=9999
CONFIG: sndbuf=0
CONFIG: rcvbuf=0
CONFIG: stack_size=1048576(default)
CONFIG: wrlock_timeout=12
CONFIG: accept_balance=0
handlersocket: initialized
2017-01-27 14:33:46 7f3dfe768840 InnoDB: Starting recovery for XA transactions...
2017-01-27 14:33:46 7f3dfe768840 InnoDB: Transaction 1057322289 in prepared state after recovery
2017-01-27 14:33:46 7f3dfe768840 InnoDB: Transaction contains changes to 1 rows
2017-01-27 14:33:46 7f3dfe768840 InnoDB: 1 transactions in prepared state after recovery
2017-01-27 14:33:46 11313 [Note] Found 1 prepared transaction(s) in InnoDB
2017-01-27 14:33:46 11313 [ERROR] Found 1 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery informat$
2017-01-27 14:33:46 11313 [ERROR] Aborting
My two thoughts are that I've missed something while creating the snapshot, or I've missed something bringing up the slave from this type of snapshot, so my question is...
**Am I missing some important parameters that force mysql/percona to commit transactions prior to freezing the file system?**
-- OR --
**Is there a parameter I should be using to bring the slave up to force it to act as if it's recovering from a crash?**
oucil
(516 rep)
Jan 27, 2017, 08:30 PM
• Last activity: May 13, 2025, 06:03 AM
2
votes
1
answers
291
views
Is there a Group Policy that can block replication snapshots?
Has anyone seen a domain Group Policy stop a transactional replication snapshot from being configured and executed with the message:
> A required privilege is not held by the client.
I am having trouble getting much from our server admin guys. All they would do was create me a new server and add it to an OU with a filter to block the Group Policy. I got it to work (using a domain user for the agent), but now I can't get anything out of them about the Group Policies being delivered.
Does anyone know the policy setting that could be blocking the making of a snapshot?
David
(21 rep)
Feb 2, 2015, 03:43 AM
• Last activity: May 11, 2025, 12:07 PM
2
votes
3
answers
4738
views
Sql Server replication: Initial snapshot for publication is not yet available?
I have set up transactional replication; my distributor and publisher are on the same Windows server, with two subscribers, one on the same server and another on a different Windows server, both using push snapshots.
I am getting a message in the synchronization status that the snapshot is not yet available. Just a few hours ago it was working fine and every transaction was synchronized. The only thing I did in between was add more articles (stored procedures, etc.) to the publication using its properties.
Is that what is causing the issue here? There is no error message displayed; I am just not able to get the data replicated.
tsu90280
(175 rep)
Apr 15, 2020, 07:25 AM
• Last activity: Apr 17, 2025, 03:53 PM
1
votes
1
answers
72
views
Transactional Replication Snapshot Agent - Error 3616 - Ideas?
I've got a server with 5 transactional database replications; the first 4 are working fine. However, on the last one, the snapshot agent keeps failing with this message. I've checked permissions on the database, and I am just trying to pinpoint what could be causing it. Any ideas?
I noticed that if I remove all objects and create an article for just one table, the snapshot agent works fine.
2025-03-19 03:13:40.03 [0%] The replication agent had encountered an exception.
2025-03-19 03:13:40.03 Source: Replication
2025-03-19 03:13:40.03 Exception Type: Microsoft.SqlServer.Replication.ReplicationAgentSqlException
2025-03-19 03:13:40.03 Exception Message: An error was raised during trigger execution. The batch has been aborted and the user transaction, if any, has been rolled back.
2025-03-19 03:13:40.03 Message Code: 3616
Marcelo Castro
(23 rep)
Mar 19, 2025, 03:40 AM
• Last activity: Mar 19, 2025, 06:27 PM
0
votes
1
answers
819
views
Best way to create a daily snapshotted read-only db in RDS?
We have a PostgreSQL db instance in RDS. In our application we need to run a daily export in the early hours of every day, the export and transformation may take some time and one issue we've found is it can export incorrect data if the data is being modified while the export is running (we use limit/offset as our table has ~100 million rows). This is probably a common issue.
To get over this we want to create a snapshot of our instance at midnight and use this for the export.
What's the best way to do this in RDS? Ideally the snapshot shouldn't take too long to create.
I've researched some things to try:
- I tried to create a stored procedure to duplicate tables, but then it takes time to reindex etc, and the table name has to be different (which I want to keep the same so the query, which is used in other areas of the app, doesn't have to change).
- RDS has read only replicas, but from brief reading they have a lag time so it seems to me they are constantly updated from the master instance. Is there any way to stop updates, or update once a day at midnight?
- We have daily snapshots run at midnight already, and I think we could create an AWS Lambda to restore this snapshot (which I believe creates a new instance). It seems clunkier than the read-only replica but doable.
- Use pg_dump, but this seems more work than using the AWS snapshot.
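Independent of the snapshot approach, the limit/offset drift mentioned above can also be sidestepped with keyset (cursor) pagination, sketched here against SQLite purely for illustration (table and column names are hypothetical): each page resumes from the last exported key, so concurrent inserts and deletes don't shift the window.

```python
import sqlite3

# Hypothetical table; in the real case this would be the ~100M-row export table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [(i, float(i)) for i in range(1, 8)])

def export_page(last_id, page_size=3):
    # Resume strictly after the last exported key instead of using OFFSET,
    # so concurrent row churn cannot shift which rows fall in the page.
    return conn.execute(
        "SELECT id, balance FROM accounts WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size)).fetchall()

page1 = export_page(0)                              # rows with id 1..3
conn.execute("DELETE FROM accounts WHERE id = 2")   # concurrent modification
page2 = export_page(page1[-1][0])                   # still resumes after id 3
print([r[0] for r in page2])  # [4, 5, 6]
```

With OFFSET-based paging, the delete above would silently shift the second page and skip a row; the keyset predicate is immune to that, though it does not freeze the data the way a restored snapshot would.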
Firepanda
(33 rep)
Sep 14, 2017, 09:43 AM
• Last activity: Mar 11, 2025, 04:03 AM
2
votes
2
answers
854
views
Online index maintenance in SQL Server using SNAPSHOT isolation level
I have recently altered our SQL Server 2008 R2 database to enable SNAPSHOT isolation level and made appropriate changes to hibernate to run all transactions in SNAPSHOT mode. I have however noticed that when I add/delete/alter indexes, the queries which are accessing the underlying table are rolled back and I get this exception:
> Snapshot isolation transaction failed in database 'foo' because the
> object accessed by the statement has been modified by a DDL statement
> in another concurrent transaction since the start of this transaction.
> It is disallowed because the metadata is not versioned. A concurrent
> update to metadata can lead to inconsistency if mixed with snapshot
> isolation
...which is exactly what has happened.
According to https://msdn.microsoft.com/en-us/library/bb933783(v=sql.105).aspx "These statements are permitted when you are using snapshot isolation within implicit transactions." So I tried:
set implicit_transactions on
GO
DROP INDEX blabla
GO
IF @@TRANCOUNT > 0 COMMIT TRAN
set implicit_transactions off
But the problem still persists. There must be a way to do this in SQL Server?
infiniteLoop
(66 rep)
Dec 4, 2015, 08:58 AM
• Last activity: Jan 11, 2025, 01:51 PM
1
votes
2
answers
691
views
Automating MySQL replication using snapshots on AWS RDS (or Linux)
I'm currently working on a BASH script to build a new Aurora cluster based on the latest snapshot of an existing cluster, then set up MySQL replication from the old one to the new. In the past I've automated replication using **mysqldump** and **mydumper**, both of which give you the most recent binlog and position. However, it's not as straightforward with snapshots (the question would presumably be the same using standard Linux snapshots rather than Aurora, although with Linux, you can get to the binary log index files before the instance is started. You can't read those with RDS).
Since a snapshot is an exact copy of an existing instance, you would think you could just take the current binlog and position from the new instance and use those for the starting position in the CHANGE MASTER (or its RDS equivalent) statement. However, that's not the case, as you'll see (this example is from a dev instance so doesn't have much traffic, but a prod instance could easily roll over several binary logs between the time of the snapshot creation and restore):
Old instance:
MySQL [(none)]> show master status;
+----------------------------+----------+--------------+------------------+-------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+----------------------------+----------+--------------+------------------+-------------------------------------------------+
| mysql-bin-changelog.000510 | 98957391 | | | 0e332788-51c9-3440-b8e5-a6edc29aba7e:1-10746050 |
+----------------------------+----------+--------------+------------------+-------------------------------------------------+
MySQL [(none)]> show binary logs;
+----------------------------+-----------+
| Log_name | File_size |
+----------------------------+-----------+
| mysql-bin-changelog.000498 | 134301760 |
| mysql-bin-changelog.000499 | 134284404 |
| mysql-bin-changelog.000500 | 134647151 |
| mysql-bin-changelog.000501 | 134236349 |
| mysql-bin-changelog.000502 | 134236200 |
| mysql-bin-changelog.000503 | 134238768 |
| mysql-bin-changelog.000504 | 134321441 |
| mysql-bin-changelog.000505 | 134223686 |
| mysql-bin-changelog.000506 | 134275221 |
| mysql-bin-changelog.000507 | 134221341 |
| mysql-bin-changelog.000508 | 134219161 |
| mysql-bin-changelog.000509 | 134222780 |
| mysql-bin-changelog.000510 | 98926253 |
+----------------------------+-----------+
13 rows in set (0.00 sec)
New instance:
MySQL [(none)]> show master status;
+----------------------------+----------+--------------+------------------+-------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+----------------------------+----------+--------------+------------------+-------------------------------------------------+
| mysql-bin-changelog.000513 | 194 | | | 0e332788-51c9-3440-b8e5-a6edc29aba7e:1-10742576 |
+----------------------------+----------+--------------+------------------+-------------------------------------------------+
MySQL [(none)]> show binary logs;
+----------------------------+-----------+
| Log_name | File_size |
+----------------------------+-----------+
| mysql-bin-changelog.000498 | 134301760 |
| mysql-bin-changelog.000499 | 134284404 |
| mysql-bin-changelog.000500 | 134647151 |
| mysql-bin-changelog.000501 | 134236349 |
| mysql-bin-changelog.000502 | 134236200 |
| mysql-bin-changelog.000503 | 134238768 |
| mysql-bin-changelog.000504 | 134321441 |
| mysql-bin-changelog.000505 | 134223686 |
| mysql-bin-changelog.000506 | 134275221 |
| mysql-bin-changelog.000507 | 134221341 |
| mysql-bin-changelog.000508 | 134219161 |
| mysql-bin-changelog.000509 | 134222780 |
| mysql-bin-changelog.000510 | 88104589 |
| mysql-bin-changelog.000511 | 194 |
| mysql-bin-changelog.000512 | 194 |
| mysql-bin-changelog.000513 | 194 |
+----------------------------+-----------+
16 rows in set (0.01 sec)
As you can see, there are three additional binary logs that don't exist on the old one, and there have been additional transactions on the old instance that we don't want to lose.
So my next thought was to compare the binary logs of the two instances and take the first one where the size on the master instance was larger than that of the replica; or, if they're all the same, then take the last one that exists on both; this will handle the case where no new data has been written to the master. However, as you can see, the new instance has additional binary logs (presumably created during the restarts as it was built) that have no real data, but will wreak havoc during comparisons of old and new.
Another thought was to work backwards from the last binlog to the first on the replica, but I can't think of a reliable way to determine programmatically that it should be skipped. Skipping files with exactly 194 bytes seems very sketchy.
I'm not sure how to resolve this conflict. Any thoughts?
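The selection rule described in the question (take the first binlog whose size on the source exceeds its size on the replica, otherwise the last log both instances share) can be sketched in Python. The function name and the list-of-tuples input format are assumptions for illustration only, and real automation would also need to reconcile the GTID sets:

```python
# Hedged sketch: given (name, size) listings from SHOW BINARY LOGS on the
# source and the restored replica, pick the first common log where the
# source has grown past the replica's copy; fall back to the last log both
# share. Logs that exist only on the replica (the empty post-restore ones)
# are ignored automatically because the comparison runs over common logs.
def pick_start_log(source_logs, replica_logs):
    replica_sizes = dict(replica_logs)
    common = [(name, size) for name, size in source_logs if name in replica_sizes]
    for name, size in common:
        if size > replica_sizes[name]:
            return name
    return common[-1][0] if common else None

source = [("mysql-bin-changelog.000509", 134222780),
          ("mysql-bin-changelog.000510", 98926253)]
replica = [("mysql-bin-changelog.000509", 134222780),
           ("mysql-bin-changelog.000510", 88104589),
           ("mysql-bin-changelog.000511", 194)]
print(pick_start_log(source, replica))  # mysql-bin-changelog.000510
```

This sidesteps the 194-byte heuristic entirely: the replica-only logs never enter the comparison, and replication would start within log 000510 at the replica's recorded position.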
Swechsler
(153 rep)
Oct 26, 2022, 12:46 AM
• Last activity: Dec 30, 2024, 03:07 AM
0
votes
0
answers
69
views
when was the last time I restored a database from a database snapshot in the current server?
I generally create a database snapshot, depending on circumstances, in order to compare data before and after changes, or as an easy way to roll back changes after, for instance, an ALTER TABLE.
I recently had to roll back a considerably big change in one of our live systems, and I had a snapshot to use; it was a very quick and effective way to recover the database to the state before the changes.
(Let me stress that that database in particular was not involved in replication; otherwise this would not have been possible without breaking the replication.)
This script shows my last database restores:
WITH LastRestores AS
(
SELECT
DatabaseName = [d].[name] ,
[d].[create_date] ,
[d].[compatibility_level] ,
[d].[collation_name] ,
r.*,
RowNum = ROW_NUMBER() OVER (PARTITION BY d.Name ORDER BY r.[restore_date] DESC)
FROM master.sys.databases d
LEFT OUTER JOIN msdb.dbo.[restorehistory] r ON r.[destination_database_name] = d.Name
)
SELECT *
FROM [LastRestores]
WHERE [RowNum] = 1
order by restore_history_id desc
When I run the above script, it does not show my restore from a database snapshot.
This is how I restore a database from a snapshot (example; snapshot previously created):
RESTORE DATABASE [PPROD_Milano_202407] FROM DATABASE_SNAPSHOT = 'PPROD_Milano_202407_SS';
How can I see when the database was last restored from a snapshot?
Marcello Miorelli
(17274 rep)
Sep 23, 2024, 12:07 PM
1
votes
1
answers
807
views
Oracle Database Snapshot?
Is there a proper snapshot function, or anything analogous to it, in Oracle Database (19c, 21c) at a schema or PDB level?
For example, I wanted to run some tests that will alter tables, procedures and update data from a schema. After finishing my tests I wanted to revert it to the state it was before I started changing it.
I'm used to doing Virtual Machine snapshots, but I'm new to Oracle DB and was looking for something similar.
Is there anything like that?
Thanks!
Daichi42
(11 rep)
Mar 15, 2023, 09:39 PM
• Last activity: Sep 3, 2024, 04:04 AM
0
votes
0
answers
81
views
postgresql 16 filesystem snapshot in combination with pit recovery "invalid checkpoint record"
We are migrating many deployments to a Kubernetes cluster.
I'm currently in the process of testing and documenting the restore process for our PostgreSQL databases.
In our legacy environment we did/do NetApp snapshots of the volumes of the database. We also write WAL on replica level.
In the rare cases where a rollback to a snapshot was not enough, we saved the WALs from pg_wal and our wal_archive, restored from the snapshot, exchanged the WALs from the snapshot for the previously saved WALs, and configured a recovery_target_time.
To my general knowledge this always worked when needed, though it was very rarely needed.
Now in the Kubernetes environment I'd like to do the same.
It is still a NetApp snapshot, although now triggered by Velero; technically NetApp is still doing the snap.
I can start the database from the snapshot; it will of course complain about not having been shut down correctly, but will recover.
But any time I try to do a PIT recovery, by removing all WALs from the snapshot before starting the database and exchanging them with all WALs from the live system, I get a variant of "invalid checkpoint record":
2024-08-14 09:25:58.631
2024-08-14 09:25:58.630 CEST LOG: starting PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-08-14 09:25:58.631
2024-08-14 09:25:58.631 CEST LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-08-14 09:25:58.631
2024-08-14 09:25:58.631 CEST LOG: listening on IPv6 address "::", port 5432
2024-08-14 09:25:58.636
2024-08-14 09:25:58.636 CEST LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-08-14 09:25:58.675
2024-08-14 09:25:58.674 CEST LOG: database system was interrupted; last known up at 2024-08-13 09:36:16 CEST
2024-08-14 09:26:04.262
sh: 1: cannot open /var/lib/postgresql/wal/pg_wal/wal_archive/00000002.history.gz: No such file
2024-08-14 09:26:04.265
2024-08-14 09:26:04.265 CEST LOG: starting point-in-time recovery to 2024-08-14 08:40:00+02
2024-08-14 09:26:04.267
2024-08-14 09:26:04.267 CEST LOG: invalid checkpoint record
2024-08-14 09:26:04.267
2024-08-14 09:26:04.267 CEST PANIC: could not locate a valid checkpoint record
2024-08-14 09:26:04.600
2024-08-14 09:26:04.599 CEST LOG: startup process (PID 41) was terminated by signal 6: Aborted
2024-08-14 09:26:04.600
2024-08-14 09:26:04.599 CEST LOG: aborting startup due to startup process failure
2024-08-14 09:26:04.603
2024-08-14 09:26:04.603 CEST LOG: database system is shut down
Does the recovery_target_time conflict with the crash recovery?
archive_command=test ! -f /var/lib/postgresql/wal/pg_wal/wal_archive/%f.gz && /bin/gzip -c %p > /var/lib/postgresql/wal/pg_wal/wal_archive/%f.gz
restore_command = 'gunzip < /var/lib/postgresql/wal/pg_wal/wal_archive/%f.gz %p'
recovery_target_time = '2024-08-14 08:40:00 Europe/Berlin'
I also tried to copy the saved WALs over the ones in the snapshot, since crash recovery from the snapshot does work, but this only led to complaints about corruption and a premature abort of recovery.
I fail to see why we were able to do recoveries in the past and now can't.
There is one difference: in the test in Kubernetes there was an error in which a separate pod was launched with a script that should delete old WALs from wal_archive. This script was misconfigured and did a pg_basebackup.
The information about that basebackup could possibly be the problem, I hope…
I will rebuild the setup and try with a configuration without the basebackup.
Do you have any suggestions?
I was hoping to do safe snapshots with pg_backup_start()/stop() as Velero pre- and post-hooks, but this seems problematic because of the need to have one session open during the snapshot for start/stop. I might not be able to do this as a pre-/post-hook.
pema83
(23 rep)
Aug 14, 2024, 08:36 AM
0
votes
2
answers
997
views
Postgresql backup with ZFS snapshots: Is pg_start_backup()/pg_stop_backup() necessary?
The title says it all. I have found this from 10 years ago saying that if database data is all in the same snapshot, then `pg_start_backup()` isn't needed. PostgreSQL will start from the snapshot like after a typical crash.
But what if there isn't a single snapshot for the whole database? What if some tablespaces are in other datasets and the WAL is in its own dataset too? This way, the snapshots could be out of sync by a very small period of time. Would this make it necessary to run `pg_start_backup()` to ensure no data corruption?
I have also found this from 8 years ago, by someone testing exactly this (whether PostgreSQL would start again), creating an intentional delay between the WAL and data snapshots, but using virtual machine snapshot technology. So it seems that it can work; the question is, will it _always_ work?
In fact, going one step further, why would `pg_start_backup()` be needed in any circumstance? Isn't WAL replay capable of fixing the internal inconsistencies of a non-instantaneous backup?
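For what it's worth, when all of the datasets share a common parent, ZFS can take the snapshots in one atomic operation, which sidesteps the drift the question describes (the dataset name here is a hypothetical placeholder):

```shell
# A single recursive snapshot covers data, tablespaces and WAL
# atomically -- the child snapshots cannot be out of sync with each other.
zfs snapshot -r tank/postgres@backup
```

This relies on ZFS's documented guarantee that recursive snapshots are created atomically; it does not help when the datasets live in unrelated pools.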
Best regards.
Héctor
(307 rep)
Nov 11, 2022, 05:18 PM
• Last activity: Jul 4, 2024, 09:26 PM
0
votes
1
answers
133
views
Restore from database snapshot
When I do a restore of a database from a `database_snapshot` on a `filestream enabled` database, I am getting the error: >The database cannot be reverted because FILESTREAM BLOBs are present. Is it just not possible to do a restore from snapshot or do I have to specify something extra in the restore...
When I do a restore of a database from a `database_snapshot` on a `filestream enabled` database, I am getting the error:
>The database cannot be reverted because FILESTREAM BLOBs are present.
Is it just not possible to do a restore from snapshot or do I have to specify something extra in the restore command?
Ludo
(66 rep)
Apr 2, 2024, 12:44 PM
• Last activity: Apr 2, 2024, 10:30 PM
7
votes
1
answers
5510
views
What is the proper way to back up MySQL database with rsnapshot?
I want to back up my website that contains both user-uploaded images and data stored in MySQL databases such that they are always consistent with each other at any particular point of time. While searching for solutions, I found this application, rsnapshot, which might be appropriate for the task. H...
I want to back up my website, which contains both user-uploaded images and data stored in MySQL databases, such that they are always consistent with each other at any particular point in time. While searching for solutions, I found this application, rsnapshot, which might be appropriate for the task. Having gone through a few blogs on the internet, I realize that this is actually done with a backup_script entry using the mysqldump method:
backup_script ssh root@example.com "mysqldump -A > /var/db/dump/mysql.sql" unused2
backup root@example.com:/var/db/dump/ example.com/
What I am not sure about is whether it is necessary to flush all tables with a read lock before performing the dump in the script to ensure consistency of the database. If so, how should this be incorporated? If not, why is it not necessary?
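For reference, one common variant (a sketch only, reusing the question's placeholder host and paths, and assuming all tables use a transactional engine such as InnoDB) is to let mysqldump open a consistent snapshot itself rather than taking a global read lock:

```
backup_script	ssh root@example.com "mysqldump -A --single-transaction --quick > /var/db/dump/mysql.sql"	unused2
backup	root@example.com:/var/db/dump/	example.com/
```

`--single-transaction` gives a consistent view only for transactional engines; if MyISAM tables are present, `--lock-all-tables` is the usual fallback, at the cost of blocking writes for the duration of the dump.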
Question Overflow
(1009 rep)
Jan 31, 2013, 07:01 AM
• Last activity: Mar 31, 2024, 12:34 AM
0
votes
0
answers
46
views
How to Preserve related Table Entries for a final composition?
Please bear with me if I don't quite hit the right wording for my question, or perhaps don't understand the problem correctly. I have an application where you can take an abstract version of something, pimp this base version up in the next state and in the final state this something will be somehow m...
Please bear with me if I don't quite hit the right wording for my question, or perhaps don't understand the problem correctly.
I have an application where you can take an abstract version of something, enhance this base version in the next state, and in the final state this something is marked unique to a specific user.
What I want to know is: how do I preserve the related content of the involved tables so that I can always get the final user-unique state, even when some entries (marked via grey boxes) of the related tables change over time?
To give a practical example for demonstration purposes, I have chosen a very simple ER diagram, but you can imagine that in reality there are many more tables involved:
1. In the first state there is `book_abstract`. Only the title is mandatory. The title can change over time (maybe the publisher wants to add a noun marker).
2. The next state is `book_lang_version`, where the book becomes available in different languages and also in different types such as paperback or hardcover. This state relies on `book_abstract` to obtain the *title* and the optional `category`. There might be a *price* set in the **m:n** table, but it's not a must.
3. The final state, `book_purchase_item`, is the user-unique state. This state relies on `book_lang_version` to know which language was chosen, and further on `book_abstract` to obtain its information. The price also has to be set. Lastly, the teaser text may differ due to special circumstances; let's say it's an adult book and government restrictions allow displaying adult teaser content only during certain night hours ;).
And here is my question:
What is an established and common solution to preserve the involved entries, which are scattered across different related tables and may change in the future (I have marked the entries in question with grey boxes), so that I can rebuild the purchased book at any time in the future? I know there are special databases which take snapshots of the operating database, but that is not an option for me. I want to have everything in one database, accessible programmatically at any time. Many thanks for any help.
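One common pattern for this (a sketch only; the schema is a hypothetical reduction of the ER diagram, shown here with SQLite for brevity) is to copy the mutable attributes into the purchase row at purchase time, so that later upstream edits cannot change what the user bought:

```python
import sqlite3

# Hypothetical reduced schema: the purchase row freezes its own copies
# of the attributes that can change upstream.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE book_abstract (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE book_lang_version (id INTEGER PRIMARY KEY,
    abstract_id INTEGER REFERENCES book_abstract(id),
    lang TEXT, type TEXT);
CREATE TABLE book_purchase_item (id INTEGER PRIMARY KEY,
    lang_version_id INTEGER REFERENCES book_lang_version(id),
    title_at_purchase TEXT,    -- frozen copy
    price_at_purchase REAL,    -- frozen copy
    teaser_at_purchase TEXT);  -- frozen copy
INSERT INTO book_abstract VALUES (1, 'The Book');
INSERT INTO book_lang_version VALUES (1, 1, 'en', 'paperback');
""")

# At purchase time, snapshot the current values into the purchase row.
con.execute("""
INSERT INTO book_purchase_item (lang_version_id, title_at_purchase,
                                price_at_purchase, teaser_at_purchase)
SELECT lv.id, ba.title, 9.99, 'teaser text'
FROM book_lang_version lv
JOIN book_abstract ba ON ba.id = lv.abstract_id
WHERE lv.id = 1
""")

# Later the publisher renames the title upstream...
con.execute("UPDATE book_abstract SET title = 'The Book (Novel)' WHERE id = 1")

# ...but the purchased state is unchanged.
row = con.execute("SELECT title_at_purchase FROM book_purchase_item").fetchone()
print(row[0])  # The Book
```

The alternative, keeping full history tables (temporal/versioned rows) for every referenced table, also works but is considerably more machinery; the copy-on-purchase approach keeps everything in one database, as the question requires.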
Slevin
(101 rep)
Feb 6, 2024, 11:38 AM
Showing page 1 of 20 total questions