
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
0 answers
39 views
Checkpoint pages/sec performance counter constantly zero
First of all, I don't have enough reputation to comment on this post, so I am asking here. I have a SQL Server 2022 instance that is the target of an ETL process, receiving several million records every 2 hours. Heavy calculations also run on the same instance, resulting in heavy table truncates and writes right after the ETL. I see heavy disk writes when the ETL starts (the Disk Write Bytes/sec performance counter from SQLServer:Resource Pool Stats), but Checkpoint pages/sec is constantly zero at the same time. Recovery interval is set to zero (automatic). How can I find out why Checkpoint pages/sec is zero? Other performance counters (e.g. Buffer cache hit ratio and PLE) return reasonable values.
Mohammad Javahery (1 rep)
Dec 16, 2024, 05:51 AM • Last activity: Jan 3, 2025, 12:38 PM
8 votes
2 answers
14893 views
PostgreSQL checkpoint log explained
I know what a PostgreSQL checkpoint is and when it happens. I need some additional information about the logs produced by the `log_checkpoints = on` parameter, so please explain some points to me:

2017-09-09 16:31:37 EEST [6428-6524] LOG: checkpoint complete: wrote 30057 buffers (22.9%); 0 transaction log file(s) added, 0 removed, 47 recycled; write=148.465 s, sync=34.339 s, total=182.814 s; sync files=159, longest=16.143 s, average=0.215 s

1. I know that 22.9% of shared buffers were written (I have 1024 MB of shared_buffers, so that means 234 MB were written out).
2. I know that 47 WAL files were recycled, i.e., they are no longer needed for crash recovery, because the real data from them is already on disk.

**Question A**. But what about write=148.465 s and sync=34.339 s? What is the difference? What is write, and why does it take far longer than the fsync() operation?

**Question B**. What are sync files? Which files are they: WAL files? Why are there 159 sync files but only 47 recycled files? What is the relation between these?

Thank you!
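The arithmetic in point 1 can be reproduced with a small sketch. It is only an illustration, not part of the question: it assumes the default 8 kB PostgreSQL block size, and the helper name `parse_checkpoint_line` and its regular expression are hypothetical, written just for this log format.

```python
import re

# The log line quoted in the question.
LOG_LINE = (
    "2017-09-09 16:31:37 EEST [6428-6524] LOG: checkpoint complete: "
    "wrote 30057 buffers (22.9%); 0 transaction log file(s) added, "
    "0 removed, 47 recycled; write=148.465 s, sync=34.339 s, "
    "total=182.814 s; sync files=159, longest=16.143 s, average=0.215 s"
)

BLOCK_SIZE_BYTES = 8192  # assumption: default PostgreSQL block size


def parse_checkpoint_line(line):
    """Extract the numeric fields from a 'checkpoint complete' log line."""
    pattern = re.compile(
        r"wrote (?P<buffers>\d+) buffers \((?P<pct>[\d.]+)%\).*?"
        r"(?P<recycled>\d+) recycled; "
        r"write=(?P<write>[\d.]+) s, sync=(?P<sync>[\d.]+) s, "
        r"total=(?P<total>[\d.]+) s; "
        r"sync files=(?P<sync_files>\d+)"
    )
    match = pattern.search(line)
    if match is None:
        raise ValueError("not a 'checkpoint complete' line")
    fields = {key: float(value) for key, value in match.groupdict().items()}
    # Convert the buffer count into MiB using the assumed block size.
    fields["written_mib"] = fields["buffers"] * BLOCK_SIZE_BYTES / (1024 * 1024)
    return fields


stats = parse_checkpoint_line(LOG_LINE)
print(f"{stats['written_mib']:.1f} MiB written in {stats['write']} s")
```

Under that block-size assumption, the buffer count and the percentage of shared_buffers lead to roughly the same figure the question states.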
inivanoff1 (183 rep)
Sep 9, 2017, 02:11 PM • Last activity: Oct 28, 2024, 07:32 PM
1 votes
1 answers
109 views
why does SQL Server log recovery replay to disk, instead of to RAM
SQL Server’s docs talk about applying committed transactions to data files during crash recovery, much as if it were performing an immediate checkpoint. This is supposed to happen before undoing uncommitted transactions, which must happen to complete the recovery process. Why does it flush committed transactions to data files immediately, instead of just loading the dirty data into the RAM cache (the same way the data was stored in cache + logs prior to the crash) and then letting it be written out as part of the next checkpoint? It seems like this could needlessly lengthen crash recovery time.
Dan (299 rep)
Oct 5, 2024, 09:32 PM • Last activity: Oct 7, 2024, 08:35 PM
1 votes
1 answers
413 views
Behavior of PostgreSQL checkpoint_timeout when the previous checkpoint is still in progress
Dear PostgreSQL Community, I am trying to better understand PostgreSQL checkpoints internally, so I was thinking about this scenario. Let's say we have checkpoint_timeout set to 15 minutes and checkpoint_completion_target set to 0.85. Now suppose one checkpoint, for some reason, takes 20 minutes, for example:

- 1:00 PM: checkpoint1 started
- 1:15 PM: checkpoint timeout fired, but checkpoint1 is still running (checkpoint2 was scheduled here)
- 1:20 PM: checkpoint1 finished

My question is: will the next checkpoint trigger right away at 1:20 PM, or will it be suspended and started at 1:30 PM as initially scheduled? I assume that two checkpoints cannot overlap. (For now, let's not discuss other parameters such as max_wal_size, which can also trigger a checkpoint.)

Does checkpoint1 write ALL the dirty buffers to disk, so that we have a completely clean state at 1:20 PM and there is no need for the second checkpoint to start? As far as I understand, the old checkpoint will not touch pages that were dirtied after it started, so in this case the new checkpoint needs to fire right after the old one finishes.

Also, will the checkpoint_timeout schedule be shifted in any way, for example delayed by as much time as the previous checkpoint was delayed, or is it not alterable? I would be more than happy if you could share more details about the internals.
igelr (2162 rep)
Feb 29, 2024, 01:29 PM • Last activity: Feb 29, 2024, 04:16 PM
0 votes
1 answers
1614 views
How to flush WAL files?
I'm trying to force WAL files to be flushed on to the disk so that I can copy them somewhere else. I believe [CHECKPOINT](https://www.postgresql.org/docs/current/sql-checkpoint.html) is the command I'm interested in but it doesn't seem to work. I have a docker container of Postgres running and I use adminer to add new records to a table I created. I tried to flush the WAL files manually:
$ docker exec db psql pgtest -U postgres -c "CHECKPOINT;"
CHECKPOINT
I believe that should flush the data and I would see a new WAL file created, however there are no new files - I only see the old WAL files. When I restart the container:
$ docker restart db
db
Then I see a new WAL file. This is what my postgresql.conf looks like:
#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

wal_level = logical                   # minimal, replica, or logical
                                      # (change requires restart)
fsync = on                            # flush data to disk for crash safety
                                      # (turning this off can cause
                                      # unrecoverable data corruption)
synchronous_commit = local            # synchronization level;
                                      # off, local, remote_write, remote_apply, or on
wal_sync_method = fsync               # the default is the first option
                                      # supported by the operating system:
                                      #   open_datasync
                                      #   fdatasync (default on Linux)
                                      #   fsync
                                      #   fsync_writethrough
                                      #   open_sync
#full_page_writes = on                # recover from partial page writes
#wal_compression = off                # enable compression of full-page writes
#wal_log_hints = off                  # also do full page writes of non-critical updates
                                      # (change requires restart)
#wal_buffers = -1                     # min 32kB, -1 sets based on shared_buffers
                                      # (change requires restart)
#wal_writer_delay = 200ms             # 1-10000 milliseconds
wal_writer_flush_after = 1MB          # measured in pages, 0 disables

#commit_delay = 0                     # range 0-100000, in microseconds
#commit_siblings = 5                  # range 1-1000

# - Checkpoints -

#checkpoint_timeout = 1min            # range 30s-1d
#max_wal_size = 1GB
#min_wal_size = 4MB
#checkpoint_completion_target = 0.5   # checkpoint target duration, 0.0 - 1.0
#checkpoint_flush_after = 256kB       # measured in pages, 0 disables
#checkpoint_warning = 30s             # 0 disables

# - Archiving -

archive_mode = on                     # enables archiving; off, on, or always
                                      # (change requires restart)
archive_command = 'test ! -f /var/lib/postgresql/data/wal/%f && cp %p /var/lib/postgresql/data/wal/%f'
                                      # placeholders: %p = path of file to archive
                                      #               %f = file name only
                                      # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0                  # force a logfile segment switch after this
                                      # number of seconds; 0 disables
dokgu (123 rep)
Feb 12, 2024, 05:51 PM • Last activity: Feb 12, 2024, 06:31 PM
0 votes
0 answers
99 views
SQL Server 2014 Negative log. Stuck at CHECKPOINT. Unable to do anything
I have a database that has been stuck at **CHECKPOINT** for hours. I have already tried all the solutions suggested on this site, to no avail, and I'm starting to lose faith. Keep in mind that the database is not in recovery, emergency mode, or anything similar; I'm able to query the tables just fine. The only difference I've noticed between my problem and the other questions is that the "available free space" when I try to shrink the log is negative (-0.24MB).

**How did it start?** The server had a task that made a backup of the database at midnight. The problem started when the server ran out of disk space. I've already freed up some space.

**I already checked:**

1. Disk space. It had zero; now it has more than 150GB available (more than enough, since the database is relatively small)
2. No transactions are running
3. No users are connected
4. No replication
5. Database recovery model is **SIMPLE** (it always has been)
6. Max size for the log is unlimited

**I already tried unsuccessfully:**

7. Creating a new log file
8. Increasing the actual log size
9. Shrinking the log file (I've noticed that the "free space" is negative. It says -0.24MB)
10. Repairing with DBCC
11. Restarting the server and the SQL Server service
12. Disabling all tasks (SQL and Windows)
13. Executing the CHECKPOINT command twice
14. Running the query from Troubleshoot a full transaction log, which returns "No checkpoint has occurred since the last log truncation, or the head of the log has not yet moved beyond"

I can't do anything because it always responds with "Transaction log is full due to checkpoint". Any ideas?
Pedro (1 rep)
Jan 2, 2024, 08:36 PM
1 votes
1 answers
735 views
Shared buffers, WAL buffers and Checkpointers
I am taking this EDB PostgreSQL essential course, and the instructor explained the PostgreSQL architecture with reference to a diagram: whenever a client makes an update request, and supposing the data is already present in the shared buffer (so there is no need to fetch it from file storage), Postgres makes an entry in the WAL buffers, and upon *committing* the WAL writer writes the transaction to the transaction logs and makes it permanent, but not yet in the data files (as far as I've understood, that's the task of the *checkpointer*, below). So far so good.

**image courtesy traning.enterprisedb.com**

Now comes the checkpointer: *it is a process which runs after every certain interval of time ("usually 5 mins is an ideal time") and writes anything in the shared buffer into file storage.*

My question is: suppose the checkpointer has just run, and after that I initiate an atomic transaction and transfer 100 bucks to my friend. How is it that my friend can see it immediately? Is Postgres querying the transaction logs? Or how is this happening?

After a little pondering, I realized that when a request is made to update the data, Postgres has to bring it into main memory in order to update it. A viable way to do that is to keep track of dirty data in the shared buffer, update the data in the shared buffer itself, and keep a 0/1 flag with every DML transaction entry in the transaction logs to identify whether the data is present in the shared buffer or not. This could also come in handy for analysis.

Can someone help me understand? Thanks in advance!
commonSense (123 rep)
Nov 25, 2023, 01:40 AM • Last activity: Nov 25, 2023, 03:08 PM
0 votes
1 answers
79 views
Clarification on Automatic Checkpoints and IO Latency Thresholds in SQL Server
I have been reading the SQL Server documentation and came across the following statement about automatic checkpoints:

> Issued automatically in the background to meet the upper time limit suggested by the recovery interval server configuration option. Automatic checkpoints run to completion. Automatic checkpoints are throttled based on the number of outstanding writes and whether the Database Engine detects an increase in write latency above 50 milliseconds.

I'm trying to understand what "Automatic checkpoints run to completion" means. I'm not a native English speaker and this sentence sounds very weird to me. Are there any checkpoints that do not run to completion and only do half of their work?

Additionally, I recall a statement from Paul Randal where he mentioned that the checkpoint process will throttle outstanding IOs if the IO latency is greater than 20ms. During shutdown, this threshold increases to 100ms to expedite the process. However, the SQL Server documentation suggests a threshold of 50 milliseconds for throttling automatic checkpoints. Has there been a change in the threshold value, or am I misunderstanding the concept?

Any clarification on these points would be greatly appreciated. Thank you in advance!
Fajela Tajkiya (1239 rep)
Oct 6, 2023, 04:35 PM • Last activity: Oct 7, 2023, 03:20 AM
1 votes
1 answers
464 views
Postgres 14: cannot change the minimum WAL size (in the WAL directory) after changing the properties min_wal_size and max_wal_size
I'm completely new to Postgres optimization. I have been reading a lot about checkpoints and WAL files. I found a query to examine the ratio of timed checkpoints to requested checkpoints (those triggered by exceeding the maximum WAL size):
SELECT checkpoints_timed, checkpoints_req
FROM pg_stat_bgwriter;
| checkpoints_timed | checkpoints_req |
|-|-|
| 20496 | 53 |

To minimize requested checkpoints, I changed the Postgres configuration to:
checkpoint_timeout = 15min  
max_wal_size = 2GB  
min_wal_size = 200MB
But when I check the WAL directory, its size is still 83MB (as with the default configuration values).
select sum(size)
from pg_ls_waldir();
Can anyone explain why WAL directory size is still ~83MB and not increasing to 200MB? I noted that changing parameters has triggered 2 checkpoints_req so I guess that max_wal_size is still 1GB?
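For reference, the ratio that the first query is meant to examine can be computed from the two counters reported above. This is only a minimal sketch of that arithmetic; the function name `requested_checkpoint_share` is hypothetical and not part of PostgreSQL.

```python
def requested_checkpoint_share(checkpoints_timed, checkpoints_req):
    """Return the fraction of all checkpoints that were requested
    (for example because WAL volume forced one) rather than timed."""
    total = checkpoints_timed + checkpoints_req
    if total == 0:
        return 0.0
    return checkpoints_req / total


# Counter values taken from the pg_stat_bgwriter output in the question.
share = requested_checkpoint_share(20496, 53)
print(f"{share:.2%} of checkpoints were requested")
```

With the values shown, requested checkpoints are a small fraction of the total, which may be useful context when deciding how far max_wal_size needs to be raised.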
BARTOSZ BALA (11 rep)
Apr 28, 2023, 10:08 PM • Last activity: May 1, 2023, 10:37 AM
3 votes
2 answers
612 views
In SQL Server, what's the impact of a checkpoint to log records in log buffer?
Do checkpoint operations flush everything in the log buffer to the log file, or just the log records relating to the dirty pages that are about to be flushed? I found some inconsistent descriptions of this from the top names in the SQL Server industry.

From Kalen Delaney's "Microsoft SQL Server 2012 Internals":

> Checkpoint operations also write log records from transactions in progress to disk because the cached log records are also considered to be dirty.

"from transactions in progress": yes, that's reasonable, since the log records for committed transactions were already written to disk. So it basically means all unflushed log records will be flushed.

From Itzik Ben-Gan's "Understanding log buffer flushes" at https://sqlperformance.com/2018/11/sql-performance/understanding-log-buffer-flushes :

> SQL Server needs to harden dirty data pages, e.g., during a checkpoint process, and the log records representing the changes to those pages were not yet hardened (write ahead logging, or WAL in short)

Are the log records in the log buffer that correspond to the dirty pages the same as all unflushed log records in the log buffer? I'm not sure. Does Itzik's description basically mean the same thing as Kalen's description?
Just a learner (2082 rep)
Apr 19, 2021, 04:38 PM • Last activity: Nov 2, 2022, 01:06 AM
-1 votes
1 answers
568 views
What is the purpose of indirect checkpoint when the default checkpoint time (when recovery time is 0) does a 1 minute checkpoint?
Indirect checkpoint is performed when the target recovery time is configured. When the value is 0, SQL Server does checkpoints such that the default recovery time is 1 minute, whereas if the value is greater than 0 (the recommended value is 60), SQL Server does indirect checkpoints. I'm trying to understand the meaning of indirect checkpoint with an example. Suppose a value of 40 is configured; does that mean SQL Server does checkpoints so that the recovery time is 40 seconds? If so, what is the difference between an automatic and an indirect checkpoint?
variable (3590 rep)
Sep 27, 2022, 04:08 PM • Last activity: Sep 28, 2022, 09:23 AM
0 votes
2 answers
462 views
Keep a history of transactions for easy rollback
Is there a database (either SQL or NoSQL) which allows you to keep a history or log of all past transactions so that you can easily rollback to any given point in time (similar to how git manages a source repository)? Taking a checkpoint after every transaction probably will do the same job, but keeping a history looks more efficient.
Cyker (125 rep)
Nov 26, 2018, 05:17 AM • Last activity: Aug 1, 2022, 11:01 PM
1 votes
1 answers
776 views
What is difference between checkpoint_timeout and checkpoint_completion_target in PostgreSQL?
I am an MSSQL guy and I find it a bit difficult to understand the main purpose of checkpoint_completion_target. I cannot find a comprehensive resource that clearly explains the difference between the checkpoint_timeout and checkpoint_completion_target parameters in PostgreSQL.
Rauf Asadov (1313 rep)
Mar 9, 2022, 10:13 AM • Last activity: Mar 9, 2022, 12:09 PM
0 votes
1 answers
512 views
Recovery of uncommitted transactions after checkpoint
After reading from several resources including:

* COMMIT (Oracle Database SQL Reference)
* Checkpoints (Oracle Programmer Reference)
* https://dba.stackexchange.com/q/124697 (on this site)
* ORACLE CHECKPOINTS by Anju Garg

I have concluded the following:

1. The effects of any transaction are written to the log buffer in memory.
2. Commit flushes the log buffer to the logfiles on disk.
3. On the occurrence of a Checkpoint, DBW writes the dirty blocks to datafiles on the basis of logfile entries.
4. In the event of a crash, the Recovery Manager reads the logfiles (till the last checkpoint) and performs Undo/Redo on the transactions.

I understand that the committed transactions are redone and uncommitted transactions are undone. What I don't understand is how uncommitted transactions are known, given that they exist only in the log buffer in memory, any crash is supposed to wipe out the memory, and the only surviving transactions are those residing in the logfiles on disk (after getting committed). Is something wrong in my understanding?
shantanu4raje (1 rep)
Jun 11, 2021, 07:32 AM • Last activity: Jun 11, 2021, 09:56 AM
5 votes
1 answers
1058 views
Dirty buffer pages after issuing CHECKPOINT
I am currently working on a test system and, due to the nature of the queries I want to optimise, I am trying to simulate a "cold" read as well as I can. Part of that is clearing the buffer cache before performing the queries. From everything I can find, dirty buffer pages are supposed to be written during a checkpoint. However, even after issuing a CHECKPOINT, there still seem to be 169 dirty pages of my database in the buffer pool (assessed via `SELECT * FROM sys.dm_os_buffer_descriptors WHERE database_id=7 AND is_modified=1`). Is there anything I am misunderstanding about checkpoints or the content of sys.dm_os_buffer_descriptors? If not, why do I still have dirty pages after they were supposedly written out?
Florian (341 rep)
Dec 3, 2020, 10:55 AM • Last activity: Dec 4, 2020, 08:28 PM
1 votes
2 answers
850 views
Why was the xlog checkpoint started?
Consider the following config:

shared_buffers = 20GB
max_wal_size = 2GB
min_wal_size = 1GB
checkpoint_completion_target = 0.7
checkpoint_timeout = 5min

and the checkpoint log:

2020-05-25 16:28:37.128 | LOG: checkpoint starting: time
2020-05-25 16:32:07.098 | LOG: checkpoint complete: wrote 21420 buffers (0.8%); 0 WAL file(s) added, 0 removed, 7 recycled; write=209.932 s, sync=0.009 s, total=209.969 s; sync files=295, longest=0.002 s, average=0.000 s; distance=45191 kB, estimate=503334 kB
2020-05-25 16:33:37.165 | LOG: checkpoint starting: time
2020-05-25 16:37:08.001 | LOG: checkpoint complete: wrote 25041 buffers (1.0%); 0 WAL file(s) added, 0 removed, 25 recycled; write=209.842 s, sync=0.865 s, total=210.835 s; sync files=421, longest=0.120 s, average=0.002 s; distance=82180 kB, estimate=461218 kB
2020-05-25 16:38:02.690 | LOG: checkpoint starting: xlog
2020-05-25 16:41:32.181 | LOG: checkpoint complete: wrote 191629 buffers (7.3%); 0 WAL file(s) added, 0 removed, 42 recycled; write=209.239 s, sync=0.049 s, total=209.491 s; sync files=666, longest=0.006 s, average=0.000 s; distance=758070 kB, estimate=758070 kB

Why was the xlog checkpoint that completed at 16:41 started? The distance from the previous checkpoint is just 758070 kB, which is far less than max_wal_size.

**PS** It seems that max_wal_size is spread across several checkpoints, and an xlog checkpoint is started if the value max_wal_size / (2 + checkpoint_completion_target) is exceeded. Am I right?
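To put numbers on the PS, the sketch below simply evaluates the formula guessed there with the settings from the config and compares it with the reported distance. The formula is the asker's unconfirmed hypothesis, not a statement of how PostgreSQL computes the trigger; the variable names are illustrative only.

```python
KB_PER_GB = 1024 * 1024

# Settings copied from the config in the question.
max_wal_size_kb = 2 * KB_PER_GB
checkpoint_completion_target = 0.7
checkpoint_timeout_s = 5 * 60

# Trigger point according to the hypothesis in the PS (an unconfirmed guess).
hypothesised_trigger_kb = max_wal_size_kb / (2 + checkpoint_completion_target)

# Distance reported by the xlog-triggered checkpoint in the log.
reported_distance_kb = 758070

# Product of completion target and timeout, for comparison with the
# write= durations in the log (all close to 210 s).
expected_write_s = checkpoint_completion_target * checkpoint_timeout_s

print(f"hypothesised trigger: {hypothesised_trigger_kb:.0f} kB")
print(f"reported distance:    {reported_distance_kb} kB")
print(f"expected write phase: {expected_write_s:.0f} s")
```

Under this guess the threshold comes out a little above the reported distance, so the numbers are close but do not match exactly; the write durations in the log, on the other hand, sit very near checkpoint_completion_target × checkpoint_timeout.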
sim (149 rep)
May 25, 2020, 03:26 PM • Last activity: May 26, 2020, 12:54 PM
0 votes
2 answers
323 views
Why does SQL Server start REDO phase from minLSN instead of last CHECKPOINT?
While reading an article from BOL, I got stuck on the picture which explains the recovery process of a database (without ADR). Shouldn't *Phase 2: Redo* start from the last CHECKPOINT, since up to that point everything is already in the data file? Why does it start from minLSN?
Rauf Asadov (1313 rep)
Apr 8, 2020, 12:06 PM • Last activity: Apr 8, 2020, 03:26 PM
1 votes
1 answers
2365 views
TempDB Transaction log not releasing space
Every time I complete a transaction in tempdb, the tempdb log grows; however, the log doesn't seem to release the space once the transaction has completed. The log usage percentage is 43.4% but seems to increase in increments of around 4%. When I look for open transactions (sys.dm_exec_sessions), there are none. Yet log_reuse_wait_desc in sys.databases shows 'ACTIVE_TRANSACTION'. When I query the sys.dm_tran_active_transactions DMV, it shows only work tables. I'm unsure how I can release this space from the tempdb transaction log. I've just run a manual checkpoint and that seems to free up the space, so I'm not sure why the space isn't being freed up on its own.
Krishnp92 (19 rep)
Aug 8, 2019, 12:02 PM • Last activity: Aug 8, 2019, 01:56 PM
5 votes
1 answers
701 views
Checkpoints on secondary replica AlwaysOn AG
**Setup**

3-node AlwaysOn cluster - 1 sync and 1 async secondary replica - SQL Server 2012

**Situation**

We are witnessing PageIOLatches when reading from the asynchronous secondary replica. This is mostly caused by the throughput of the SAN, which has been throttled. The hosting partner told us that this limitation cannot immediately be alleviated due to hardware constraints. The primary and synchronous replicas use other SANs with a higher throughput. Although this situation is far from ideal, it is temporary, will be solved soon, and is not the subject of my question.

When investigating the IO waits, we noticed that they occur concurrently with an increase in the number of checkpoint pages/sec. I was under the impression that checkpoints don't occur on secondary replicas in an AG, just as discussed [here](https://social.msdn.microsoft.com/Forums/sqlserver/en-US/9446acb1-b64c-45f8-a5f0-43a551a49466/alwayson-checkpoints-on-secondary-servers?forum=sqldisasterrecovery).

To verify this behaviour, I set up an extended event to monitor the checkpoint events on the asynchronous replica. Just as expected, no checkpoints were captured for this database, nor have I found any checkpoints from other databases that match the pattern.

Next, I created the same extended event on the primary replica and started a perfmon trace to verify whether we could witness the same behaviour. Here we were able to capture (automatic) checkpoints; they happen approximately once per minute. These checkpoints occur simultaneously with the checkpoint pages/sec increase on our secondary (and primary) replica. It seems that checkpoints are being generated on the primary and redone on the secondary replicas. This would mean that checkpoints do occur implicitly on secondary replicas in an AG.

**Question**

Is my assumption correct that in an AG, checkpoints are generated on the primary replica and redone on all secondary replicas? And thus, if the database TARGET_RECOVERY_TIME isn't set, will the recovery interval setting of the primary replica dictate the checkpoints on all secondary replicas for these databases?
Thomas Costers (738 rep)
Jul 16, 2019, 07:39 AM • Last activity: Jul 16, 2019, 03:16 PM
-3 votes
1 answers
154 views
System failure causes inconsistency between two transactions
Let's say the picture above shows transaction 4 for user A. A updated a record in the Account table, changing the balance from 100 to 0; then a checkpoint occurred, so all dirty pages were written to disk. At time t, user B checked the Account table, added the records whose balance was zero to the Audit table, and committed immediately before the system failure. Then a system failure occurred, so transaction 4 was rolled back, and the balance for account id 1234 was 100 again. This account id shouldn't be in the Audit table, but it was, because user B added it. How can this inconsistency be tackled?
slowjams (223 rep)
Jul 3, 2019, 12:02 AM • Last activity: Jul 3, 2019, 01:09 PM
Showing page 1 of 20 total questions