Report Uptime Metrics for SQL Server by Database/Application
I have a requirement to report monthly metrics on SQL Server uptime, by application/database. This would be at the cluster level; i.e., if a secondary replica went offline but a primary was still available to process transactions, uptime would still be considered 100%.
I've looked at various tools that might serve this purpose, and a few come close, but none of them seem to capture anything more than whether the SQL Server service is online and accepting connections. They also fail to aggregate these metrics at the cluster/AG level, meaning the uptime reports would take a hit if a secondary replica were to go offline.
For example, let's say a database goes offline or a log file fills up, and transactions are unable to process against a single database. Those tools would say that SQL Server is up, but I would still have people saying it was a database issue. Thus, these metrics would need to reflect that SQL was not fully up at that time.
The best idea I've come up with at this point is to create a SQL Agent job that inserts a record into a Canary table in each database once a minute. Then, at the end of the month, I query that table and divide the previous month's row count by the expected row count. I figured there was no better way to prove a database was actually available than to try to insert a row.
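To make the arithmetic concrete, here is a minimal Python sketch of the month-end calculation (observed canary rows divided by expected rows). It is only an illustration, not the actual Agent job or report: the function names, the `interval_minutes` parameter, and the cap at 100% are assumptions made for the example, and it assumes the canary timestamps are kept in UTC so that every day has exactly 1440 minutes.

```python
import calendar


def expected_heartbeats(year: int, month: int, interval_minutes: int = 1) -> int:
    """Number of canary rows expected in a month at a fixed insert interval.

    Assumes UTC timestamps, so no daylight-saving adjustment is needed.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return days_in_month * 24 * 60 // interval_minutes


def uptime_percent(observed_rows: int, year: int, month: int,
                   interval_minutes: int = 1) -> float:
    """Observed canary rows as a percentage of the expected row count.

    The result is capped at 100 in case of duplicate or extra rows
    (a hypothetical safeguard, not part of the original design).
    """
    expected = expected_heartbeats(year, month, interval_minutes)
    return min(observed_rows / expected, 1.0) * 100.0


if __name__ == "__main__":
    # February 2022 has 28 days, so 40320 one-minute heartbeats are expected.
    print(expected_heartbeats(2022, 2))
    # 20 missed inserts in the month.
    print(round(uptime_percent(40300, 2022, 2), 4))
```

With a one-minute interval, each missed insert costs roughly 0.0025% of a 28-day month, which also shows the resolution of this approach: an outage shorter than the insert interval may not be recorded at all.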
I already have the above solution developed, tested, and working, but I'm curious whether anyone knows of a better way to do this. Are there any good tools or DMVs I may have overlooked that I could use to extrapolate end-user availability metrics for all databases on a SQL Server instance?
Asked by Brendan McCaffrey
(3444 rep)
Mar 4, 2022, 10:14 PM
Last activity: Mar 5, 2022, 12:22 AM