Backups/CHECKDB errors - "Cannot find server certificate"
4
votes
1
answer
658
views
I have been exchanging TDE certificates on various servers to make restores easier - some had the wrong name, the wrong thumbprint, or even both. This involves transferring databases to a temporary cert, dropping old certs, creating the intended cert (from the certificate & key files), and transferring databases to it. This has all gone fine and dandy, except for databases on *one* server.
On executing
DBCC CHECKDB(SomeDatabase) WITH NO_INFOMSGS, ALL_ERRORMSGS, DATA_PURITY
, I receive this:
Msg 33111, Level 16, State 3, Line 106
Cannot find server certificate with thumbprint ''.
Msg 1823, Level 16, State 2, Line 106
A database snapshot cannot be created because it failed to start.
Msg 1823, Level 16, State 8, Line 106
A database snapshot cannot be created because it failed to start.
Msg 7928, Level 16, State 1, Line 106
The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.
If rerun with info messages, it does the expected CHECKDB behavior afterwards, so it appears to get the necessary access for offline.
Backups have a similar output. I typically use the SSMS UI for ad hoc backups, which scripts out as this command:
BACKUP DATABASE [SomeDatabase] TO DISK = N'some\path\SomeDatabase.bak' WITH DIFFERENTIAL , NOFORMAT, NOINIT, NAME = N'SomeDatabase-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
. This results in:
Msg 33111, Level 16, State 3, Line 108
Cannot find server certificate with thumbprint ''.
Msg 3013, Level 16, State 1, Line 108
BACKUP DATABASE is terminating abnormally.
The same thing happens with full and copy-only backups.
Observing the encryption status:
SELECT a.dbid, a.name AS DatabaseName
,b.encryption_state
,CASE b.encryption_state
WHEN 0 THEN 'No database encryption key present, no encryption'
WHEN 1 THEN 'Unencrypted - Encryption enabled, but not turned on'
WHEN 2 THEN 'Encryption in progress'
WHEN 3 THEN 'Encrypted'
WHEN 4 THEN 'Key change in progress'
WHEN 5 THEN 'Decryption in progress'
WHEN 6 THEN 'Protection change in progress (The certificate or asymmetric key that is encrypting the database encryption key is being changed.)'
END AS encryption_state_desc
,percent_complete
,encryptor_type
,key_algorithm
,key_length
,encryptor_thumbprint
,create_date
,regenerate_date
,modify_date
,set_date
,opened_date
FROM master.dbo.sysdatabases a
LEFT JOIN sys.dm_database_encryption_keys b
ON a.dbid = b.database_id
ORDER BY b.encryption_state desc,
DatabaseName;
The failing databases are in state 1 (database encryption key created but encryption not turned on) and show the new TdeCert thumbprint as the encryptor_thumbprint
column. There are a few databases that are in state 3 (encrypted), and those will happily back up without the old certificate. If I simply put the certificate back on the server, then CHECKDB/backups will happily execute - no additional commands against the database are required.
### Questions ###
* Why is it still trying to use the old certificate? I thought that since the database encryption key was encrypted by the new certificate, there would be no need for the old certificate anymore.
* Why does simply putting the old cert back on the server make things work again? I would have expected to have to change something in the database again.
* Why isn't there an error when I remove the old certificate if something is using it?
* Is there a likely reason that I encountered this behavior on only one server?
### General information ###
- Version: Microsoft SQL Server 2019 (RTM-CU27) (KB5037331) - 15.0.4375.4 (X64) ... Standard Edition
- TDE is set up with symmetric keys, no external key vaults.
- Everything is on-prem, no Azure/cloud factors.
- This is a non-prod server; all databases use the SIMPLE recovery model (no transaction logs).
- These errors occur in automated jobs as well as statements executed directly in SSMS. Both myself and the account used for jobs have sysadmin on the server.
- This is happening on a *single* server. Every other server I have exchanged certificates on (including ones on the same version) has not had this problem, and I cannot replicate this problem on other servers.
- The old cert had the same name as the certificate I would like it to use. I wouldn't expect this to matter - it didn't matter to other servers - but given that something seems to be "stuck", it may be relevant.
### Tests ###
* Failure: CHECKDB outputs an error message and goes to work offline
* Success: No error message, commands execute successfully
1. Fully enable encryption (ALTER DATABASE SomeDatabase SET ENCRYPTION ON) - **failure**
2. Just bring the old cert back, don't alter any databases - **success**
1. But as soon as the old cert is dropped again, it goes back to **failure**
3. Change database encryption key to something else, then run CHECKDB
1. New temporary certificate - **failure**
2. Restoring the desired certificate and changing the DEK to that cert - **success**
3. Transfer back to temporary cert while restored cert is still present - **success**
4. Drop old/restored cert - **failure**
4. Regenerate database encryption key - **failure**
5. Get rid of the new standard TdeCert certificate and rerun the transfer script - still a **failure**
5. Drop the database encryption key, then run CHECKDB - **success**
1. Create a new database encryption key with any cert - **success**
### Related items that didn't solve my problem ###
- [This question](https://dba.stackexchange.com/questions/196364/backup-errors-after-tde-certificate-expired-and-new-cert-was-created-with-always) demonstrates something quite similar, but doesn't seem relevant as I am on a different version, several releases after the described fix.
- [This Microsoft article](https://support.microsoft.com/en-us/topic/kb4052134-fix-errors-33111-and-3013-when-backing-up-tde-encrypted-database-in-sql-server-551fd963-1ed2-ba49-30aa-a017e29b7d98) describing an error where "Cannot find server certificate" occurred when specifying COMPRESSION and MAXTRANSFERSIZE options. The backup command I am using does not specify these options, and the described issue was resolved in older versions.
- There are many questions and articles out there about encountering "Cannot find server certificate" when attempting a restore, none of which appear relevant as I am not doing a restore.
Asked by ImperviousInclemency
(107 rep)
Jul 3, 2024, 02:04 PM
Last activity: Jul 3, 2024, 09:04 PM
Last activity: Jul 3, 2024, 09:04 PM