Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes

2 answers

225 views

Query store switches to read only under load

sql-server sql-server-2022 query-store

I have big database with the query store on. However, under load, the query stores switches to read only mode, meaning that that we lack data on what was actually happening. Is there way to override or work around this behavior? ## Details ### @@VERSION: * Microsoft SQL Server 2022 (RTM-CU12) (KB5033663) - 16.0.4115.5 (X64) Mar 4 2024 08:56:10 Copyright (C) 2022 Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2019 Standard 10.0 (Build 17763: ) (Hypervisor) ### Current output of sys.database_query_store_options * desired_state: 2 * desired_state_desc: READ_WRITE * actual_state: 1 * actual_state_desc: READ_ONLY * readonly_reason: 262144 * current_storage_size_mb: 732 * flush_interval_seconds: 900 * interval_length_minutes: 15 * max_storage_size_mb: 65536 * stale_query_threshold_days: 1 * max_plans_per_query: 20 * query_capture_mode: 4 * query_capture_mode_desc: CUSTOM * capture_policy_execution_count: 1000 * capture_policy_total_compile_cpu_time_ms: 1000 * capture_policy_total_execution_cpu_time_ms: 120000 * capture_policy_stale_threshold_hours: 1 * size_based_cleanup_mode: 1 * size_based_cleanup_mode_desc: AUTO * wait_stats_capture_mode: 1 * wait_stats_capture_mode_desc: ON * actual_state_additional_info: ### Waits

SELECT
    dows.*,
    avg_ms_per_wait =
        CONVERT
        (
            decimal(38, 2),
            dows.wait_time_ms /
            NULLIF
            (1.* dows.waiting_tasks_count, 0.)
        )
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type LIKE N'QDS[_]%'
AND   dows.waiting_tasks_count > 0
ORDER BY
    dows.wait_time_ms DESC;

| wait_type | waiting_tasks_count | wait_time_ms | max_wait_time_ms | signal_wait_time_ms | avg_ms_per_wait | |-------------------------------|---------------------|--------------|------------------|---------------------|-----------------| | QDS_PERSIST_TASK_MAIN_LOOP_SLEEP | 18709 | 1122619927 | 60871 | 11675 | 60004.27 | | QDS_ASYNC_QUEUE | 307974 | 1121663651 | 189002 | 821494 | 3642.07 | | QDS_LOADDB | 33 | 64934 | 6420 | 64 | 1967.70 | | QDS_STMT | 12898 | 35639 | 746 | 2987 | 2.76 | | QDS_DYN_VECTOR | 126 | 111 | 7 | 50 | 0.88 | | QDS_ASYNC_PERSIST_TASK_START | 1 | 0 | 0 | 0 | 0.00 | | QDS_TASK_START | 1 | 0 | 0 | 0 | 0.00 |

Yano_of_Queenscastle (1998 rep)

Jun 13, 2025, 07:26 AM • Last activity: Jun 25, 2025, 12:20 AM

0 votes

1 answers

199 views

Query store data maintenance background tasks

sql-server-2016 query-store

Query store has been running for years on the server without any noticeable disk IO spikes on log backups. We run on SQL Server 2016 EE version 13.0.5622.0 Recently we experienced a sudden surge in disk I/O every twenty minutes or so. We backup logs every five minutes, and every third or fourth back...

                                  
Query store has been running for years on the server without any noticeable disk IO spikes on log backups. We run on SQL Server 2016 EE version 13.0.5622.0

Recently we experienced a sudden surge in disk I/O every twenty minutes or so. We backup logs every five minutes, and every third or fourth backup would increase dramatically (from ~100 MB to >3 GB in size).

It happened at 2 AM without any changes being done to the server.

I dumped the contents of some of the log files and noticed a huge surge in swappages on the Query store system data tables when these log backup IO spikes occured. Disabling query store entirely caused the disk IO to resume at normal levels when backing up the log.

I presume the reason for the large transaction log backups would be online index rebuilds on the query store tables. We had configured query store to capture all transactions, auto cleanup, retain data for 180 days and max the data cache to 15 GB.

So far so good. I re-enabled query store after changing the configuration to auto, auto, 120 days and 8 GB, but now we are starting to see the same type of spikes in disk IO every 20 minutes or so when the transaction log is being backed up.

Have anyone had an experience like this and what did they do to fix the issue?

Rasmus Remmer Bielidt (11 rep)

May 21, 2021, 07:03 AM • Last activity: Jun 22, 2025, 06:07 PM

1 votes

0 answers

73 views

What's the best way to track table usage across multiple schemas?

sql-server audit query-store

I'm working with a large SQL Server database that contains approximately 1000 tables across several schemas. I need to determine which of these tables are actually being used in user queries, stored procedures, views, functions, and ad-hoc queries. What I've considered so far: - SQL Server Audit - S...

                                  I'm working with a large SQL Server database that contains approximately 1000 tables across several schemas. I need to determine which of these tables are actually being used in user queries, stored procedures, views, functions, and ad-hoc queries.
What I've considered so far:
 - SQL Server Audit - Setting up auditing to track table access, but I'm concerned about the performance impact and log volume with so many tables.
 - Query Store - I'm wondering if Query Store can help identify which tables are being referenced in queries, but I'm not sure if it can track all the objects I need across schemas.
My goal is to generate a report showing which tables are actively used and which ones might be candidates for archiving or removal. I need to track all types of access including: SELECT/INSERT/UPDATE/DELETE operations, References in views, Usage in stored procedures and functions, Ad-hoc queries from applications

Is Query Store the right approach for this, or would SQL Server Audit be better? Are there other options I should consider for tracking table usage at this scale with minimal performance impact?

Thanks in advance for any advice or recommendations!

                                

adam.g (465 rep)

Jun 16, 2025, 10:19 AM • Last activity: Jun 16, 2025, 01:06 PM

4 votes

1 answers

433 views

Query Store Plan Forcing is Failing with OPTIMIZATION_REPLAY_FAILED

sql-server azure-sql-database query-store

I am trying to use Query Store to force a particular plan in an Azure SQL DB database, and it's not obeying me! I've checked in `sys.query_store_plan`, and it has a column which describes the reason for the failure. Some of these are pretty straightforward to understand (e.g. "NO_INDEX: Index specif...

                                  I am trying to use Query Store to force a particular plan in an Azure SQL DB database, and it's not obeying me! I've checked in sys.query_store_plan, and it has a column which describes the reason for the failure. Some of these are pretty straightforward  to understand (e.g. "NO_INDEX: Index specified in plan no longer exists"). However, the error message that I get is OPTIMIZATION_REPLAY_FAILED, which is described in the documentation as "The optimization replay script failed to execute." 

I would like to figure out why the optimization replay script failed to execute, but I can't find anything out there that really says anything more detailed that what I've wrote above. How I can figure out what is stopping my plan from being forced?

Andy (151 rep)

Jun 26, 2024, 01:27 PM • Last activity: Apr 22, 2025, 03:03 PM

0 votes

0 answers

45 views

How to get info from Query Store for ONLY scans and seeks for a specific index, and NOT for when index is just updated

sql-server xml query-store

I'm am finding indexes to delete, and have come up with some scripts that are really useful, in terms of figuring out when an index is used. There's 2 sql statements, the first creates a temp table called #Main with query store info, formatted nicely. The second queries #Main for usage of the specif...

-- This puts the last 2 days worth of query store data in a temp table
if object_id('tempdb..#Main') is not null drop table tempdb..#Main
SELECT 
    ObjectName = ISNULL(OBJECT_NAME(query_store_query.object_id), 'AdHoc Query ') 
    ,QueryID = query_store_query.Query_ID
    ,PlanID = query_store_plan.Plan_ID
    ,Duration = ROUND(CONVERT(FLOAT, SUM(query_store_runtime_stats.avg_duration * query_store_runtime_stats.count_executions)) * 0.001, 0) 
    ,CPU = SUM(query_store_runtime_stats.avg_cpu_time * query_store_runtime_stats.count_executions)
    ,QueryPlan = query_store_plan.query_plan
Into #Main
FROM sys.query_store_runtime_stats query_store_runtime_stats
    JOIN sys.query_store_plan query_store_plan
        ON query_store_plan.plan_id = query_store_runtime_stats.plan_id
    JOIN sys.query_store_query query_store_query
        ON query_store_query.query_id = query_store_plan.query_id
    JOIN sys.query_store_query_text query_store_query_text
        ON query_store_query.query_text_id = query_store_query_text.query_text_id
WHERE NOT (
    query_store_runtime_stats.first_execution_time > getdate()
    OR query_store_runtime_stats.last_execution_time

If it's just updated, the xml of the query plan looks like this. There's usually some other indexes being updated as well:

My XML skills are not good enough to extract JUST the seeks/scans, and not the updates. Can anyone help out?

Sylvia (213 rep)

Mar 28, 2025, 07:09 PM • Last activity: Mar 28, 2025, 08:13 PM

3 votes

1 answers

77 views

SQL Server Query Store goes into error state with FILESTREAM garbage collection error

sql-server-2019 filestream query-store

I have a database on a SQL Server 2019 instance which has query store enabled and has done for many years All of a sudden, the query store state returned by SELECT actual_state_desc FROM sys.database_query_store_options went to > ERROR so I re-enabled it ALTER DATABASE [BOB] SET QUERY_STORE (OPERATI...

                                  I have a database on a SQL Server 2019 instance which has query store enabled and has done for many years

All of a sudden, the query store state returned by 

    SELECT actual_state_desc FROM sys.database_query_store_options

went to 

> ERROR

so I re-enabled it

    ALTER DATABASE [BOB] SET QUERY_STORE (OPERATION_MODE = READ_WRITE)

and all was fine. However, a few hours later, the same thing happened and I noted the following message in the SQL Server log around the same time

> Internal FILESTREAM error: failed to access the garbage collection table.

It seems the two are related, though I am not sure how exactly.

Our database doesn't have any filestream files:

    SELECT	COUNT(*)
    FROM	sys.database_files
    WHERE	type_desc = 'FILESTREAM'

returns 0

I have re-enabled query store once again but would like to understand what is happening here to prevent this happening again in the future.

Can anyone advise what is happening and how I may fix it?

SE1986 (2182 rep)

Mar 5, 2025, 10:37 AM • Last activity: Mar 6, 2025, 11:11 AM

0 votes

1 answers

1214 views

SSMS Query store keeps getting disabled

sql-server ssms query-store

I activated query store about a month ago with the settings below. Now, every morning starting up SSMS I find the Query store folder gone and query store disabled. Initially I used 1 Gb size, but changed it to five once it started disabling. This did not alleviate the problem. I found nothing about...

                                  I activated query store about a month ago with the settings below. Now, every morning starting up SSMS I find the Query store folder gone and query store disabled. Initially I used 1 Gb size, but changed it to five once it started disabling. This did not alleviate the problem. I found nothing about this with google. I'm using SSMS 18.9.2. and SQL Server 14.0.3411.3. What could be causing this problem? Last server reboot was over a week ago and I've activated it twice since, so this does not seem to be the cause.

mpn275 (11 rep)

Sep 27, 2021, 06:01 AM • Last activity: Feb 27, 2025, 12:07 PM

1 votes

0 answers

51 views

Calculate total compile time spent in an runtime interval from query store data

sql-server query-performance query-store

I'm trying to create a query to fill a bar graph to show the amount of - query execution time - CPU time - Plan compile time per runtime interval. The first 2 are easy, because the are already stored per runtime_stats_interval in query store. The compile time is more difficult since it's stored per...

SELECT IntervalID = rs.runtime_stats_interval_id, -- just the interval id
    IntervalStart = rsi.start_time,  -- Start of the 10-minute interval
    ExecCount     = SUM(rs.count_executions),  -- Total executions in the interval
	TotalExecTime_s = SUM(rs.avg_duration * rs.count_executions) / 1000000.0,  -- Total execution time in seconds
	TotalCPUTime_s = SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0,  -- Total CPU time in seconds
    TotalCompileTime_s = SUM(
        CASE --when the inital compilation time and the last_compilation_time both fall within the runtime_stats_interval us it.
            WHEN p.initial_compile_start_time >= rsi.start_time  
              AND p.last_compile_start_time < rsi.end_time 
            THEN p.count_compiles * p.avg_compile_duration 
            ELSE 0 --if it doesn't don't use it.
        END
    ) / 1000000.0  -- Total compile time in seconds
FROM sys.query_store_runtime_stats AS rs
JOIN sys.query_store_runtime_stats_interval AS rsi 
    ON rs.runtime_stats_interval_id = rsi.runtime_stats_interval_id
JOIN sys.query_store_plan AS p
    ON rs.plan_id = p.plan_id
WHERE rs.execution_type = 0  -- Only successful executions
GROUP BY rs.runtime_stats_interval_id, rsi.start_time
ORDER BY 2 desc;

Figuring that if the plan compiled within the runtime stats interval I can safely use the compile time. It give this result:

And I was happy with that. Until I was manually checking some plans that did execute **within** the runtime stat interval but were compiled **before** the runtime stats interval:

At first I thought, no problem, this plan was compiled on the 18th of january, it's now being reused. Great. However what I don't understand is that I don't see the plan in query_runtime_stats before the 2025-02-06. Since it was compiled on the 18th of januari the compiletime will never be added in my query above. Another scenario is when a single plan is being (re)compiled many times in a much wider interval then the runtime_stats_interval (mine is 10 minutes). for example:

the first plan is constantly being compiled between 2024-11-30 and 2025-01-18. Compiletime from this plan is not being added to the query as well. Are there any solutions for these 2 scenarios?

Edward Dortland (2865 rep)

Feb 6, 2025, 09:26 PM

1 votes

2 answers

613 views

SQL Server Extended Event Session for Attention is missing field sql_text

sql-server extended-events query-store

I have an extended event session set up to track Attention events, to monitor procedures that timed out. This is the create script: CREATE EVENT SESSION [ApplicationTimeout3] ON SERVER ADD EVENT sqlserver.attention( ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.database_name,s...

                                  I have an extended event session set up to track Attention events, to monitor procedures that timed out. This is the create script:

    CREATE EVENT SESSION [ApplicationTimeout3] ON SERVER 
    ADD EVENT sqlserver.attention(
    ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.database_name,sqlserver.session_id,sqlserver.sql_text,sqlserver.tsql_frame,sqlserver.tsql_stack,sqlserver.username)
        WHERE ([package0].[equal_boolean]([sqlserver].[is_system],(0))))
    ADD TARGET package0.event_file(SET filename=N'D:\ExtendedEvents\ApplicationTimeout3.xel',max_file_size=(100),max_rollover_files=(5))
    WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
    GO

The problem is that in the timeouts that I get, I do NOT get the sql_text. (Nor do I get tsql_frame or tsql_stack). It's not returned when I do the "Watch Live Data".

I do SEE the timeouts, they occur at the same time as the ones I see when I query against sys.query_store_runtime_stats (with execution_type_desc = 'Aborted'). However, I get a limited set of fields back, and they're not the ones I need.

Also incidentally, there are many MORE events logged via the Extended Event than what I see when I query sys.query_store_runtime_stats. I have no idea what they are.

Any clues as to how I could actually get the sql that's being executed? I can get the stored procedure name by querying the query store tables. What I'd like to get is the parameters that the stored procedure is called with.

 
                                

Sylvia (213 rep)

Nov 29, 2023, 08:06 PM • Last activity: Jan 11, 2025, 02:08 PM

6 votes

1 answers

445 views

Query Store plan force fails with NO_PLAN dependent on where filter operator is in plan

sql-server sql-server-2016 execution-plan query-store hints

I have a query which I force a plan for in Query Store (the plan is the one SQL Server compiled for this query) If I run the query immediately afer forcing the plan, I get the `NO_PLAN` last_force_failure_reason_desc despite no changes to the database. I can force a different plan for the same query...

                                  I have a query which I force a plan for in Query Store (the plan is the one SQL Server compiled for this query) If I run the query immediately afer forcing the plan, I get the NO_PLAN last_force_failure_reason_desc despite no changes to the database. I can force a different plan for the same query successfully

The problem can be illustrated below:

Create our test database

    USE [master]
    CREATE DATABASE NO_PLAN
    ALTER DATABASE [NO_PLAN] SET QUERY_STORE = ON
    ALTER DATABASE [NO_PLAN] SET QUERY_STORE (OPERATION_MODE = READ_WRITE, QUERY_CAPTURE_MODE = ALL)
    GO
    
    USE NO_PLAN
    GO
    IF EXISTS (SELECT 1 FROM sys.tables WHERE name = 'MyTableA') DROP TABLE MyTableA
    IF EXISTS (SELECT 1 FROM sys.tables WHERE name = 'MyTableB') DROP TABLE MyTableB
    
    /* create  our tables */
    CREATE TABLE [dbo].[MyTableA](
        [Column1] VARCHAR(50) NULL ,
        [Column2] VARCHAR(255) NULL ,
        [Column3] INT NULL ,
        [Column4] DATETIME NULL ,
        [Column5] INT NULL ,
        [Column6] VARCHAR(50) NULL ,
        [Column7] VARCHAR(255) NULL ,
        [Column8] INT NULL ,
        [Column9] DATETIME NULL ,
        [Column10] INT NULL ,
        [Column11] INT NULL ,
        [Column12] DATETIME NULL ,
        [Column13] VARCHAR(50) NULL ,
        [Column14] VARCHAR(50) NULL ,
        [Column15] DATETIME NULL ,
        [Column16] DATETIME NULL ,
        [Column17] VARCHAR(8) NULL ,
        [Column18] DATETIME NULL ,
        [Column19] INT NULL ,
        [Column20] INT NULL ,
        [Column21] VARCHAR(50) NULL ,
        [Column22] VARCHAR(255) NULL ,
        [Column23] VARCHAR(50) NULL ,
        [Column24] VARCHAR(255) NULL ,
        [Column25] VARCHAR(50) NULL ,
        [Column26] INT NULL ,
        [Column27] INT NULL ,
        [Column28] INT NULL ,
        [Column29] INT NULL ,
        [Column30] INT NULL ,
        [Column31] INT NULL ,
        [Column32] INT NULL ,
        [Column33] INT NULL ,
        [Column34] INT NULL ,
        [Column35] VARCHAR(50) NULL ,
        [Column36] VARCHAR(50) NULL ,
        [Column37] VARCHAR(50) NULL ,
        [Column38] VARCHAR(50) NULL ,
        [Column39] VARCHAR(255) NULL ,
        [Column40] INT NULL ,
        [Column41] VARCHAR(50) NULL ,
        [Column42] INT NULL ,
        [Column43] VARCHAR(255) NULL ,
        [Column44] INT NULL ,
        [Column45] VARCHAR(255) NULL ,
        [Column46] INT NULL ,
        [Column47] DATETIME NULL ,
        [Column48] DATETIME NULL ,
        [Column49] DATETIME NULL ,
        [Column50] INT NULL ,
        [Column51] VARCHAR(50) NULL ,
        [Column52] VARCHAR(255) NULL ,
        [Column53] VARCHAR(50) NULL ,
        [Column54] VARCHAR(255) NULL ,
        [Column55] VARCHAR(50) NULL ,
        [Column56] VARCHAR(255) NULL ,
        [Column57] VARCHAR(50) NULL ,
        [Column58] VARCHAR(50) NULL ,
        [Column59] CHAR NULL ,
        [Column60] CHAR NULL ,
        [Column61] CHAR NULL ,
        [Column62] CHAR NULL ,
        [Column63] CHAR NULL ,
        [Column64] CHAR NULL ,
        [Column65] CHAR NULL ,
        [Column66] CHAR NULL ,
        [Column67] CHAR NULL ,
        [Column68] CHAR NULL ,
        [Column69] CHAR NULL ,
        [Column70] CHAR NULL ,
        [Column71] CHAR NULL ,
        [Column72] CHAR NULL ,
        [Column73] CHAR NULL ,
        [Column74] CHAR NULL ,
        [Column75] CHAR NULL ,
        [Column76] DATETIME NULL ,
        [Column77] INT NULL ,
        [Column78] INT NULL ,
        [Column79] VARCHAR(50) NULL ,
        [Column80] VARCHAR(255) NULL ,
        [Column81] VARCHAR(50) NULL ,
        [Column82] VARCHAR(255) NULL ,
        [Column83] VARCHAR(50) NULL ,
        [Column84] VARCHAR(255) NULL ,
        [Column85] VARCHAR(50) NULL ,
        [Column86] VARCHAR(255) NULL ,
        [Column87] VARCHAR(50) NULL ,
        [Column88] VARCHAR(255) NULL ,
        [Column89] VARCHAR(50) NULL ,
        [Column90] VARCHAR(255) NULL ,
        [Column91] VARCHAR(50) NULL ,
        [Column92] VARCHAR(255) NULL ,
        [Column93] VARCHAR(50) NULL ,
        [Column94] VARCHAR(255) NULL ,
        [Column95] VARCHAR(50) NULL ,
        [Column96] VARCHAR(255) NULL ,
        [Column97] VARCHAR(50) NULL ,
        [Column98] VARCHAR(255) NULL ,
        [Column99] VARCHAR(50) NULL ,
        [Column100] VARCHAR(255) NULL ,
        [Column101] VARCHAR(50) NULL ,
        [Column102] VARCHAR(255) NULL ,
        [Column103] VARCHAR(50) NULL ,
        [Column104] VARCHAR(255) NULL ,
        [Column105] VARCHAR(50) NULL ,
        [Column106] VARCHAR(255) NULL ,
        [Column107] VARCHAR(50) NULL ,
        [Column108] VARCHAR(50) NULL ,
        [Column109] VARCHAR(50) NULL ,
        [Column110] VARCHAR(255) NULL ,
        [Column111] VARCHAR(50) NULL ,
        [Column112] VARCHAR(255) NULL ,
        [Column113] VARCHAR(50) NULL ,
        [Column114] VARCHAR(255) NULL ,
        [Column115] VARCHAR(50) NULL ,
        [Column116] VARCHAR(255) NULL ,
        [Column117] VARCHAR(50) NULL ,
        [Column118] VARCHAR(255) NULL ,
        [Column119] VARCHAR(50) NULL ,
        [Column120] VARCHAR(50) NULL ,
        [Column121] VARCHAR(255) NULL ,
        [Column122] VARCHAR(50) NULL ,
        [Column123] VARCHAR(255) NULL ,
        [Column124] VARCHAR(50) NULL ,
        [Column125] VARCHAR(255) NULL ,
        [Column126] VARCHAR(50) NULL ,
        [Column127] VARCHAR(255) NULL ,
        [Column128] VARCHAR(50) NULL ,
        [Column129] VARCHAR(255) NULL ,
        [Column130] VARCHAR(50) NULL ,
        [Column131] VARCHAR(255) NULL ,
        [Column132] DATETIME NULL ,
        [Column133] VARCHAR(50) NULL ,
        [Column134] VARCHAR(255) NULL ,
        [Column135] VARCHAR(50) NULL ,
        [Column136] INT NULL ,
        [Column137] VARCHAR(50) NULL ,
        [Column138] VARCHAR(255) NULL ,
        [Column139] VARCHAR(50) NULL ,
        [Column140] VARCHAR(255) NULL ,
        [Column141] VARCHAR(50) NULL ,
        [Column142] VARCHAR(255) NULL ,
        [Column143] VARCHAR(50) NULL ,
        [Column144] VARCHAR(255) NULL ,
        [Column145] VARCHAR(50) NULL ,
        [Column146] VARCHAR(255) NULL ,
        [Column147] VARCHAR(50) NULL ,
        [Column148] VARCHAR(255) NULL ,
        [Column149] VARCHAR(50) NULL ,
        [Column150] VARCHAR(255) NULL ,
        [Column151] VARCHAR(50) NULL ,
        [Column152] VARCHAR(255) NULL ,
        [Column153] VARCHAR(50) NULL ,
        [Column154] VARCHAR(255) NULL ,
        [Column155] VARCHAR(50) NULL ,
        [Column156] VARCHAR(255) NULL ,
        [Column157] VARCHAR(50) NULL ,
        [Column158] VARCHAR(255) NULL ,
        [Column159] INT NULL ,
        [Column160] INT NULL ,
        [Column161] VARCHAR(50) NULL ,
        [Column162] VARCHAR(50) NULL ,
        [Column163] VARCHAR(50) NULL ,
        [Column164] VARCHAR(50) NULL ,
        [Column165] VARCHAR(50) NULL ,
        [Column166] VARCHAR(50) NULL ,
        [Column167] VARCHAR(50) NULL ,
        [Column168] VARCHAR(50) NULL ,
        [Column169] VARCHAR(255) NULL ,
        [Column170] INT NULL ,
        [Column171] VARCHAR(50) NULL ,
        [Column172] INT NULL ,
        [Column173] VARCHAR(50) NULL ,
        [Column174] VARCHAR(50) NULL ,
        [Column175] VARCHAR(50) NULL ,
        [Column176] VARCHAR(255) NULL ,
        [Column177] VARCHAR(50) NULL ,
        [Column178] VARCHAR(255) NULL ,
        [Column179] VARCHAR(50) NULL ,
        [Column180] VARCHAR(50) NULL ,
        [Column181] VARCHAR(50) NULL ,
        [Column182] VARCHAR(255) NULL ,
        [Column183] VARCHAR(50) NULL ,
        [Column184] VARCHAR(255) NULL ,
        [Column185] VARCHAR(50) NULL ,
        [Column186] VARCHAR(255) NULL ,
        [Column187] VARCHAR(50) NULL ,
        [Column188] VARCHAR(255) NULL ,
        [Column189] VARCHAR(50) NULL ,
        [Column190] VARCHAR(50) NULL ,
        [Column191] VARCHAR(50) NULL ,
        [Column192] VARCHAR(255) NULL ,
        [Column193] VARCHAR(50) NULL ,
        [Column194] VARCHAR(255) NULL ,
        [Column195] VARCHAR(50) NULL ,
        [Column196] VARCHAR(50) NULL ,
        [Column197] VARCHAR(255) NULL ,
        [Column198] INT IDENTITY (1,1) ,
        [Column199] VARCHAR(500) NULL ,
        [Column200] VARCHAR(255) NULL ,
        [Column201] VARCHAR(50) NULL ,
        [Column202] VARCHAR(255) NULL ,
        [Column203] CHAR NULL ,
        [Column204] CHAR NULL ,
        [Column205] VARCHAR(50) NULL ,
        [Column206] VARCHAR(255) NULL ,
        [Column207] BIGINT NULL ,
        [Column208] VARCHAR(50) NULL ,
        [Column209] VARCHAR(50) NULL ,
        [Column210] VARCHAR(50) NULL ,
        [Column211] VARCHAR(255) NULL ,
        [Column212] VARCHAR(50) NULL ,
        [Column213] VARCHAR(255) NULL ,
        [Column214] VARCHAR(50) NULL ,
        [Column215] VARCHAR(50) NULL ,
        [Column216] VARCHAR(50) NULL ,
        [Column217] VARCHAR(50) NULL ,
        [Column218] VARCHAR(50) NULL ,
        [Column219] VARCHAR(50) NULL ,
        [Column220] VARCHAR(50) NULL ,
        [Column221] VARCHAR(50) NULL ,
        [Column222] DATETIME NULL ,
        [Column223] VARCHAR(50) NULL ,
        [Column224] VARCHAR(50) NULL ,
        [Column225] CHAR NULL ,
        [Column226] CHAR NULL ,
        [Column227] CHAR NULL ,
        [Column228] CHAR NULL ,
        [Column229] CHAR NULL ,
        [Column230] CHAR NULL ,
        [Column231] VARCHAR(50) NULL ,
        [Column232] VARCHAR(50) NULL ,
        [Column233] VARCHAR(50) NULL ,
        [Column234] VARCHAR(255) NULL ,
        [Column235] VARCHAR(50) NULL ,
        [Column236] VARCHAR(50) NULL ,
        [Column237] VARCHAR(255) NULL ,
        [Column238] VARCHAR(50) NULL ,
        [Column239] VARCHAR(255) NULL ,
        [Column240] VARCHAR(50) NULL ,
        [Column241] VARCHAR(255) NULL ,
        [Column242] CHAR NULL ,
        [Column243] CHAR NULL ,
        [Column244] DATE NULL ,
        [Column245] DATE NULL ,
        [Column246] DATE NULL ,
        [Column247] VARCHAR(50) NULL ,
        [Column248] VARCHAR(255) NULL ,
        [Column249] VARCHAR(50) NULL ,
        [Column250] VARCHAR(255) NULL ,
        [Column251] DATE NULL ,
        [Column252] DATE NULL ,
        CONSTRAINT [PKC_MyTableA] PRIMARY KEY CLUSTERED 
        (
            [Column198] ASC
        )
    )
    GO
    
    CREATE TABLE [dbo].[MyTableB]
    (
        Column1 [INT] IDENTITY(1,1) NOT NULL,
        Column2 [INT] NULL,
        Column3 [VARCHAR](255) NOT NULL,
        Column4 [VARCHAR](255) NULL,
        Column5 [CHAR](1) NOT NULL,
        Column6 [VARCHAR](MAX) NULL,
        Column7 [VARCHAR](50) NULL,
        CONSTRAINT [PK_MyTableB] PRIMARY KEY CLUSTERED 
        (
            Column3 ASC
        )
    )
    GO

Insert some dummy data:

    DECLARE @valsSQL NVARCHAR(MAX) = 'SET IDENTITY_INSERT MyTableA ON; 
    
    INSERT INTO [MyTableA] (' 
    
    SELECT  @valsSQL += c.name + ','
    FROM    sys.columns c
            JOIN sys.tables t
                ON c.object_id = t.object_id
    WHERE   t.name = 'MyTableA'
    ORDER BY column_id
    
    SET @valsSQL = STUFF(@valsSQL,LEN(@valsSQL),1,')')
    
    SET @valsSQL += ' VALUES ( '
    
    SELECT  @valsSql +=
            CASE
                WHEN c.system_type_id = 167 OR --varchar 
                        c.system_type_id = 175 -- char
                THEN '''' +  REPLICATE('a',c.max_length) + ''''
                WHEN c.system_type_id = 61
                THEN '''' +  CONVERT(NVARCHAR,GETDATE(),120) + ''''
                WHEN c.system_type_id = 56 OR --int OR
                    c.system_type_id = 47 OR -- bigint
                    c.system_type_id = 127
                THEN CONVERT(NVARCHAR(10),CONVERT(INT,FLOOR(RAND()*2147483647)))
                WHEN c.system_type_id = 40
                THEN '''' +  '1900-01-01' + ''''
            END + ','
    FROM    sys.columns c
            JOIN sys.types t
                ON c.system_type_id = t.system_type_id
    WHERE   OBJECT_NAME(object_id) = 'MyTableA'
    ORDER BY column_id
    
    SET @valsSQL = STUFF(@valsSQL,LEN(@valsSQL),1,')')
    SET @valsSQL += '; SET IDENTITY_INSERT MyTableA OFF;'
    
    EXEC sp_executesql @stmt = @valsSQL 
    GO 500
    
 

Now the database is set up, run the query:

    USE NO_PLAN
    SELECT  1
            -- my unique text to find this query in query store views
    FROM    MyTableA 
            INNER JOIN MyTableB Alias  
                ON Alias.Column3 = 'value'
            LEFT JOIN MyTableB  
                ON MyTableB.Column3 =  'value'
    WHERE   MyTableB.Column4 IS NULL

NB - the actual execution plan is here 

use the query store DMVs to get the query id and plan id so we can force the plan:

    SELECT  t.query_sql_text,
            q.query_id,
            p.plan_id,
            p.query_plan,
            p.is_forced_plan,
            p.last_force_failure_reason_desc,
            p.last_execution_time 
     FROM    sys.query_store_plan p
            JOIN sys.query_store_query q
                ON q.query_id = p.query_id
            JOIN sys.query_store_query_text t
                ON t.query_text_id = q.query_text_id 
     WHERE   t.query_sql_text LIKE '%-- my unique text to find this query in query store views%' AND
            t.query_sql_text NOT LIKE '%sys.query_store_plan%' /* exclude this query */

My output is below:



Now force SQL server to use the plan it just compiled, each time it runs this query

    EXEC sp_query_store_force_plan @query_id = 6, @plan_id = 6

Run the query again:

    USE NO_PLAN
    SELECT  1
            -- my unique text to find this query in query store views
    FROM    MyTableA 
            INNER JOIN MyTableB Alias  
                ON Alias.Column3 = 'value'
            LEFT JOIN MyTableB  
                ON MyTableB.Column3 =  'value'
    WHERE   MyTableB.Column4 IS NULL

Check the query store DMVs to see if it used the plan:

    SELECT  t.query_sql_text,
            q.query_id,
            p.plan_id,
            p.query_plan,
            p.is_forced_plan,
            p.last_force_failure_reason_desc,
            p.last_execution_time
    FROM    sys.query_store_plan p
            JOIN sys.query_store_query q
                ON q.query_id = p.query_id
            JOIN sys.query_store_query_text t
                ON t.query_text_id = q.query_text_id
    WHERE   t.query_sql_text LIKE '%-- my unique text to find this query in query store views%' AND
            t.query_sql_text NOT LIKE '%sys.query_store_plan%' /* exclude this query */

We can see a failure reason of NO_PLAN:



If I reset things by truncating the table, clearing query store and then adding only 20 rows to the table (or dropping the database and running all the above setup above but use GO 20 rather than GO 500):

    USE NO_PLAN;
    ALTER DATABASE NO_PLAN SET QUERY_STORE CLEAR;
    TRUNCATE TABLE [MyTableA];

    DECLARE @valsSQL NVARCHAR(MAX) = 'SET IDENTITY_INSERT MyTableA ON; 
    
    INSERT INTO [MyTableA] (' 
    
    SELECT  @valsSQL += c.name + ','
    FROM    sys.columns c
            JOIN sys.tables t
                ON c.object_id = t.object_id
    WHERE   t.name = 'MyTableA'
    ORDER BY column_id
    
    SET @valsSQL = STUFF(@valsSQL,LEN(@valsSQL),1,')')
    
    SET @valsSQL += ' VALUES ( '
    
    SELECT  @valsSql +=
            CASE
                WHEN c.system_type_id = 167 OR --varchar 
                        c.system_type_id = 175 -- char
                THEN '''' +  REPLICATE('a',c.max_length) + ''''
                WHEN c.system_type_id = 61
                THEN '''' +  CONVERT(NVARCHAR,GETDATE(),120) + ''''
                WHEN c.system_type_id = 56 OR --int OR
                    c.system_type_id = 47 OR -- bigint
                    c.system_type_id = 127
                THEN CONVERT(NVARCHAR(10),CONVERT(INT,FLOOR(RAND()*2147483647)))
                WHEN c.system_type_id = 40
                THEN '''' +  '1900-01-01' + ''''
            END + ','
    FROM    sys.columns c
            JOIN sys.types t
                ON c.system_type_id = t.system_type_id
    WHERE   OBJECT_NAME(object_id) = 'MyTableA'
    ORDER BY column_id
    
    SET @valsSQL = STUFF(@valsSQL,LEN(@valsSQL),1,')')
    SET @valsSQL += '; SET IDENTITY_INSERT MyTableA OFF;'
    
    EXEC sp_executesql @stmt = @valsSQL 
    GO 20

then once again run the query, I get a different plan  (note the filter operator placement has changed)

If I then repeat the process of getting the query_id and plan_id, forcing the plan and re-running the query, this time it will force the plan:



I can confirm that NO_PLAN plan cannot be forced with OPTION (RECOMPILE, USE PLAN N'') hints, I get 

> Msg 8698, Level 16, State 0, Line 5 Query processor could not produce
> query plan because USE PLAN hint contains plan that could not be
> verified to be legal for query. Remove or replace USE PLAN hint. For
> best likelihood of successful plan forcing, verify that the plan
> provided in the USE PLAN hint is one generated automatically by SQL
> Server for the same query.

A number of articles suggest that the NO_PLAN failure reason is due to changing indexes, however, as can be seen from the example above,  nothing has changed between forcing and running the query for the second time.

Article A 

Article B 

Why can SQL server not be forced to use a plan it just generated, when nothing has changed? What is it about the first plan that causes the forcing to fail an why is that not an issue for the second plan?











                                

SE1986 (2182 rep)

Sep 27, 2022, 04:07 PM • Last activity: Jan 11, 2025, 09:16 AM

2 votes

1 answers

1087 views

Strange behaviour of force plan in query store

sql-server query-store

I force plan in query store. Plan is connected to procedure in a job which run once a day. One of step of this job is just: ``` EXEC [schema].[LoadData] ``` Procedure [schema].[LoadData] looks like ``` TRUNCATE TABLE [schema].[Data]; INSERT INTO [schema].[Data] ([A1], [A2], . . ., [A49] ) SELECT * ,...

I force plan in query store. Plan is connected to procedure in a job which run once a day. One of step of this job is just:

EXEC [schema].[LoadData]

Procedure [schema].[LoadData] looks like

TRUNCATE TABLE [schema].[Data];
	 
	INSERT INTO [schema].[Data]
           ([A1],
           [A2],
          .
          .
          .,
          [A49]
	)		         
	SELECT *
		,CURRENT_TIMESTAMP AS [Insert TimeStamp] 
	FROM [schema].[View]

where view is a view which contains some CTEs and use synonyms (connect to tables from different databases). In query store execution looks like :

To test If forcing plan is working I follow below steps: 1. Run query in SSMS -> EXEC [schema].[LoadData] 2. Above execution was treated as different query so did not see anything new in Query Store for query = 7 3. DBA create a new job just with step which is running query -> EXEC [schema].[LoadData] 4. Run of above newly created job cause Plan Id = 29800 Question why execution plan was not forced ? In column "forced plan failure count" is 0.

adam.g (465 rep)

Feb 11, 2022, 04:22 PM • Last activity: Jan 10, 2025, 12:38 PM

1 votes

0 answers

15 views

Understanding plan forcing failures with Automatic Tuning enabled

execution-plan azure-sql-database query-store

I've recently had some fun with forcing plans and Automatic Tuning in Azure SQL database, and I am trying to understand what happened after the event. I had a plan that was forced and working fine, call that plan id 1, and then Automatic Tuning created an index. After the index was created the force...

                                  I've recently had some fun with forcing plans and Automatic Tuning in Azure SQL database, and I am trying to understand what happened after the event. I had a plan that was forced and working fine, call that plan id 1, and then Automatic Tuning created an index. After the index was created the forced plan failed for plan id 1 with a plan_forcing_type_desc of NO_PLAN. I was then able to force a different plan, call that plan id 2. The next day the index was reverted (i.e. dropped) by Automatic Tuning. Plan id 2 is now failing to be forced with a plan_forcing_type_desc of NO_PLAN, and I am now able to force plan id 1 (which I couldn't do when the index existed). Plan id 2 definitely references the index, so I can understand why it no longer can be forced when the index has been dropped.

There are two things that I don't understand here:

1. Creating an index shouldn't stop a plan from being forced should it? I did a basic test of my own with a simple select, and forcing a plan didn't mind if I created an index (even if that index would have improved the forced plan).
2. When plan id 2 couldn't be forced after the index was dropped, I would expect plan_forcing_type_desc to say NO_INDEX, but it definitely said NO_PLAN.

Does anyone understand what is going on here? It should be noted that this statement is reasonably complex involving joining 4 tables together, different join types, a subquery in the where clause etc. The basic test that I did was a really simple proc to retrieve data from a single table based on a single parameter. I am wondering if plan forcing behaves differently depending on how complex the statement is?
Thanks for any ideas!
                                

Andy (151 rep)

Jan 10, 2025, 11:11 AM

11 votes

3 answers

344 views

Huge log backups due to enabling querystore

sql-server sql-server-2019 database-internals query-store

We have a SQL Server 2019 CU18 where we discovered a strange issue with querystore. Normally the average size of the hourly logbackup is 40MB but as soon as we enable querystore the average size of the logbackup is 2.5GB. There are (according to querystore) 140.000 queries executed/hour. This is abo...

                                  We have a SQL Server 2019 CU18 where we discovered a strange issue with querystore.
Normally the average size of the hourly logbackup is 40MB but as soon as we enable querystore the average size of the logbackup is 2.5GB.

There are (according to querystore) 140.000 queries executed/hour. This is about 40 executions/second.

This is the config of our querystore:

    ALTER DATABASE [db_name]
    SET QUERY_STORE = ON
    	(
    		OPERATION_MODE = READ_WRITE
    		,CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 45)
    		,DATA_FLUSH_INTERVAL_SECONDS = 900
    		,MAX_STORAGE_SIZE_MB = 2048
    		,INTERVAL_LENGTH_MINUTES = 30
    		,SIZE_BASED_CLEANUP_MODE = AUTO
    		,QUERY_CAPTURE_MODE = AUTO
    );

When I open such a big logbackup file with fn_dump_dblog I see that multiple transactions happen in the same second. The transactions all have the name 'SwapPage'.

|Operation        |CONTEXT                |AllocUnitId      |Page ID      |Transaction Name |
|-----------------|-----------------------|-----------------|-------------|-----------------|
|LOP_BEGIN_XACT   |LCX_NULL               |NULL             |NULL         |SwapPage			|
|LOP_INSYSXACT    |LCX_INDEX_INTERIOR     |72057594047692800|0001:00056321|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:000a871c|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:0000041b|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:0000041c|NULL				|
|LOP_FORMAT_PAGE  |LCX_UNLINKED_REORG_PAGE|72057594047692800|0001:000a8715|NULL				|
|LOP_MODIFY_HEADER|LCX_UNLINKED_REORG_PAGE|72057594047692800|0001:000a8715|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:000a8715|NULL				|
|LOP_MODIFY_HEADER|LCX_HEAP               |72057594047692800|0001:000a871c|NULL				|
|LOP_MODIFY_HEADER|LCX_HEAP               |72057594047692800|0001:0000041c|NULL				|
|LOP_INSERT_ROWS  |LCX_CLUSTERED          |72057594047692800|0001:000a8715|NULL				|
|LOP_MODIFY_HEADER|LCX_HEAP               |72057594047692800|0001:000a8715|NULL				|
|LOP_MODIFY_HEADER|LCX_HEAP               |72057594047692800|0001:000a8715|NULL				|
|LOP_MODIFY_ROW   |LCX_INDEX_INTERIOR     |72057594047692800|0001:00056321|NULL				|
|LOP_MODIFY_HEADER|LCX_HEAP               |72057594047692800|0001:0000041b|NULL				|
|LOP_MODIFY_HEADER|LCX_HEAP               |72057594047692800|0001:0000041b|NULL				|
|LOP_MIGRATE_LOCKS|LCX_NULL               |NULL             |0001:000a8715|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:000a8715|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:0000041c|NULL				|
|LOP_INSYSXACT    |LCX_UNLINKED_REORG_PAGE|72057594047692800|0001:0000041b|NULL				|
|LOP_INSYSXACT    |LCX_CLUSTERED          |72057594047692800|0001:000a871c|NULL				|
|LOP_INSYSXACT    |LCX_INDEX_INTERIOR     |72057594047692800|0001:00056321|NULL				|
|LOP_COMMIT_XACT  |LCX_NULL               |NULL             |NULL         |NULL				|

The allocation unit points to plan_persist_runtime_stats.

After a comment of Paul White  I setup an Extended Event to capture query_store_index_rebuild_started and query_store_index_rebuild_finished. To my suprise querystore was doing index rebuilds.
This are the results of this trace:

|event                             |timestamp              |current_size_kb|
|----------------------------------|-----------------------|---------------|
|query_store_index_rebuild_started |2024-12-05 07:51:10.353|874208		   |
|query_store_index_rebuild_finished|2024-12-05 07:52:29.073|868832		   |
|query_store_index_rebuild_started |2024-12-05 08:20:58.497|873504		   |
|query_store_index_rebuild_finished|2024-12-05 08:22:18.320|869152		   |
|query_store_index_rebuild_started |2024-12-05 08:36:03.147|874528		   |
|query_store_index_rebuild_finished|2024-12-05 08:37:19.670|869664		   |
|query_store_index_rebuild_started |2024-12-05 09:06:00.943|874336		   |
|query_store_index_rebuild_finished|2024-12-05 09:07:12.750|870304		   |


It looks like the index rebuild is started around 874MB, the max size of querystore is set to 2048.

I also included the stacktrace of the query_store_index_rebuild_started event in the Extended Event.

> sqllang!XeSqlPkg::CollectClientHostnameActionInvoke
> sqllang!XeSqlPkg::CollectDatabaseIdActionInvoke
> sqllang!XeSqlPkg::CollectDatabaseNameActionInvoke
> sqllang!XeSqlPkg::CollectNtUsernameActionInvoke

> sqllang!XeSqlPkg::CollectSessionIdActionInvoke
> sqllang!XeSqlPkg::CollectTSqlStack
> sqllang!XeSqlPkg::CollectTSqlStackActionInvoke
> qds!XeQdsPkg::query_store_index_rebuild_started::Publish

> qds!CDBQDS::ReclaimFreePages
 qds!CDBQDS::DoSizeRetention

> qds!CDBQDS::ProcessQdsBackgroundTask

> qds!CQDSManager::AcquireGenericQdsDbAndProcess>

> qds!CDBQDS::ProcessQdsBackgroundTask

> sqldk!SOS_Task::Param::Execute 
 sqldk!SOS_Scheduler::RunTask 

> sqldk!SOS_Scheduler::ProcessTasks

> sqldk!SchedulerManager::WorkerEntryPoint

> sqldk!SystemThreadDispatcher::ProcessWorker

> sqldk!SchedulerManager::ThreadEntryPoint 
 KERNEL32+0x17AC4

> ntdll+0x5A8C1

I had hoped to find what is triggering the index rebuild but no such luck.

After some pointers of Zikato  I added some extra querystore related events to my trace. This shows that the index rebuild only is triggered if a query_store_size_retention_cleanup_started event has occurred.

No rebuild:



Rebuild:




Everytime the cleanup runs 0KB has been deleted but apparently a rebuild is needed. 
What confuses me is the appearance of the cleanup event. I thought this would only be triggered when querystore reaches 90% of the max storage size.



Increasing the max size of the querystore doesn't make any difference.

Did anybody experienced the same issue or can somebody explain what is happening?
Other databases on the instance don't have this problem.



                                

Frederik Vanderhaegen (2122 rep)

Dec 4, 2024, 01:58 PM • Last activity: Jan 8, 2025, 10:40 AM

2 votes

1 answers

284 views

Is there any information in Query Store that can be used to find block leaders?

sql-server sql-server-2017 locking blocking query-store

I am on a server with major blocking issues. Not deadlocks, just plain blocks. We know that it is the server's primary bottleneck. However, they do not know what the block leaders are, so I do not know which queries to tune. I could install more monitoring, but I've noticed that they have Query Stor...

                                  I am on a server with major blocking issues. Not deadlocks, just plain blocks. We know that it is the server's primary bottleneck. However, they do not know what the block leaders are, so I do not know which queries to tune. I could install more monitoring, but I've noticed that they have Query Store enabled. Is there any information in Query Store that can be used to find block leaders?
                                

J. Mini (1225 rep)

Nov 23, 2024, 08:20 PM • Last activity: Nov 23, 2024, 09:33 PM

0 votes

0 answers

88 views

Is Automatic Plan Correction the only Query Store feature that is in SQL Server 2019 Enterprise Edition but not Standard Edition?

sql-server sql-server-2019 query-store enterprise-edition feature-comparison

Hunting through [the documentation](https://learn.microsoft.com/en-us/sql/sql-server/editions-and-components-of-sql-server-2019?view=sql-server-ver15&preserve-view=true) for what is locked behind Enterprise Edition is not easy, but I gave it a good try. My conclusion was that Automatic Plan Correcti...

                                  Hunting through [the documentation](https://learn.microsoft.com/en-us/sql/sql-server/editions-and-components-of-sql-server-2019?view=sql-server-ver15&preserve-view=true)  for what is locked behind Enterprise Edition is not easy, but I gave it a good try. My conclusion was that Automatic Plan Correction is the only feature using Query Store that is in SQL Server 2019 Enterprise Edition but not Standard Edition. Am I correct?
                                

J. Mini (1225 rep)

Sep 16, 2024, 02:07 PM • Last activity: Sep 23, 2024, 06:33 AM

2 votes

3 answers

469 views

Timed out queries in Query Store (SQL Server)

sql-server query-store

I'm trying to find out timed out queries in Query store for production database . I received logs from application site where for sure there was 5 timed out queries in short time window but somehow all queries in Query store are marked as completed during that time. (I search for queries by longest...

                                  I'm trying to find out timed out queries in Query store for production database .

I received logs from application site where for sure there was 5 timed out queries in short time window but somehow all queries in Query store are marked as completed during that time.
(I search for queries by longest duration during that time)

I'm a junior database admin and if anyone has any good tips for this, or how you investigate issues like that - any information would be appreciated.

Natalia (21 rep)

Aug 27, 2024, 09:43 AM • Last activity: Aug 28, 2024, 01:54 PM

1 votes

1 answers

71 views

Does Query Store track extended stored procedures?

sql-server sql-server-2019 query-store extended-stored-procedure

My experiments suggest that Query Store does not track extended stored procedures. However, they each have their own individual behaviour, I am not even a beginner on the topic of extended stored procedures, and I have only tried a couple. I seek a definitive way of knowing if Query Store can track...

                                  My experiments suggest that Query Store does not track extended stored procedures. However, they each have their own individual behaviour, I am not even a beginner on the topic of extended stored procedures, and I have only tried a couple. I seek a definitive way of knowing if Query Store can track any extended stored procedure.

Do any of the following exist?

 - An architectural reason why Query Store cannot track any extended stored procedure at all (e.g. I find that the ones that I've tried don't have query plans, but I don't know if that's true of all extended stored procedures).
 - Documentation that clearly states that Query Store does not track any extended stored procedure. I couldn't find any.
 - Code proving that Query Store can track at least one extended stored procedure.
                                

J. Mini (1225 rep)

Aug 6, 2024, 08:12 PM • Last activity: Aug 7, 2024, 08:26 AM

15 votes

1 answers

1499 views

Broken Query Store

sql-server sql-server-2022 query-store

We have SQL Server 2022 Enterprise (16.0.4131.2) on Windows 2022 VM (8 vCPU cores) and there is an issue with the Query Store on one of the databases. Below is the sequence of steps I can proceed to reproduce the issue on one specific database and observations I have made around it: **1 -** The serv...

                                  We have SQL Server 2022 Enterprise (16.0.4131.2) on Windows 2022 VM (8 vCPU cores) and there is an issue with the Query Store on one of the databases.

Below is the sequence of steps I can proceed to reproduce the issue on one specific database and observations I have made around it:

**1 -** The server is in an idle state with just a minor user generated load and negligible CPU utilization as a starting point.

**2 -** The Query Store is turned off and empty for the problematic database as shown from the query below.

**3 -** When I switch the Query Store to the READ_WRITE mode, everything is perfectly fine for about 10 - 15 minutes. However, after this period of time the CPU utilization of the server suddenly rises to 15-20% See the Activity Monitor screenshot below.

The wait statistics shows a high increase in *SOS_SCHEDULLER_YELD* and *PREEMPTIVE_OS_QUERYREGISTRY* waits - they jump into the top server wait types in that time period. 

The CPU utilization generated seems to be generated by an internal background process (SPID < 51) related with a Service Broker activity over the problematic user database. The Service Broker feature is not turned on in the database itself.

**4 -** The high CPU utilization, the above stated waits and the above mentioned SPID activity are continuously presented until the Query Store on the database is turned off. 

There are no signs of any potential "self-recovery", if I keep the Query Store on. Once turned off, the CPU utilization goes down, waits disappear and the SPID stops generating the load and become "sleeping". Switching the Query Store into the READ_ONLY mode does not help. The Query Store must be turned off to get rid off the issue. 

There is not any significant amount of data in the Query Store when the issue triggers - see below (queries taken after the QS was turned off).

**Other related observations**

 - If I turn on (READ_WRITE) the Query Store and keep it running just a while but not long enough to trigger the issue, then switch it to READ_ONLY the issue is not triggered in the 10-15 minutes time period as described above. However, then it is triggered almost instantly if I switch the Query Store to READ_WRITE any time later.
   
 - When the user load increases at the time when the issue is triggered, it can be seen that use queries over the database that takes a few milliseconds to process under the standard conditions starts to slow down and shows also not negligible PREEMPTIVE_OS_QUERYREGISTRY waits (as per sp_WhoIsActive shown below). These queries definitely makes no queries to the Windows registry. 

 - We have the same database (meaning same structure) on several other SQL Server instances with the same load pattern (and even much bigger data and query volumes there) and the Query Store works perfectly fine there. It seems that something is broken in the Query Store just for this specific database.

- It can be important, that the Query Store for that database was originally full and "size based cleanup process" triggered and SQL Server (likely) restarted during that time. I am suspicious that there is something broken in the QS in that database in relation to that ... I have tried also the *sys.sp_query_store_consistency_check* procedure, but without success.

- Once the issue is triggered and "suspended" by turning the QS off, then when the QS is turned on again (without prior cleanup), it seems the issue is triggered again almost instantly.

The Query Store is completely unusable for the database under such conditions. Any ideas of how to fix this?

**UPDATE 2024-08-01**

Based on hint from Paul White's comment, I have made some experiments with the CE_FEEDBACK database scoped option and it seems to be the true trigger. The behavior is as described below:

I have the uncleared Query Store from previous attempts (just a few hundreds of catch queries) as the starting point.

- When I set CE_FEEDBACK = OFF and turn the QS to READ_WRITE the issue does not trigger.
- When I set CE_FEEDBACK = ON and turn the QS to READ_WRITE the issue triggers almost instantly.
- When I set CE_FEEDBACK = OFF and turn the QS to READ_WRITE the issue does not trigger. Then (with the running QS) I switch the CE_FEEDBACK = ON. The issue again triggers instantly. The interesting also is that when I set the CE_FEEDBACK = OFF again, it is not enough to stop the issue. The QS must be turned off to stop it as well. 

**WORKLOAD TYPE**

As the issue seems to be related also with the workload type, I will try to describe it here, as the pattern is quite straightforward. It is almost 99% ad-hoc with the following query patterns:

- About 85% of the queries executed during the experiment is a simple join to table variable. The number rows passed to the variable varies from +/- 10 to +/- 500. The Source DB table vary between queries but the execution plan pattern is same as shown below.

- The rest of the workload are queries that follows the pattern as below. The problem with this portion is that due to a high variability of the parameters (the number of parameters often changes), then is is capable to fill a 2GB query store in a few days with "unique" queries, but it is not the case for the current scenario

I can imagine, that likely the first portion of the workload could be the problematic one for the CE_FEEDBACK.

Martin Karouš (508 rep)

Jul 30, 2024, 12:46 PM • Last activity: Aug 1, 2024, 11:15 AM

2 votes

3 answers

515 views

High Memory Usage for SqlBulkCopy

sql-server memory sql-server-2022 query-store

I've started to use the QueryStore to monitor my App and one of the things I noticed is the unexpected high memory usage for what I thought a simple operation: [![enter image description here][1]][1] Which would mean the statement uses almost 600MB of RAM each time it is executed? The RowCount for e...

I've started to use the QueryStore to monitor my App and one of the things I noticed is the unexpected high memory usage for what I thought a simple operation:

Which would mean the statement uses almost 600MB of RAM each time it is executed? The RowCount for each execution in that timeframe is a lot less than 100. The Statement itself is run every 5 seconds. The Table is partitioned and based on a ColumnStore Index and has no other Indexes or Primary Key/Identity and has around 750k Rows:

CREATE TABLE [DataLink].[LogEntry](
  [AppInstanceId] [bigint] NOT NULL,
  [LoggedOnUtc] [datetime2](7) NOT NULL,
  [CategoryName] [nvarchar](256) NOT NULL,
  [EventCode] [int] NOT NULL,
  [EventName] [nvarchar](256) NULL,
  [LogLevel] [int] NOT NULL,
  [ScopeJson] [nvarchar](max) NULL,
  [StateJson] [nvarchar](max) NULL,
  [ExceptionJson] [nvarchar](max) NULL,
  [Message] [nvarchar](max) NULL
) ON [PSCH_Logging_DataLink_LogEntry_Daily7Of9]([LoggedOnUtc])

CREATE CLUSTERED COLUMNSTORE INDEX [CIX_LogEntry] 
ON [DataLink].[LogEntry] WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0, DATA_COMPRESSION = COLUMNSTORE) 
ON [PSCH_Logging_DataLink_LogEntry_Daily7Of9]([LoggedOnUtc])

The Code which triggers the inserts:

using var conn = connInfo.Main.GetConnection(DatabaseLoginType.User);
await conn.OpenAsync(ct).CAf();
using var sqlBulkCopy = new SqlBulkCopy((SqlConnection)conn, SqlBulkCopyOptions.CheckConstraints | SqlBulkCopyOptions.FireTriggers, null);

foreach(var toWriteItemGroup in toWriteItems.GroupBy(x => x.SchemaName)) {
...

	dataReader.Init(toWriteItemGroup, tableInfo.ColumnMappings.Length);

	sqlBulkCopy.DestinationTableName = $"{schemaName}.LogEntry";
	sqlBulkCopy.ColumnMappings.Clear();
	for(int i = 0; i < tableInfo.ColumnMappings.Length; i++) sqlBulkCopy.ColumnMappings.Add(i, tableInfo.ColumnMappings[i]);
	await sqlBulkCopy.WriteToServerAsync(dataReader, ct).CAf();
...
}

Any idea why the memory usage is so high and what I could do to fix that? **Edit4** I did some tests by changing and manually compiling Microsoft.Data.SqlClient. The changes I made included adding ROWS_PER_BATCH and/or KILOBYTES_PER_BATCH to the with options of the "insert bulk" statement. None of the options changed how much memory was used, the former changed the row count estimate though: https://www.brentozar.com/pastetheplan/?id=HkKjc9HIC It doesn't seem like "insert bulk" can be optimized for low row counts. **Edit3** Here is a short example with which I can reproduce the issue. It contains a script "Script.sql" which needs to be executed first to setup the table and add some data. After that run the program with "dotnet run" (or use an IDE). Since I can't upload files here, I've uploaded it to github gist: https://gist.github.com/DvdKhl/d042ed05e3237136265295cb39ecb4f4 The Script will: - Create a view, which shows PartitionInfo of the table - Create (or recreate) the Table and it configuration - Insert 700k rows (100k per partition) - Rebuild the index - Output the Partition Info - Commented out is - Event Session (Extended Event) to capture the query plan - Another insert Statement - Cleanup The Program will: - Open a connection on "localhost" for the DB "main" - Create a Dummy DataReader (Change Count to change the inserted row count) - Setup SqlBulkCopy like above - Call WriteToServerAsync to insert the rows This results in the following queryplan: https://www.brentozar.com/pastetheplan/?id=B1v_8bGLC **Edit2** Like Suggested by Denis Rubashkin I've set BatchSize and an order hint:

sqlBulkCopy.BatchSize = toWriteItemGroup.Count();
sqlBulkCopy.ColumnOrderHints.Add("LoggedOnUtc", SortOrder.Ascending);

BatchSize doesn't seem to change anything (the estimate stays the same). It looks like ROWS_PER_BATCH is not used even though sqlBulkCopy.BatchSize is set in the code, which might be the main problem. When the Order hint is added, the query does not show up in the QueryStore. Using extended events to get the queryplan shows an "Excessive Grant" Warning though. So I'm not sure it helped. The option KILOBYTES_PER_BATCH looks interesting but it does not seem like I can set it in the C# code. WithOrderHint / WithoutOrderHint (Different table but exactly the same problem) **Edit:** Queryplan: https://www.brentozar.com/pastetheplan/?id=SJGpBktH0 Table Properties - Storage

Disk Usage by Partition

Arokh (27 rep)

Jun 13, 2024, 10:32 AM • Last activity: Jun 24, 2024, 07:10 PM

8 votes

1 answers

1777 views

What to consider when setting flush_interval_seconds in SQL Query Store?

sql-server query-store

Most of the settings for Query Store are fairlying straight forward. But as best as I can tell the one most likely to cause performance issues (waits) and with the least clear documentation is flush_interval_seconds I have read [Erin Stellato's, SQL Server Query Store: Default Settings](https://www....

                                  Most of the settings for Query Store are fairlying straight forward.  But as best as I can tell the one most likely to cause performance issues (waits) and with the least clear documentation is flush_interval_seconds

I have read [Erin Stellato's, SQL Server Query Store: Default Settings](https://www.sqlskills.com/blogs/erin/sql-server-query-store-default-settings/) 

> I’d also probably drop  DATA_FLUSH_INTERVAL_SECONDS to something lower than 900 seconds (15 minutes) if my storage can support it.  This setting determines how often Query Store data is flushed to disk.  If it’s every 15 minutes, then I could potentially lose 15 minutes of Query Store data if my server happened to crash before it could be written to disk. 

Microsoft's [Best Practice with the Query Store](https://learn.microsoft.com/en-us/sql/relational-databases/performance/best-practice-with-the-query-store)  does not even mention it.

Microsoft's [sys.database_query_store_options (Transact-SQL)](https://learn.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-database-query-store-options-transact-sql)  says (for on Premisis);

> Defines period for regular flushing of Query Store data to disk. Default value is 900 (15 min).

The above implies (possibly) Query Store data is held in memory until a regular 15 minutes flush occurs then it is pushed/flushed to disk

But the documents for Azure are slightly different. [Operating the Query Store in Azure SQL Database](https://learn.microsoft.com/en-us/azure/sql-database/sql-database-operate-query-store) 

> Specifies **maximum** period during which captured runtime statistics are kept in memory, before flushing to disk (*emphasis is mine*)

The Azure description implies Query Store data moves to disk as regularly as normal data, but if it has not moved by the age, then it is forced to disk.

I know that Query Store data is part of the database, I am not clear if logs are updated only when the data is pushed to disk, or in real time.  

Some points I am not clear on that could impact the decision. (*Maybe some of these should be separate questions?*)

* Is query store data written to the log?  If so would a restore let us recover any Query Store data, that had not been pushed to disk?

* If I have an AlwaysOn AG, do the secondaries get the Query Store data, real time or are they dependent on it actually getting written to disk on the primary? Rephrased if my Query Store data has not been written on the primary for 15 minutes, and the primary gets unplugged will the secondary capture it?

* Can a Query Store disk write have wait conflicts with users OLPT data? If so what are they?

* What does changing the flush_interval_seconds value really do? 

I have this scenario in my mind where, there is a long running query using all the system resources for an hour+ and Query Store hit the 15 Minute flush interval and now it is in conflict with the users application for resources. I can't work out what the scenario really is and how it impacts the user application. How do I justify my decision for the value I set?

James Jenkins (6318 rep)

Jan 26, 2018, 06:33 PM • Last activity: May 5, 2024, 01:58 PM

Showing page 1 of 20 total questions