Sample Header Ad - 728x90

What could cause a SQL Server non-cached temporary table to NOT trigger a recompile when a large amount of rows have changed?

15 votes
1 answer
746 views
I have observed (and reproduced) the following scenario with SQL Server 2022. ### The pattern in use * code is executed via sp_executesql (no stored procedure is involved) * The first query selects data into a temporary table * A DDL statement then creates a clustered index on the temporary table. The temporary table is definitely NOT cacheable-- first of all this isn't a module (sproc or function), but also we're creating an index after the temp table is populated. So I would not expect statistics left behind on a cached temporary object to be involved at all here. * A query selects data from the temporary table. This query gets FULL optimization each time (not a TRIVIAL plan) This batch can run for both small and larger datasets, so that temp table can have 1 row in it or many thousands of rows. This behavior normally occurs on a readable secondary. There is no writable query store and no automatic plan forcing as a factor. I have verified that I can reproduce the behavior against the primary replica as well. (Automatic plan correction was told to ignore the query and I confirmed no plan forcing on primary when reproduced.) ### Repro script * [Setup script](https://gist.github.com/LitKnd/83130120683e669669c048a5e74c8145#file-setuprepro-sql) - I ran this on SQL Server 2022 CU15. This turns off query store and uses compat level 130. * [Repro query](https://gist.github.com/LitKnd/83130120683e669669c048a5e74c8145#file-repro-sql) - I've been running this via SQL Query Stress so I can easily run it concurrently on one or more threads * [Plan Generation Num and temp tables](https://gist.github.com/LitKnd/83130120683e669669c048a5e74c8145#file-plan_generation_num_and_temp_tables-sql) - A very simple query to observe the plan_generation_num in sys query stats (" A sequence number that can be used to distinguish between instances of plans after a recompile.") and the current list of temp tables ### What normally happens-- and the behavior I expect Normally, changing large amounts of rows in the temporary table between query executions automatically cause recompiles, and I'll see that the query selecting the data from the temporary table has a row estimate matching the rows in the temp table. When this works as expected, performance is fine. *With the repro query*: If I clear the plan cache, then run the repro query 40 iterations on a single thread in SQL Query Stress, plan_generation_number ends up being 82. When sampling query plans with sp_WhoIsActive, row estimates querying the temp table match the number of rows in the temp table, as expected. ### What sometimes happens -- and looks like a bug to me On rare occasions, I see that a plan is in use where there is a 1 row estimate plan for the temp table, but a very large amount of rows are actually in the temp table. A LOT of rows have changed, but it didn't automatically recompile: screenshot of a clustered index scan of a temporary table with a 1 row estimate, but 4.2 million rows have been scanned from it so far This leads to very slow performance because the low estimate plan decides to use a nested loop without prefetching, which makes it a CPU burner. *With the repro query*: If I clear the plan cache, then run the repro query 20 iterations on 2 threads in SQL Query Stress, plan_generation_number ends up less than 82-- it varies by run, but might be 72 or 59, indicating fewer recompiles. While this is running, I can also sample occasions with sp_WhoIsActive where there is a single rowcount estimated but many more rows in the temporary table. Screenshot: not the query plan i expected-- why only 1 rowcount? ### I can only reproduce this when running repro code on multiple concurrent sessions I have not been able to reproduce this behavior with a single session in SQL Server. The only way I can reproduce this is to set up a block of code that: * Executes at least 1 iteration of the sp_executesql statement that have 1 row in the temp table * Then executes 1 iteration of the sp_executesql statement that has a lot more rows in the temp table If I run this in a single session, I have not been able to reproduce the problems. But if I run this concurrently in four or five sessions, I'll be able to occasionally get the "THAT DIDN'T RECOMPILE LIKE IT SHOULD HAVE" issue to pop up. (Note: using SQL Query Stress, I can repro this with only 2 sessions/iterations.) This feels like a bug to me, I'm curious if anyone else has seen it. Recompile and stats behavior with temp tables is super complex tho, so there may be some nuance that I'm missing with how this works with non-cachable temp tables. PS: I do think cachable temp tables are generally better. I'm just trying to figure out why this behavior would happen in a non-cacheable temp table scenario at this point. ### Workarounds After adding an option (recompile) to the query, I can no longer reproduce the reuse of the 1 row plan querying the temp table. This is sufficient, but I'm puzzled why it is necessary.
Asked by Kendra Little (938 rep)
Dec 6, 2024, 03:32 PM
Last activity: Dec 10, 2024, 09:00 AM