
Stored procedure runs indefinitely while the same statements executed as a batch complete in a second; WITH RECOMPILE has no effect

1 vote
0 answers
216 views
Why would a proc just run indefinitely (increasing IO and CPU in sp_whoisactive, nothing blocking) when the exact same statements run fine by themselves in SSMS in a few seconds? I am EXECing the proc in SSMS WITH RECOMPILE.

The proc's logic has changed a little, and I have populated some larger test data to verify the performance is still acceptable - it was working fine for the small amount of source data (a few hundred rows) normally used in the automated build/test cycle. Other simple logging and auditing procs are called inside the proc, but the statement that is apparently hanging according to sp_whoisactive is at the top level, not in an inner proc, and is just the primary INSERT/SELECT (717 summarized rows to be inserted from a source set of around 50k rows; the whole thing runs in about a second when all the steps are run outside the proc).

I thought it had to be classic parameter sniffing, because the only significant parameter used in the INSERT/SELECT is a date and the proc normally runs in a loop over dates. It hung on the first date that had rows, so I figured it must be a plan issue where the empty days produced a plan that is just horrible once there are rows for that day in the table. The row count for a particular day jumps from 0 to 53,536 on the days where I inserted my test data. (Total data in the table is ~3m rows, with that count per day for 60 days - the period I loaded to test.) I tried other days in the range with similar data and they all showed the same behavior. So it seems like classic parameter sniffing where the plan is really bad when the cardinality changes that drastically, but WITH RECOMPILE should rule that out, right?

I tried running DBCC FREEPROCCACHE before EXECing the proc or running the batch of statements, and it makes no difference. The batch of statements completes in a second; the stored proc runs for at least 39 minutes before I cancel it.

My build process drops and recreates the entire analytics database containing the proc and the table it INSERTs into, so everything is clean at the start of testing when I run the full build/test process. It accesses an underlying OLTP database through synonyms, but that database of test data is not changing, and I have updated the statistics on the main table where I added the 3m rows.

There was an existing CROSS APPLY to a timezone conversion function (https://www.mssqltips.com/sqlservertip/3173/handle-conversion-between-time-zones-in-sql-server-part-1/) - I didn't think parameter sniffing applied to table-valued functions, the lookup tables it uses are definitely small and uniform, and that join is in a part of the query unrelated to where my most recent change was made.

I'm working to simplify this to a reproducible case where the problem still occurs and I can actually post the code, but that will take a while. This is SQL Server 2012. I will re-target to my SQL Server 2016 instance to test there as well, but our product is validated against SQL Server 2012 and 2016, so I can't give up SQL Server 2012 support (yet).
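For context, this is roughly the troubleshooting sequence I'm running - a minimal sketch only; dbo.usp_LoadDailySummary, @AsOfDate, dbo.FactSource, and dbo.DailySummary are placeholder names, not the real objects:

-- Clear all cached plans so the next execution compiles fresh
-- (server-wide, so only on this test instance)
DBCC FREEPROCCACHE;

-- Refresh statistics on the table that received the ~3m test rows
UPDATE STATISTICS dbo.FactSource WITH FULLSCAN;

-- Execute the proc with a fresh compile so no cached plan is reused
EXEC dbo.usp_LoadDailySummary @AsOfDate = '2021-08-01' WITH RECOMPILE;

-- If it were statement-level sniffing inside the proc, a hint on the
-- INSERT/SELECT itself would force a per-execution plan:
-- INSERT INTO dbo.DailySummary (SummaryDate, ...)
-- SELECT ...
-- FROM dbo.FactSource AS s
-- WHERE s.LoadDate = @AsOfDate
-- OPTION (RECOMPILE);

Even after these steps, the proc hangs while the same statements run as an ad hoc batch finish in about a second.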
Asked by Cade Roux (6684 rep)
Sep 22, 2021, 05:34 PM