Unusual column comparisons and query performance

6 votes
2 answers
683 views
sql-server performance sql-server-2016 except query-performance
                          We have some consultants working on expanding an inhouse data warehouse.  I was doing a code review and ran across this pattern in all of the load procs:

    	MERGE [EDHub].[Customer].[Class] AS TARGET
		USING (
			SELECT 
			FROM [dbo].[vw_CustomerClass]
				WHERE JHAPostingDate = @PostingDate   
			) AS SOURCE
			ON  TARGET.BankId = SOURCE.BankId       -- This join is on the business keys
				AND TARGET.Code = SOURCE.Code
		WHEN NOT MATCHED BY TARGET  
			THEN
				
		WHEN MATCHED
			AND TARGET.IsLatest = 1
			AND EXISTS (
				SELECT SOURCE.[HASH]   
				EXCEPT			
				SELECT TARGET.[Hash]
				)
			THEN 
				

The gist is, if we have a new business key, insert but if the business key exists and the hash of the attributes don't match our current row then  update the old row and insert a new one (later in the code).  It all works fine but I paused when I got to this code

    AND EXISTS (
				SELECT SOURCE.[HASH]   
				EXCEPT			
				SELECT TARGET.[Hash]
				)

It seems overly complicated compared to SOURCE.[HASH]  TARGET.[Hash].  The EXCEPT will do an accurate NULL comparison but in our case hashes will never be NULL (or we have bigger problems).  I want our code to be easy to read so that when someone has to maintain it, it doesn't confuse.  I asked our consultants about it and they speculated that it might be faster because of set operations but I decided to write a simple test (test code below).

The first thing I noticed was the EXISTS/EXCEPT had a more complicated query plan but that's not always bad

I ran each select client statistics on and the  join yielded total execution time of 12,000 vs 25,000 with the EXISTS/EXCEPT.  I want to take this to our consultants with the request to refactor that statement but wanted to get feedback here on:

 1. is this a good test? - am I missing anything?
 2. is there a case where EXISTS/EXCEPT would be a better comparison?

Test script:

    CREATE TABLE r (hash VARBINARY(8000))
    CREATE TABLE l (hash VARBINARY(8000))
    
    SET NOCOUNT ON
    DECLARE @x INT = 10000
    WHILE @x  0 BEGIN
    
       INSERT INTO dbo.r ( hash ) SELECT HASHBYTES('SHA2_256',CAST(NEWID() AS VARCHAR(200)))
       INSERT INTO dbo.l ( hash ) SELECT HASHBYTES('SHA2_256',CAST(NEWID() AS VARCHAR(200)))	
    
       SET @x = @x-1
    END
    
    INSERT INTO dbo.r ( hash ) VALUES ( NULL  )
    INSERT INTO dbo.l ( hash ) VALUES ( NULL  )
    
    SELECT COUNT(1) 
    FROM dbo.l
    CROSS JOIN dbo.r 
    WHERE ISNULL(r.hash,0)  ISNULL(l.hash,0)
    
    SELECT COUNT(1) 
    FROM dbo.l
    CROSS JOIN dbo.r 
    WHERE EXISTS(SELECT r.hash except select l.HASH)
                        
Asked by Bob Probst (267 rep)
Apr 15, 2018, 03:32 PM
Last activity: Dec 31, 2024, 09:01 AM
Unusual column comparisons and query performance

Related Questions