Sample Header Ad - 728x90

Should I worry about a serial update execution plan?

0 votes
1 answer
91 views
For a mature .Net project I support we're currently exploring options to remove Entity Framework Core from the project while still maintaining some of the EF functionality we rely on. At the top of this list is our ability to keep track of exactly which values have changed as a result of an update request that an outside system sends to us - we have no control over that system, and it's not unusual for us to receive update requests that contain identical data several times. We currently use EF's change tracker to keep a view on whether or not the update we're processing really makes any changes to a set of specific columns so that we know whether or not to inform our users of these changes. Looking into this how we might achieve this without having EF and its overhead involved led me to SQL Server's [OUTPUT Clause](https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql?view=sql-server-ver16) , which would theoretically allow us to perform an update to a table and return a view of the prior and updated state of the key columns we use for notifications triggers. So far so good, however, there is a warning under the [Parallelism](https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql?view=sql-server-ver16) section of that document that says that "[an] OUTPUT clause that returns results to the client, or table variable, will always use a serial plan.". I'm afraid that my knowledge of SQL is not strong enough at this time to be able to tell if this is likely to be a problem or not. Should this concern me from a performance or reliability standpoint? Every update that we perform is keyed on a set of columns that form (in almost every case) a compound unique key, so even if the query portion of the update is run serially would it be a noticeable impact? Take as an example the following schema:
CREATE TABLE user_profile
(
  [id] INT IDENTITY (1,1) NOT NULL PRIMARY KEY
  [username] NVARCHAR(100) NOT NULL,
  [tenancy] NVARCHAR(20) NOT NULL,
  [city] NVARCHAR(50) NULL,
  [state] NVARCHAR(50) NULL,
  [first_name] NVARCHAR(50) NULL,
  [last_name] NVARCHAR(50) NULL,
  [external_system_access_key] NVARCHAR(200) NULL,
  CONSTRAINT [UX_user] UNIQUE ([username], [tenancy])
)
In this example, a user manages their own city, state, first_name and last_name values, but an external system manages external_system_access_key through a request to our service like
-none
POST /{tenancyId}/user/{username}/profile/external

{
  "accessKey": "1224567890"
}
If we receive that same update several times without the value of accessKey changing, we're wanting to know if the value differs pre- and post-update execution, so we know whether or not to inform the user that the key has changed. Each request would result in an update like this:
DECLARE @accessKey NVARCHAR(200) = '1234567890';
DECLARE @username NVARCHAR(100) = 'username';
DECLARE @tenancy NVARCHAR(20) = 'tenancy';
UPDATE [user_profile]
SET [remote_system_access_key] = @accessKey
OUTPUT INSERTED.id, DELETED.[remote_system_access_key] AS OLD_remote_system_access_key, INSERTED.[remote_system_access_key] AS NEW_remote_system_access_key
WHERE 
[username] = @username AND [tenancy] = @tenancy;
In the case where the request gave us a new value for that column there would be different values for each of the OLD_ and NEW_ output columns, and if it's a request we've had previously then they will match, allowing us to evaluate any changes after the insert is done. But SQL Server's documentation says that this will always result in a serial execution plan. What I need to know is: is this a problem? Any assistance I could get understanding this and its potential impacts would be greatly appreciated. Generally, each update will only hit one row at a time as its where clause uses a compound unique key. We're just trying to avoid the thing that EF does and require querying the data out first - instead, since SQL server already provides a mechanism to know what the state before and after the update was, we would like to get that data back afterwards if it's not going to cause performance issues. The example I used above is quite simplistic compared to our actual cases, where we'll be updating large numbers of columns in each statement - all in the one table, but multiple columns in each. It would be prohibitively complex to try and cover every possible permutation of what might be updated in each request.
Asked by Adrian (103 rep)
Sep 4, 2023, 07:24 AM
Last activity: Sep 8, 2023, 12:19 AM