
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answer
343 views
Slow Cross Apply Query Needs Optimization
I have a stored procedure which takes about 4 min to complete. I added some logging and discovered that the reason it is slow is the Cross Apply queries, of which there are 5. Each one takes about 40 seconds to complete. I'm sure there are ways to improve the performance, which I'm hoping someone can help me with. Here is the query:

update @Data1
set Open = b.Open_Sum
from @Data1 a
cross apply (
    Select Count(*) as Open_Sum
    from [Data] c
    where (c.Sunday > a.Week or c.Sunday is Null)
      and c.Project = a.Project
      and c.Item = a.Item
      and c.IPT = a.IPT
      and c.Product = a.Product
      and c.Center = a.Center
      and c.Name = a.Name
      and c.Project in (select * from SplitParmList(@Project))
      and c.Product in (select * from SplitParmList(@Product))
      and c.Item in (select * from SplitParmList(@Item))
      and c.Area in (select * from SplitParmList(@Area))
      and c.IPT in (select * from SplitParmList(@IPT))
      and c.Name in (select * from SplitParmList(@Name))
      and c.Center in (select * from SplitParmList(@Center))
      and c.Effectivity in (select * from SplitParmList(@Effectivity))
      and c.Planned in (select * from SplitParmList(@Planned))
      and CURRENT = 'Y'
) as b

Query explanation:
1. Populates a table variable (@Data1) with the aggregate count, in this case records still open.
2. SplitParmList is a function that parses a parameter passed into the SP, e.g. ('a,b,c,d,e').
3. The Data table is indexed.

I'm looking for suggestions that may help speed this query up. As requested, here is the SplitParmList function:
CREATE FUNCTION [dbo].[SplitParmList] (@StringList VARCHAR(MAX))
RETURNS @TableList TABLE (StringLiteral VARCHAR(128))
BEGIN
    DECLARE @StartPointer INT, @EndPointer INT
    SELECT @StartPointer = 1, @EndPointer = CHARINDEX(',', @StringList)
    WHILE (@StartPointer

Here is the index:
CREATE CLUSTERED INDEX [ClusteredIndex-20210222-092308] ON [dbo].[Data]
(
    [Name] ASC,
    [Center] ASC,
    [Project] ASC,
    [Effectivity] ASC,
    [Product] ASC,
    [Drawing] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [dbo]
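One commonly suggested direction for queries shaped like this one (a hedged sketch only, not a tested fix — table, column, and parameter names are taken from the question) is to materialize each SplitParmList result once into a temp table before the update, so the multi-statement function is not re-evaluated inside the CROSS APPLY for every row:

```sql
-- Sketch: split each parameter once, up front (assumes the question's schema).
SELECT StringLiteral INTO #Project FROM dbo.SplitParmList(@Project);
SELECT StringLiteral INTO #Product FROM dbo.SplitParmList(@Product);
-- ...repeat for the remaining parameters...

UPDATE a
SET [Open] = b.Open_Sum
FROM @Data1 a
CROSS APPLY (
    SELECT COUNT(*) AS Open_Sum
    FROM [Data] c
    WHERE (c.Sunday > a.Week OR c.Sunday IS NULL)
      AND c.Project = a.Project AND c.Item = a.Item AND c.IPT = a.IPT
      AND c.Product = a.Product AND c.Center = a.Center AND c.Name = a.Name
      AND c.Project IN (SELECT StringLiteral FROM #Project)
      AND c.Product IN (SELECT StringLiteral FROM #Product)
      -- ...remaining IN filters against the other temp tables...
      AND c.[CURRENT] = 'Y'
) AS b;
```

Temp tables also give the optimizer statistics on the split values, which table-valued function results inlined per row do not.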
FlyFish (109 rep)
May 11, 2023, 11:24 AM • Last activity: Jul 16, 2025, 10:06 PM
1 vote
1 answer
337 views
How to use reference to another table in aggregate function
I have the following easy-to-explain task: we have a table, and we should calculate the total count of rows grouped by columns. The column number is fixed, so we write:

SELECT [GroupId] = cr.[GroupId]
    , [OnCreateCount] = SUM(CASE WHEN cr.StatusOfRequest = 1 THEN 1 ELSE 0 END)
    , [NewCount] = SUM(CASE WHEN cr.StatusOfRequest = 2 THEN 1 ELSE 0 END)
    , [CreatedCount] = SUM(CASE WHEN cr.StatusOfRequest = 1 THEN 1 ELSE 0 END)
    , [ReviewCount] = SUM(CASE WHEN cr.StatusOfRequest = 3 THEN 1 ELSE 0 END)
    , [RejectedCount] = SUM(CASE WHEN cr.StatusOfRequest = 4 THEN 1 ELSE 0 END)
    , [TotalCount] = COUNT(*)
FROM [FK].[CertificateRequest] cr
GROUP BY cr.[GroupId]

Here OnCreateCount and CreatedCount are the same, but that's wrong. CreatedCount is the number of rows with status 1 which are referenced from table [Certificate]. OnCreateCount is the number of rows with status 1 which are **not** referenced from table [Certificate]. Here is how I solved it:

SELECT [GroupId] = cr.[GroupId]
    , [OnCreateCount] = SUM(CASE WHEN cr.StatusOfRequest = 1 AND c.ID IS NULL THEN 1 ELSE 0 END)
    , [NewCount] = SUM(CASE WHEN cr.StatusOfRequest = 2 THEN 1 ELSE 0 END)
    , [CreatedCount] = SUM(CASE WHEN cr.StatusOfRequest = 1 AND c.ID IS NOT NULL THEN 1 ELSE 0 END)
    , [ReviewCount] = SUM(CASE WHEN cr.StatusOfRequest = 3 THEN 1 ELSE 0 END)
    , [RejectedCount] = SUM(CASE WHEN cr.StatusOfRequest = 4 THEN 1 ELSE 0 END)
    , [TotalCount] = COUNT(*)
FROM [FK].[CertificateRequest] cr
OUTER APPLY (
    SELECT TOP 1 *
    FROM [FK].[Certificate] c
    WHERE c.CertificateRequestID = cr.ID
) c
GROUP BY cr.[GroupId]

But I really don't like the OUTER APPLY here, which is used just to determine whether there is any row in the referenced table. What is the easiest way to get the required result?
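One alternative shape for the same "is this row referenced?" test (a sketch using the question's names and assumed schema, untested) is a LEFT JOIN to a de-duplicated key list instead of the OUTER APPLY ... TOP 1:

```sql
SELECT [GroupId] = cr.[GroupId]
    -- NULL join key means "not referenced from [Certificate]"
    , [OnCreateCount] = SUM(CASE WHEN cr.StatusOfRequest = 1 AND c.CertificateRequestID IS NULL     THEN 1 ELSE 0 END)
    , [CreatedCount]  = SUM(CASE WHEN cr.StatusOfRequest = 1 AND c.CertificateRequestID IS NOT NULL THEN 1 ELSE 0 END)
    , [TotalCount]    = COUNT(*)
FROM [FK].[CertificateRequest] cr
LEFT JOIN (
    -- DISTINCT keeps the join 1:1 even when several certificates reference one request
    SELECT DISTINCT CertificateRequestID
    FROM [FK].[Certificate]
) c ON c.CertificateRequestID = cr.ID
GROUP BY cr.[GroupId];
```

The DISTINCT derived table plays the role the TOP 1 played in the original: it guarantees the join cannot multiply rows and skew the counts.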
Alex Zhukovskiy (343 rep)
Nov 2, 2016, 10:28 AM • Last activity: Mar 21, 2025, 05:04 AM
44 votes
2 answers
128187 views
Outer Apply vs Left Join Performance
I am using SQL Server 2008 R2. I just came across APPLY in SQL and loved how it solves query problems in so many cases. For many of the tables where I was using 2 left joins to get the result, I was able to get it with 1 outer apply. I have a small amount of data in my local DB tables, and after deployment the code is supposed to run on data at least 20 times as big. I am concerned that the outer apply might take longer than the 2 left join conditions for a large amount of data. Can anyone tell how exactly apply works and how it will affect performance on very large data? If possible, give some proportional relations with the size of each table, like proportional to n1^1 or n1^2 ... where n1 is the number of rows in table 1. Here is the query with 2 left joins:

select EC.*, DPD.*
from Table1 eC
left join (
    select member_id, parent_gid, child_gid, LOB, group_gid, MAX(table2_sid) mdsid
    from Table2
    group by member_id, parent_gid, child_gid, LOB, group_gid
) DPD2
    on DPD2.parent_gid = Ec.parent_gid
    AND DPD2.child_gid = EC.child_gid
    AND DPD2.member_id = EC.member_id
    AND DPD2.LOB = EC.default_lob
    AND DPD2.group_gid = EC.group_gid
left join Table2 dpd
    on dpd.parent_gid = dpd2.parent_gid
    and dpd.child_gid = dpd2.child_gid
    and dpd.member_id = dpd2.member_id
    and dpd.group_gid = dpd2.group_gid
    and dpd.LOB = dpd2.LOB
    and dpd.table2_sid = dpd2.mdsid

Here is the query with outer apply:

select *
from Table1 ec
OUTER APPLY (
    select top 1 grace_begin_date, retroactive_begin_date, Isretroactive
    from Table2 DPD
    where DPD.parent_gid = Ec.parent_gid
      AND DPD.child_gid = EC.child_gid
      AND DPD.member_id = EC.member_id
      AND DPD.LOB = EC.default_lob
      AND DPD.group_gid = EC.group_gid
    order by DPD.table2_sid desc
) DPD
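For comparison, the same "latest row per key" requirement is often written with a single pass over Table2 and ROW_NUMBER, which gives the optimizer a third plan shape to consider (a sketch with the question's column names; untested):

```sql
select ec.*, dpd.grace_begin_date, dpd.retroactive_begin_date, dpd.Isretroactive
from Table1 ec
left join (
    -- rn = 1 marks the row with the highest table2_sid per key combination
    select t2.*,
           rn = ROW_NUMBER() OVER (
               PARTITION BY t2.member_id, t2.parent_gid, t2.child_gid, t2.LOB, t2.group_gid
               ORDER BY t2.table2_sid DESC)
    from Table2 t2
) dpd
    on dpd.parent_gid = ec.parent_gid
   and dpd.child_gid  = ec.child_gid
   and dpd.member_id  = ec.member_id
   and dpd.LOB        = ec.default_lob
   and dpd.group_gid  = ec.group_gid
   and dpd.rn = 1;
```

Roughly, the OUTER APPLY ... TOP 1 form favors a seek per outer row (good when Table1 is small and Table2 is well indexed), while the ROW_NUMBER form scans Table2 once (often better when most of Table2 participates anyway).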
Pratyush Dhanuka (559 rep)
Aug 27, 2014, 08:41 AM • Last activity: May 17, 2024, 06:36 PM
6 votes
1 answer
626 views
Is there a more optimal way to create this view or stored procedure? Currently using a cross apply, but it's slow
***I've updated this post with some test data.***

I'm creating a report for my Movies database where I'd like for the end-user to be able to select movies of a certain genre. However, some movies have multiple genres, and I've normalized the database so that a movie line item that had more than one genre turns into multiple line items, each pointing to the respective genre/genreID. (A similar thing has been done for directors.)

**Before Normalization**

|Movie |Genre |
|----------------------|-------------|
|Bride of Frankenstein |Horror, Drama |

**After Normalization**

|Movie |Genre |
|----------------------|-------------|
|Bride of Frankenstein |Horror |
|Bride of Frankenstein |Drama |

The problem that I have is, for the sake of this report, I would like to make it so that the movies do not repeat in the report if they have multiple genres. Rather, the movie title becomes one line item and the genres field is concatenated to show all of the genres within the one line (similar to the before-normalization view). What I've ended up doing is creating a view where I cross apply the output of the movie selection by the genres that match the movie ID. I feel like I've overcomplicated it a bit, and my stored procedure is running pretty slow, as I have a couple of other fields that I am allowing the user to filter on as well. Here's a look at the cross apply view.
ALTER VIEW [dbo].[vwMoviesJoinedGenres] AS

WITH genreMovies_CTE AS (
SELECT M.MovieID
    , M.MovieTitle
    , G.GenreName
    , G.GenreID
    , M.TitleTypeID
    , TT.TitleType
    , M.MediaID
    , M.IMDBLink
    , M.IMDBRating
    , M.ReleaseDate
    , M.Runtime
    , M.ImageURL
    , M.MovieYear
FROM [dbo].[Movies] AS M
INNER JOIN GenresMovies AS GM
	ON GM.MovieID = M.MovieID
INNER JOIN Genres AS G
	ON G.GenreID = GM.GenreID
INNER JOIN TitleType AS TT
	ON TT.TitleTypeID = M.TitleTypeID
)   
SELECT 
    DISTINCT MovieID, ImageURL, MovieYear, MovieTitle, TitleType, SUBSTRING(G.genres, 0, 
    LEN(G.genres)) AS Genres, GenreID, TitleTypeID, MediaID, IMDBLink, IMDBRating, ReleaseDate, 
    Runtime
FROM genreMovies_CTE
CROSS APPLY
(
SELECT DISTINCT GenreName + ', ' 
    FROM Genres AS G
	INNER JOIN GenresMovies AS GM ON GM.GenreID = G.GenreID
	WHERE genreMovies_CTE.MovieID = GM.MovieID
    FOR XML PATH('')        
) G(genres)
GO
I then use this view (and a similar view for directors) in the stored procedure below.
USE [Movies]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

--====================================================
ALTER   PROCEDURE [dbo].[usp_MovieByGenreUsingViews] 
	-- Add the parameters for the stored procedure here
	@GenreID nvarchar(255)
	, @MediaID nvarchar(255)
	, @TitleTypeID nvarchar(255)
WITH RECOMPILE

AS
BEGIN

	SET NOCOUNT ON;

WITH genresMovies_CTE AS
(SELECT 
	DISTINCT JG.[MovieID]
	, JG.[MovieTitle]
	, JG.Genres
	, JG.[TitleTypeName]
	, JG.[TitleTypeID]
	, JG.[MediaID]
	, Me.MediaType
	, JD.Directors
FROM [dbo].[vwMoviesJoinedGenres] AS JG --JoinedGenres
INNER JOIN [dbo].[vwMoviesJoinedDirectors] AS JD
    ON JG.MovieID = JD.movieID
INNER JOIN Media AS Me
    ON Me.MediaTypeID = JG.MediaTypeID
WHERE GenreID IN (SELECT Number FROM [fn_SplitInt](@GenreID, ','))
    AND JG.MediaTypeID IN (SELECT Number FROM [fn_SplitInt](@MediaID, ',')) 
    AND JG.TitleTypeID IN (SELECT Number FROM [fn_SplitInt](@TitleTypeID, ',')) 
)
SELECT MovieTitle, Genres, Directors, TitleType, MediaType
FROM genresMovies_CTE
ORDER BY movietitle

END
**Updating with a very small sample data set as requested. I've simplified a lot obviously to save time and have decided to focus on Genres only.**

Creating Tables and Inserting Data
USE [Test Movies];
GO
--DROP TABLE IF EXISTS MovieDetails;
CREATE TABLE MovieDetails 
(
	MovieID int NOT NULL 
	, MovieTitle nvarchar(255)
	, Constraint MovieID PRIMARY KEY (MovieID)
);

--DROP TABLE IF EXISTS Genres;
CREATE TABLE Genres(
	  GenreID tinyint NOT NULL Identity(1,1)
	, GenreName varchar(50) NOT NULL 
	, CONSTRAINT GenreID PRIMARY KEY (GenreID)
)

--DROP TABLE IF EXISTS MovieGenre;
CREATE TABLE MovieGenre
(
	MovieID int NOT NULL
	, GenreID tinyint NOT NULL
	, CONSTRAINT GenresMovies PRIMARY KEY (MovieID, GenreID)
);

INSERT INTO MovieDetails (MovieID, MovieTitle)
VALUES
(1, 'Forest Gump')
, (2, 'Eyes Wide Shut')
, (3, 'Kelly''s Heroes')
, (4, 'Where Eagles Dare')
, (5, 'Star Trek: First Contact')
, (6, 'The Ten Commandments')
, (7, 'Clash of the Titans')
, (8, 'AVP: Alien vs. Predator')
, (9, 'Batman Returns')
, (10, 'Crash')

INSERT INTO Genres (GenreName)
VALUES ('Drama'), ('Adventure'), ('Family'), ('Horror'), ('Crime');

INSERT INTO MovieGenre (MovieID, GenreID)
VALUES (1,1), (2,1), (3,2), (4,2), 
(5,1), (5,2), (6,1), (6,2), (7,2), 
(7,3), (8,4), (9,5), (10,1);
**Code to splitInt Function found online**
ALTER FUNCTION [dbo].[fn_SplitInt]
(
    @List       nvarchar(4000),
    @Delimiter  char(1)= ','
)
RETURNS @Values TABLE
(
    Position int IDENTITY PRIMARY KEY,
    Number int
)

AS

  BEGIN

  -- set up working variables
  DECLARE @Index INT
  DECLARE @ItemValue nvarchar(100)
  SELECT @Index = 1 

  -- iterate until we have no more characters to work with
  WHILE @Index > 0

    BEGIN

      -- find first delimiter
      SELECT @Index = CHARINDEX(@Delimiter,@List)

      -- extract the item value
      IF @Index  > 0     -- if found, take the value left of the delimiter
        SELECT @ItemValue = LEFT(@List,@Index - 1)
      ELSE               -- if none, take the remainder as the last value
        SELECT @ItemValue = @List

      -- insert the value into our new table
      INSERT INTO @Values (Number) VALUES (CAST(@ItemValue AS int))

      -- remove the found item from the working list
      SELECT @List = RIGHT(@List,LEN(@List) - @Index)

      -- if list is empty, we are done
      IF LEN(@List) = 0 BREAK

    END

  RETURN

  END
**Joining Genres using STRING_AGG**
CREATE PROCEDURE [dbo].[usp_MovieByGenreStrAgg] 
	@GenreID nvarchar(255)
WITH RECOMPILE

AS
BEGIN
	SET NOCOUNT ON;

SELECT DISTINCT 
	  movieTitleID.movieID
	, movieTitleID.movietitle
	, movieTitleID.genres 
FROM
(SELECT	
	  MD.MovieID
	, MD.MovieTitle
	, STRING_AGG(G.GenreName, ', ') AS Genres
FROM MovieDetails AS MD
INNER JOIN MovieGenre AS MG 
	ON MG.MovieID = MD.MovieID
INNER JOIN Genres AS G 
	ON G.GenreID = MG.GenreID
GROUP BY MD.MovieID, MD.MovieTitle) 
AS movieTitleID
INNER JOIN MovieGenre AS MG 
    ON MG.MovieID = movieTitleID.MovieID
INNER JOIN Genres AS G 
    ON G.GenreID = MG.GenreID
WHERE G.GenreID IN (SELECT Number FROM [fn_SplitInt](@GenreID, ','))

END
**Joining Genres using Cross Apply**
CREATE PROCEDURE [dbo].[usp_MovieByGenreCrsApply] 
	@GenreID nvarchar(255)
WITH RECOMPILE

AS
BEGIN
	SET NOCOUNT ON;

SELECT 
	  movieTitleID.MovieID
	, movieTitleID.MovieTitle
	, SUBSTRING(G.genres, 0, LEN(G.genres)) AS genres
FROM
	(
		SELECT 
			  MovieID
			, MovieTitle
		FROM MovieDetails
	
	) 
    AS movieTitleID
	CROSS APPLY
	(
	SELECT DISTINCT G.GenreName + ', '
	FROM Genres AS G
	INNER JOIN MovieGenre AS MG
		ON MG.GenreID = G.GenreID
	WHERE movieTitleID.MovieID = MG.MovieID 
		AND G.GenreID IN (SELECT Number FROM [fn_SplitInt](@GenreID, ','))
	FOR XML PATH('')        
	) G(genres)
	WHERE G.genres IS NOT NULL; 
END
When I executed the separate statements, I noticed something I hadn't before.

exec [usp_MovieByGenreStrAgg] '2,3' -- where 2 is Adventure, 3 is Family

|MovieID | MovieTitle | Genres |
|--------|----------------------------|-------------------------|
|3 | Kelly's Heroes | Adventure |
|4 | Where Eagles Dare | Adventure |
|5 | Star Trek: First Contact | Drama, Adventure |
|6 | The Ten Commandments | Drama, Adventure |
|7 | Clash of the Titans | Adventure, Family |

exec [usp_MovieByGenreCrsApply] '2,3' -- where 2 is Adventure, 3 is Family

|MovieID | MovieTitle | Genres |
|--------|----------------------------|-------------------------|
|3 | Kelly's Heroes | Adventure |
|4 | Where Eagles Dare | Adventure |
|5 | Star Trek: First Contact | Adventure |
|6 | The Ten Commandments | Adventure |
|7 | Clash of the Titans | Adventure, Family |

STRING_AGG returns all of the genres of a given movie whose genre has been selected, even if only one of the genres was selected as a parameter value. Cross apply returned only the genres that were selected. I realized I prefer when it displays only the genres I'm interested in. However, I noticed that when I save my cross apply in a view first and then apply the parameters, it behaves effectively like my STRING_AGG procedure. I think I prefer the way cross apply behaves (without having to save it in a view first), but I'm pretty sure that it's much slower this way without a view, so I'm back at square one with the slowness. I hope this makes sense.
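For what it's worth, the cross-apply-style output (only the selected genres concatenated) can also be sketched with STRING_AGG by filtering the genre rows before aggregating rather than after — this uses the sample schema and fn_SplitInt from the post, and is untested:

```sql
-- Keep only the requested genres, then concatenate per movie.
SELECT MD.MovieID
    , MD.MovieTitle
    , STRING_AGG(G.GenreName, ', ') AS Genres
FROM MovieDetails AS MD
INNER JOIN MovieGenre AS MG ON MG.MovieID = MD.MovieID
INNER JOIN Genres AS G ON G.GenreID = MG.GenreID
WHERE G.GenreID IN (SELECT Number FROM [fn_SplitInt](@GenreID, ','))
GROUP BY MD.MovieID, MD.MovieTitle;
```

Filtering in the WHERE clause before GROUP BY is what distinguishes the two result shapes: the usp_MovieByGenreStrAgg version aggregates all genres first and filters movies afterwards.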
ck123 (61 rep)
Jun 29, 2021, 01:03 AM • Last activity: Nov 8, 2023, 02:05 PM
3 votes
2 answers
5106 views
Two OPENJSON, return each array element in separate rows
I have a table where I save, in JSON format, the quantities that go out of and come back in for a product. The data saved in the two columns out_quantity and in_quantity is in array format.
CREATE TABLE products_info (
  product_id int NOT NULL,
  out_quantity varchar(4000) DEFAULT '[]',
  in_quantity varchar(4000) DEFAULT '[]'
 );
| product_id | out_quantity | in_quantity |
|---- |------ | ----- |
| 1 | '[{"date":"2022-03-01","quantity":10}, {"date":"2022-03-02","quantity":20}]' | '[{"date":"2022-03-15","quantity":30}]' |

Using this query
SELECT product_id, o.out_date, o.out_quantity, i.in_date, i.in_quantity FROM products_info
CROSS APPLY OPENJSON (
      out_quantity
  ) WITH (
      out_date date '$.date',
      out_quantity int '$.quantity'
  ) o
CROSS APPLY OPENJSON (
      in_quantity
  ) WITH (
      in_date date '$.date',
      in_quantity int '$.quantity'
  ) i;
This is what I get:

| product_id | out_date | out_quantity | in_date | in_quantity |
|---|---|---|---|---|
| 1 | 2022-03-01 | 10 | 2022-03-15 | 30 |
| 1 | 2022-03-02 | 20 | 2022-03-15 | 30 |

But what I'm trying to achieve is not to have duplicated data, like this:

| product_id | out_date | out_quantity | in_date | in_quantity |
|---|---|---|---|---|
| 1 | 2022-03-01 | 10 | NULL | NULL |
| 1 | 2022-03-02 | 20 | NULL | NULL |
| 1 | NULL | NULL | 2022-03-15 | 30 |

I know this is expected behaviour using cross apply, but I couldn't find any other solutions. dbfiddle I also have other columns that I get from JOINs, like product_description that I get from the products table. I haven't added them in this example. Thanks
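One way to get the non-duplicated shape (a sketch against the question's table, untested) is to run the two OPENJSON extractions as separate queries and UNION ALL them, padding the columns that belong to the other side with NULLs:

```sql
-- One row per out movement, NULLs for the "in" columns...
SELECT p.product_id, o.out_date, o.out_quantity,
       CAST(NULL AS date) AS in_date, CAST(NULL AS int) AS in_quantity
FROM products_info p
CROSS APPLY OPENJSON (p.out_quantity)
    WITH (out_date date '$.date', out_quantity int '$.quantity') o
UNION ALL
-- ...and one row per in movement, NULLs for the "out" columns.
SELECT p.product_id, NULL, NULL, i.in_date, i.in_quantity
FROM products_info p
CROSS APPLY OPENJSON (p.in_quantity)
    WITH (in_date date '$.date', in_quantity int '$.quantity') i;
```

Because each branch applies only one of the two arrays, the cross-product between out and in elements never happens.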
Mattia (33 rep)
Mar 15, 2022, 09:41 PM • Last activity: Mar 17, 2022, 03:37 AM
7 votes
2 answers
4506 views
Optimize extraction of json data via OPENJSON
I'm attempting to optimize extraction of values obtained from a REST API which returns JSON values in an array. Here's a minimal, complete, and verifiable example that reflects exactly what I'm doing.
USE tempdb;

DROP TABLE IF EXISTS dbo.json_test;

CREATE TABLE dbo.json_test
(
    json_test_id                int                 NOT NULL
        IDENTITY(1,1)
    , some_uniqueidentifier     uniqueidentifier    NULL
    , some_varchar              varchar(100)        NULL
    , the_json                  nvarchar(max)       NULL
);

INSERT INTO dbo.json_test (some_uniqueidentifier, some_varchar, the_json)
SELECT 
        some_uniqueidentifier       = NEWID()
      , some_varchar                = CONVERT(varchar(100), CRYPT_GEN_RANDOM(64), 1)
      , the_json = (
            SELECT st.* 
            FROM sys.tables st
                CROSS JOIN sys.tables st2
            WHERE st.object_id = t.object_id FOR JSON AUTO
            )
FROM sys.tables t;

;WITH src AS 
(
    SELECT jt.some_uniqueidentifier
        , jt.some_varchar
        , top_array.[key]
        , top_array.[value]
    FROM dbo.json_test jt
        CROSS APPLY OPENJSON(jt.the_json, N'lax $') top_array
),
src2 AS
(
    SELECT src.some_uniqueidentifier
        , src.some_varchar
        , src.[key]
        , src.[value]
        , inner_key = inner_array.[key]
        , inner_value = inner_array.[value]
    FROM src
        CROSS APPLY OPENJSON(src.[value], N'lax $') inner_array
)
SELECT src2.some_uniqueidentifier
    , src2.some_varchar
    , src2.[key]
    , [name]                                = MAX(CASE WHEN src2.[inner_key] = 'name'                               THEN src2.[inner_value] ELSE NULL END)
    , [object_id]                           = MAX(CASE WHEN src2.[inner_key] = 'object_id'                          THEN src2.[inner_value] ELSE NULL END)
    , [principal_id]                        = MAX(CASE WHEN src2.[inner_key] = 'principal_id'                       THEN src2.[inner_value] ELSE NULL END)
    , [schema_id]                           = MAX(CASE WHEN src2.[inner_key] = 'schema_id'                          THEN src2.[inner_value] ELSE NULL END)
    , [parent_object_id]                    = MAX(CASE WHEN src2.[inner_key] = 'parent_object_id'                   THEN src2.[inner_value] ELSE NULL END)
    , [type]                                = MAX(CASE WHEN src2.[inner_key] = 'type'                               THEN src2.[inner_value] ELSE NULL END)
    , [type_desc]                           = MAX(CASE WHEN src2.[inner_key] = 'type_desc'                          THEN src2.[inner_value] ELSE NULL END)
    , [create_date]                         = MAX(CASE WHEN src2.[inner_key] = 'create_date'                        THEN src2.[inner_value] ELSE NULL END)
    , [modify_date]                         = MAX(CASE WHEN src2.[inner_key] = 'modify_date'                        THEN src2.[inner_value] ELSE NULL END)
    , [is_ms_shipped]                       = MAX(CASE WHEN src2.[inner_key] = 'is_ms_shipped'                      THEN src2.[inner_value] ELSE NULL END)
    , [is_published]                        = MAX(CASE WHEN src2.[inner_key] = 'is_published'                       THEN src2.[inner_value] ELSE NULL END)
    , [is_schema_published]                 = MAX(CASE WHEN src2.[inner_key] = 'is_schema_published'                THEN src2.[inner_value] ELSE NULL END)
    , [lob_data_space_id]                   = MAX(CASE WHEN src2.[inner_key] = 'lob_data_space_id'                  THEN src2.[inner_value] ELSE NULL END)
    , [filestream_data_space_id]            = MAX(CASE WHEN src2.[inner_key] = 'filestream_data_space_id'           THEN src2.[inner_value] ELSE NULL END)
    , [max_column_id_used]                  = MAX(CASE WHEN src2.[inner_key] = 'max_column_id_used'                 THEN src2.[inner_value] ELSE NULL END)
    , [lock_on_bulk_load]                   = MAX(CASE WHEN src2.[inner_key] = 'lock_on_bulk_load'                  THEN src2.[inner_value] ELSE NULL END)
    , [uses_ansi_nulls]                     = MAX(CASE WHEN src2.[inner_key] = 'uses_ansi_nulls'                    THEN src2.[inner_value] ELSE NULL END)
    , [is_replicated]                       = MAX(CASE WHEN src2.[inner_key] = 'is_replicated'                      THEN src2.[inner_value] ELSE NULL END)
    , [has_replication_filter]              = MAX(CASE WHEN src2.[inner_key] = 'has_replication_filter'             THEN src2.[inner_value] ELSE NULL END)
    , [is_merge_published]                  = MAX(CASE WHEN src2.[inner_key] = 'is_merge_published'                 THEN src2.[inner_value] ELSE NULL END)
    , [is_sync_tran_subscribed]             = MAX(CASE WHEN src2.[inner_key] = 'is_sync_tran_subscribed'            THEN src2.[inner_value] ELSE NULL END)
    , [has_unchecked_assembly_data]         = MAX(CASE WHEN src2.[inner_key] = 'has_unchecked_assembly_data'        THEN src2.[inner_value] ELSE NULL END)
    , [text_in_row_limit]                   = MAX(CASE WHEN src2.[inner_key] = 'text_in_row_limit'                  THEN src2.[inner_value] ELSE NULL END)
    , [large_value_types_out_of_row]        = MAX(CASE WHEN src2.[inner_key] = 'large_value_types_out_of_row'       THEN src2.[inner_value] ELSE NULL END)
    , [is_tracked_by_cdc]                   = MAX(CASE WHEN src2.[inner_key] = 'is_tracked_by_cdc'                  THEN src2.[inner_value] ELSE NULL END)
    , [lock_escalation]                     = MAX(CASE WHEN src2.[inner_key] = 'lock_escalation'                    THEN src2.[inner_value] ELSE NULL END)
    , [lock_escalation_desc]                = MAX(CASE WHEN src2.[inner_key] = 'lock_escalation_desc'               THEN src2.[inner_value] ELSE NULL END)
    , [is_filetable]                        = MAX(CASE WHEN src2.[inner_key] = 'is_filetable'                       THEN src2.[inner_value] ELSE NULL END)
    , [is_memory_optimized]                 = MAX(CASE WHEN src2.[inner_key] = 'is_memory_optimized'                THEN src2.[inner_value] ELSE NULL END)
    , [durability]                          = MAX(CASE WHEN src2.[inner_key] = 'durability'                         THEN src2.[inner_value] ELSE NULL END)
    , [durability_desc]                     = MAX(CASE WHEN src2.[inner_key] = 'durability_desc'                    THEN src2.[inner_value] ELSE NULL END)
    , [temporal_type]                       = MAX(CASE WHEN src2.[inner_key] = 'temporal_type'                      THEN src2.[inner_value] ELSE NULL END)
    , [temporal_type_desc]                  = MAX(CASE WHEN src2.[inner_key] = 'temporal_type_desc'                 THEN src2.[inner_value] ELSE NULL END)
    , [history_table_id]                    = MAX(CASE WHEN src2.[inner_key] = 'history_table_id'                   THEN src2.[inner_value] ELSE NULL END)
    , [is_remote_data_archive_enabled]      = MAX(CASE WHEN src2.[inner_key] = 'is_remote_data_archive_enabled'     THEN src2.[inner_value] ELSE NULL END)
    , [is_external]                         = MAX(CASE WHEN src2.[inner_key] = 'is_external'                        THEN src2.[inner_value] ELSE NULL END)
    , [history_retention_period]            = MAX(CASE WHEN src2.[inner_key] = 'history_retention_period'           THEN src2.[inner_value] ELSE NULL END)
    , [history_retention_period_unit]       = MAX(CASE WHEN src2.[inner_key] = 'history_retention_period_unit'      THEN src2.[inner_value] ELSE NULL END)
    , [history_retention_period_unit_desc]  = MAX(CASE WHEN src2.[inner_key] = 'history_retention_period_unit_desc' THEN src2.[inner_value] ELSE NULL END)
    , [is_node]                             = MAX(CASE WHEN src2.[inner_key] = 'is_node'                            THEN src2.[inner_value] ELSE NULL END)
    , [is_edge]                             = MAX(CASE WHEN src2.[inner_key] = 'is_edge'                            THEN src2.[inner_value] ELSE NULL END)
FROM src2
GROUP BY src2.some_uniqueidentifier
    , src2.some_varchar
    , src2.[key]
ORDER BY src2.some_uniqueidentifier
    , src2.some_varchar
    , src2.[key];
The [query plan](https://www.brentozar.com/pastetheplan/?id=Skcotv1AO) uses a couple of nested loops joins regardless of how many rows are contained in the input table. Presumably that's an artifact of using the CROSS APPLY operator. For your fun, I've already set up a [DB Fiddle](https://dbfiddle.uk/?rdbms=sqlserver_2019&sample=adventureworks&fiddle=a548cf359ff55fa77a16677cc05050d6) . Is there a more efficient way of getting the data out of json format into a "real" set of columns? I created the code above using my on-prem SQL Server 2019 instance, however the target will be Azure SQL Database, so all the latest-and-greatest options are available.
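For context, OPENJSON can also project named columns directly through a WITH clause, which avoids the generic key/value pivot and the MAX(CASE ...) re-pivot entirely — a sketch showing just a few of the columns from the example above (untested):

```sql
SELECT jt.some_uniqueidentifier
    , jt.some_varchar
    , arr.[key]                     -- array index, as in the original query
    , obj.[name], obj.[object_id], obj.[create_date]
FROM dbo.json_test jt
CROSS APPLY OPENJSON(jt.the_json) arr               -- iterate the array
CROSS APPLY OPENJSON(arr.[value])
    WITH (                                          -- typed columns, one JSON parse
        [name]        nvarchar(128) '$.name',
        [object_id]   int           '$.object_id',
        [create_date] datetime2     '$.create_date'
    ) obj;
```

Whether this beats the key/value approach for 40-odd columns is a question for the plan, but it removes the GROUP BY and produces typed columns in one step.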
Hannah Vernon (70988 rep)
Jul 16, 2021, 08:38 PM • Last activity: Sep 24, 2021, 04:33 PM
4 votes
1 answer
4248 views
While using Cross Apply, can I force "empty" records to appear?
I have a query that is working fantastically, except that I would like to force some records to appear. I am using SQL Server 2016, and this query will ultimately be used in a VIEW and consumed by SSRS 2016. If the desired behavior can be achieved in SSRS, or even by changing to a PIVOT or some such, I am open to that option. I will put some DDL down below to mimic the example images shown. The data I am pulling from looks like the below:

(screenshot of the source data)

And I am using a cross apply to make the totals appear vertically, like so:

(screenshot of the pivoted output)

The problem is, as seen above, the records that have a State and SDI_State of 'IN' have a SUI_State of 'OH'. This leads the output to not contain SUI records for 'IN', because no records have a SUI_State of 'IN'. Is there any way to force 'IN Employee SUI', 'IN Employer SUI', etc. to appear with values of 0? There are many states where the State/SDI State and SUI State are different, so I cannot hard-code the logic in there. And here is some dumbed-down similar data with my current query.
The values aren't important in this question, only forcing the SUI records to appear for the state 'IN'.

CREATE TABLE #EmployeeTaxes
(
    Payroll int
    , SDI_State char(2)
    , SUI_State char(2)
    , State char(2)
    , SIT DECIMAL(19,2)
    , Employee_SDI DECIMAL(19,2)
    , Employer_SDI DECIMAL(19,2)
    , Employee_SUI DECIMAL(19,2)
    , Employer_SUI DECIMAL(19,2)
)

--Data in source table
INSERT INTO #EmployeeTaxes VALUES
  (4, 'IN', 'OH', 'IN', 50, 0, 0, 0, 100)
, (4, 'IN', 'OH', 'IN', 50, 0, 0, 0, 100)
, (4, 'IN', 'OH', 'IN', 50, 0, 0, 0, 100)
, (4, 'IN', 'OH', 'IN', 50, 0, 0, 0, 100)
, (4, 'IN', 'OH', 'IN', 50, 0, 0, 0, 100)
, (4, 'OH', 'OH', 'OH', 50, 0, 0, 0, 100)
, (4, 'OH', 'OH', 'OH', 50, 0, 0, 0, 100)
, (4, 'OH', 'OH', 'OH', 50, 0, 0, 0, 100)
, (4, 'OH', 'OH', 'OH', 50, 0, 0, 0, 100)

--My Current Query
SELECT Payroll
    , v.Item AS [Witholding]
    , SUM(v.TaxValue) AS Tax
    , v.OrderByNumber AS [OrderByNumber]
FROM #EmployeeTaxes
CROSS APPLY (
    VALUES ([STATE] + ' SIT', SIT, [STATE] + '110')
    , (SDI_STATE + ' Employee SDI', EMPLOYEE_SDI, SDI_STATE + '111')
    , (SDI_STATE + ' Employer SDI', EMPLOYER_SDI, SDI_STATE + '112')
    , (SUI_STATE + ' Employee SUI', EMPLOYEE_SUI, SUI_STATE + '113')
    , (SUI_STATE + ' Employer SUI', EMPLOYER_SUI, SUI_STATE + '114A')
    , (SUI_STATE + ' Total', EMPLOYEE_SUI + EMPLOYER_SUI, SUI_State + '114B')
    , (SDI_STATE + ' Total', EMPLOYEE_SDI + EMPLOYER_SDI, SDI_State + '114B')
    , ('---------------------------------', NULL, [STATE] + '121')
) v (Item, TaxValue, OrderByNumber)
GROUP BY Payroll, OrderByNumber, v.Item
ORDER BY PAYROLL, OrderByNumber

DROP TABLE #EmployeeTaxes
Jeff.Clark (627 rep)
Feb 20, 2017, 10:11 PM • Last activity: Jul 10, 2021, 01:11 PM
10 votes
2 answers
686 views
Microsoft SQL Server 2014 Nested From Query in Cross-Apply
When selecting from a nested query in an OUTER APPLY statement, the nested query seems to be evaluated only once in certain circumstances. Bug reported to the Azure Feedback Forum: https://feedback.azure.com/forums/908035-sql-server/suggestions/39428632-microsoft-sql-server-2014-incorrect-result-when-s

Is this the expected behavior, am I missing something in the documentation, or is this a bug in SQL Server? Also, is there any possibility to force evaluation of the nested query for every row?

Test Case 1

> Evaluates the nested FROM query for every row in VALUES (imho expected behaviour)
SELECT
    v,
	v2
FROM
	(VALUES (1), (2), (3), (4)) AS inner_query(v)
	OUTER APPLY (
		SELECT
			MAX(inner_v2) AS v2
		FROM (
			SELECT 
				15 AS id,
				v AS inner_v2
		) AS outer_query
		GROUP BY id
	) AS outer_apply
Result:
| v | v2|
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
Test Case 2

> It also evaluates the nested FROM query for every row in VALUES (imho expected behaviour)
SELECT
    v,
	v2
FROM
	(VALUES (1), (2), (3), (4)) AS inner_query(v)
	OUTER APPLY (
		SELECT
			MAX(inner_v2) AS v2
		FROM (
			SELECT 
				15 AS id,
				v AS inner_v2
			UNION ALL
			SELECT
				id AS id,
				TestCaseTemp2.v AS inner_v2
			FROM
				(VALUES (1337, 0)) AS TestCaseTemp2(id, v)
			WHERE TestCaseTemp2.v != 0
		) AS outer_query
		GROUP BY id
	) AS outer_apply;
Result:
| v | v2|
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
Test Case 3: evaluates the nested FROM query only once
CREATE TABLE TestCaseTemp
(
    id int,
	v int
);
INSERT INTO TestCaseTemp VALUES (1337, 0);

SELECT
    v,
	v2
FROM
	(VALUES (1), (2), (3), (4)) AS inner_query(v)
	OUTER APPLY (
		SELECT
			MAX(inner_v2) AS v2
		FROM (
			SELECT 
				15 AS id,
				v AS inner_v2
			UNION ALL
			SELECT
				id AS id,
				TestCaseTemp.v AS inner_v2
			FROM
				TestCaseTemp
			WHERE TestCaseTemp.v != 0
		) AS outer_query
		GROUP BY id
	) AS outer_apply;

DROP TABLE TestCaseTemp;
Result:
| v | v2|
|---|---|
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
Phist0ne (103 rep)
Jan 13, 2020, 04:44 PM • Last activity: Jun 18, 2021, 10:05 AM
8 votes
2 answers
19153 views
CROSS APPLY on Scalar function
I have this: SELECT A ,B ,dbo.Func(C) ,dbo.Func(D) ,dbo.Func(E) ,F FROM abcdef WHERE 0 = dbo.Func(C) + dbo.Func(D) I have read that this is not good practice because the function is called million of times and it has a bad impact on performance. I tried to rewrite it with CROSS APPLY: SELECT * FROM...
I have this:

SELECT A
    ,B
    ,dbo.Func(C)
    ,dbo.Func(D)
    ,dbo.Func(E)
    ,F
FROM abcdef
WHERE 0 = dbo.Func(C) + dbo.Func(D)

I have read that this is not good practice because the function is called millions of times and it has a bad impact on performance. I tried to rewrite it with CROSS APPLY:

SELECT *
FROM abcdef
CROSS APPLY dbo.Func(D) as FD

but it returns this error:

Invalid object name 'dbo.Func'

Can I use CROSS APPLY only on table-valued functions? And is CROSS APPLY suitable for scalar functions (converted to table-valued functions) anyway? Because in my small performance test CROSS APPLY is a bit slower.
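For reference, a minimal sketch of the usual workaround (an assumption about intent, keeping dbo.Func as a one-argument scalar function): CROSS APPLY requires a table expression, so the scalar call can be wrapped in a correlated derived table, which also lets the computed value be reused in the WHERE clause:

```sql
-- Hypothetical rewrite: wrap each scalar call in a one-row derived table.
-- CROSS APPLY needs a table source, so a correlated SELECT works where a
-- scalar function name alone does not.
SELECT A, B, FC.val, FD.val, FE.val, F
FROM abcdef
CROSS APPLY (SELECT dbo.Func(C) AS val) AS FC
CROSS APPLY (SELECT dbo.Func(D) AS val) AS FD
CROSS APPLY (SELECT dbo.Func(E) AS val) AS FE
WHERE 0 = FC.val + FD.val;
```

Note that this wrapping alone does not remove the per-row cost of the scalar UDF; rewriting dbo.Func itself as an inline table-valued function and applying that is what allows the optimizer to inline the logic.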
Muflix (1099 rep)
Apr 8, 2015, 08:12 PM • Last activity: May 24, 2021, 08:37 PM
3 votes
1 answers
1303 views
Add query hints to a CROSS APPLY
I need to apply a query hint (like `NOWAIT` or `NOLOCK`) to the `CROSS APPLY` in this query. How do I do it: SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; SELECT * FROM sys.objects obj WITH (NOLOCK) INNER JOIN sys.stats stat WITH (NOLOCK) ON stat.object_id = obj.object_id CROSS APPLY sys.dm_db_s...
I need to apply a query hint (like NOWAIT or NOLOCK) to the CROSS APPLY in this query. How do I do it:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT *
FROM sys.objects obj WITH (NOLOCK)
INNER JOIN sys.stats stat WITH (NOLOCK)
    ON stat.object_id = obj.object_id
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) sp

I don't have access to the function contents (for obvious reasons). I went ahead and removed the repro from the question to make it clearer. I'm less concerned about the ability to reproduce the blocking. (It's also happening in customer environments, and in the First Responder Kit in issue # 2548.)
Brent Ozar (43335 rep)
Aug 27, 2020, 02:09 PM • Last activity: Aug 27, 2020, 02:44 PM
11 votes
1 answers
1172 views
Apply cardinality estimation problem in SQL Server
Now, I am faced with the problem of the logic of cardinality estimation that is not quite clear for me in a seemingly rather simple situation. I encountered this situation at my work, therefore, for privacy reasons, I will provide only a general description of the problem below, however, for a more...
I am faced with cardinality estimation logic that is not quite clear to me in a seemingly rather simple situation. I encountered this situation at my work, so for privacy reasons I will provide only a general description of the problem below; for a more detailed analysis, I have simulated this problem in the AdventureWorksDW sample database. There is a query of the following form:
SELECT 
FROM 
CROSS APPLY(

	SELECT
	    
	FROM  p
	WHERE p.StartDate = Dates.d
) t
enter image description here As you can see from the execution plan presented above, the cardinality estimator estimated the number of rows for the Index Seek operation at 17,884,200 (corresponding to 2,980,700 per row from the outer side of the nested loops join), which is quite close to the actual number. Now I will modify the query and add a LEFT OUTER JOIN inside the CROSS APPLY:
SELECT 
FROM 
CROSS APPLY(

	SELECT
	    
    
	FROM  p
	LEFT JOIN  f ON p.key = f.key
		AND f.date = Dates.d
	WHERE p.StartDate = Dates.d
) t
This query gives the following plan: enter image description here Given the logical form of the query, it is natural to assume that the estimated number of rows for the Index Seek operation would remain the same. I understand that the plan search path is different; still, the part highlighted in red has not changed (the same predicates, etc.), yet the Index Seek estimate is now 664,506 (corresponding to 110,751 per row from the outer side of the nested loops join), which is a gross error and in a production environment can cause serious tempdb data spills. The above queries were executed on an instance of SQL Server 2012 (SP4) (KB4018073) - 11.0.7001.0 (x64). To get more details and simplify the analysis, I simulated this problem in the AdventureWorksDW2017 database on an instance of SQL Server 2019 (RTM) - 15.0.2000.5 (X64), executing the queries with trace flag 9481 turned on to simulate a system with cardinality estimator version 70. Below is the query with the left outer join.
DECLARE @db DATE = '20130720'
DECLARE @de DATE = '20130802'

;WITH Dates AS(

	SELECT [FullDateAlternateKey] AS d
	FROM [AdventureWorksDW2017].[dbo].[DimDate]
	WHERE [FullDateAlternateKey] BETWEEN @db AND @de
)
SELECT *
FROM Dates
CROSS APPLY(

	SELECT
	    p.[ProductAlternateKey]
	   ,f.[OrderQuantity]
	FROM [AdventureWorksDW2017].[dbo].[DimProduct] p
	LEFT JOIN [AdventureWorksDW2017].[dbo].[FactInternetSales] f ON f.ProductKey = p.ProductKey
	   AND f.[OrderDate] = Dates.d
	WHERE p.StartDate = Dates.d

) t
OPTION(QUERYTRACEON 9481 /*force legacy CE*/)
It is also worth noting that the following index was created on the DimProduct table:
CREATE NONCLUSTERED INDEX [Date_Indx] ON [dbo].[DimProduct]
(
	[StartDate] ASC,
	[EndDate] ASC
)
INCLUDE([ProductAlternateKey])
The query gives the following query plan: (1) enter image description here As you can see, the part of the query highlighted in red gives an estimate of 59,754 (~182 per row). Now I'll demonstrate the query plan without the left outer join. (2) enter image description here As you can see, the part of the query highlighted in red gives an estimate of 97,565 (~297 per row); the difference is not so great, but the cardinality estimate for the Filter (3) operator is significantly different: ~244 per row versus ~54 in the query with the left outer join. (3) – Filter predicate:
isnull([AdventureWorksDW2017].[dbo].[DimProduct].[EndDate] as [p].[EndDate],'9999-12-31 00:00:00.000')>=[AdventureWorksDW2017].[dbo].[DimDate].[FullDateAlternateKey]
Trying to dig deeper, I looked at the physical operator trees of the plans presented above. Below are the most important parts of the output of the undocumented trace flags 8607 and 8612. For plan (2):
PhyOp_Apply lookup TBL: AdventureWorksDW2017.dbo.DimProduct
…
PhyOp_Range TBL: AdventureWorksDW2017.dbo.DimProduct(alias TBL: p)(6) ASC  Bmk ( QCOL: [p].ProductKey) IsRow: COL: IsBaseRow1002  [ Card=296.839 Cost(RowGoal 0,ReW 0,ReB 327.68,Dist 328.68,Total 328.68)= 0.174387 ](Distance = 2)
              ScaOp_Comp x_cmpLe
                 ScaOp_Identifier QCOL: [p].StartDate
                 ScaOp_Identifier QCOL: [AdventureWorksDW2017].[dbo].[DimDate].FullDateAlternateKey
For plan (1):
PhyOp_Apply (x_jtInner)
…
PhyOp_Range TBL: AdventureWorksDW2017.dbo.DimProduct(alias TBL: p)(6) ASC  Bmk ( QCOL: [p].ProductKey) IsRow: COL: IsBaseRow1002  [ Card=181.8 Cost(RowGoal 0,ReW 0,ReB 327.68,Dist 328.68,Total 328.68)= 0.132795 ](Distance = 2)


                 ScaOp_Comp x_cmpLe

                    ScaOp_Identifier QCOL: [p].StartDate

                    ScaOp_Identifier QCOL: [AdventureWorksDW2017].[dbo].[DimDate].FullDateAlternateKey
As you can see, the optimizer selects different implementations of the Apply operator, PhyOp_Apply lookup in (2) and PhyOp_Apply (x_jtInner) in (1), but I still do not understand what I can extract from this. I can get the same estimate as in plan (1) by rewriting the original query without the left outer join as follows:
DECLARE @db DATE = '20130720'
DECLARE @de DATE = '20130802'

;WITH Dates AS(

	SELECT [FullDateAlternateKey] AS d
	FROM [AdventureWorksDW2017].[dbo].[DimDate]
	WHERE [FullDateAlternateKey] BETWEEN @db AND @de
)
SELECT *
FROM Dates
CROSS APPLY(

	SELECT TOP(1000000000)
		p.[ProductAlternateKey]
	FROM [AdventureWorksDW2017].[dbo].[DimProduct] p
	WHERE p.StartDate = Dates.d

) t
OPTION(QUERYTRACEON 9481 /*force legacy CE*/)
Which gives the following plan: (4) enter image description here As you can see, the estimate for the area highlighted in red coincides with plan (1), and with the PhyOp_Apply (x_jtInner) operator in the physical operator tree. Please help me answer the question: is there a way to influence this cardinality estimate, possibly by hints, by changing the query form, etc.? And help me understand why the optimizer gives such an estimate in this case.
Павел Ковалёв (194 rep)
Apr 9, 2020, 04:16 AM • Last activity: Aug 26, 2020, 02:18 PM
3 votes
1 answers
5450 views
Select with CROSS APPLY runs slow
I am trying to optimize the query to run faster. The query is the following: SELECT grp_fk_obj_id, grp_name FROM tbl_groups as g1 CROSS APPLY (SELECT TOP 1 grp_id as gid FROM tbl_groups as g2 WHERE g1.grp_fk_obj_id = g2.grp_fk_obj_id ORDER BY g2.date_from DESC, ISNULL(date_to, '4000-01-01') DESC) as...
I am trying to optimize this query to run faster:

SELECT grp_fk_obj_id, grp_name
FROM tbl_groups as g1
CROSS APPLY (SELECT TOP 1 grp_id as gid
             FROM tbl_groups as g2
             WHERE g1.grp_fk_obj_id = g2.grp_fk_obj_id
             ORDER BY g2.date_from DESC, ISNULL(date_to, '4000-01-01') DESC) as a
WHERE g1.grp_id = gid

grp_id is a primary key. grp_fk_obj_id is a foreign key to another object. There are indexes on both of these columns (I guess they come by default). It takes about half a second to complete, but I need it to work faster. I took a look at the execution plan and it shows that the "Top N Sort" has a cost of more than 90%. Also, I have noticed that if I remove the WHERE clause inside the CROSS APPLY then it runs at least 5x faster, but I need that WHERE clause in one way or another. Do you see any possibilities to improve the performance of this query?

*EDIT: table creation DDL:*

create table tbl_groups
(
    grp_id bigint identity
        constraint PK_tbl_groups primary key,
    grp_fk_obj_id bigint not null
        constraint FK_grp_fk_obj_id references tbl_other,
    grp_name varchar(30) not null,
    date_from date not null,
    date_to date
)
go

create index IDX_grp_fk_obj_id on tbl_groups (grp_fk_obj_id)
go

create index IDX_grp_name on tbl_groups (grp_name)
go
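For comparison, a common alternative shape for this latest-row-per-group pattern (a hedged sketch against the DDL above, not a guaranteed win; whether it beats the APPLY depends on indexing) is a single pass with ROW_NUMBER() instead of a correlated TOP 1:

```sql
-- Hypothetical rewrite: number the rows within each grp_fk_obj_id,
-- newest first, then keep only the first row of each group. This does
-- one scan plus one sort instead of a Top N Sort per outer row.
SELECT grp_fk_obj_id, grp_name
FROM (
    SELECT grp_fk_obj_id, grp_name,
           ROW_NUMBER() OVER (
               PARTITION BY grp_fk_obj_id
               ORDER BY date_from DESC, ISNULL(date_to, '4000-01-01') DESC
           ) AS rn
    FROM tbl_groups
) AS g
WHERE rn = 1;
```

An index leading on (grp_fk_obj_id, date_from DESC) that covers grp_name and date_to may let either form avoid the expensive sort entirely.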
EtherPaul (133 rep)
Aug 13, 2020, 11:16 AM • Last activity: Aug 13, 2020, 12:38 PM
-1 votes
1 answers
1824 views
"No column name was specified for column..." when aliasing results of CROSS APPLY
I have this table: [![enter image description here][1]][1] from which I create this View: [![enter image description here][2]][2] This view's `CardId` allows me to `JOIN` against the `Card` table with so I can retrieve the `Count` from any `CardId`. Here's my SQL: ``` SELECT * FROM ( SELECT si.CardI...
I have this table: enter image description here from which I create this View: enter image description here This view's CardId allows me to JOIN against the Card table so I can retrieve the Count for any CardId. Here's my SQL:
SELECT * FROM (
    SELECT
        si.CardId SourceCardId,
        ti.CardId TargetCardId,
        (SELECT TOP 1 r.Name
        FROM dbo.Relationship r
        WHERE r.Id = rac.RelationshipId) [Name],
        Count(*) [Count]
    FROM dbo.Relationship_AcquiredCard rac
    JOIN dbo.AcquiredCard sac ON rac.SourceAcquiredCardId = sac.Id
    JOIN dbo.AcquiredCard tac ON rac.TargetAcquiredCardId = tac.Id
    JOIN dbo.CardInstance si ON sac.CardInstanceId = si.Id
    JOIN dbo.CardInstance ti ON tac.CardInstanceId = ti.Id
    GROUP BY si.CardId, ti.CardId, rac.RelationshipId
    -- you can probably ignore everything above
    ) X
CROSS APPLY
    (values (X.TargetCardId),
            (X.SourceCardId)
    ) whatdoesthisdo(CardId) --wut
What does whatdoesthisdo do? I got the CROSS APPLY from this answer. If I try to alias in the usual way, I get this error: enter image description here Thanks!
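For context, `whatdoesthisdo(CardId)` is just a table alias plus a column alias list for the VALUES table constructor, and the CROSS APPLY unpivots each row of X into two rows. A minimal standalone sketch of the same pattern (hypothetical names and values, independent of the tables above):

```sql
-- Each input row produces one output row per entry in the VALUES list,
-- so the pair of columns (A, B) is unpivoted into a single CardId column.
SELECT x.A, x.B, pair.CardId
FROM (VALUES (1, 2), (3, 4)) AS x(A, B)          -- stand-in for the derived table X
CROSS APPLY (VALUES (x.A), (x.B)) AS pair(CardId);
-- 2 input rows become 4 output rows, CardId = 1, 2, 3, 4
```

The alias list after the table alias is what supplies the column name, which is why aliasing "the usual way" (a single name after the closing parenthesis) raises the "No column name was specified" error.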
DharmaTurtle (161 rep)
Feb 16, 2020, 04:53 PM • Last activity: Feb 16, 2020, 05:16 PM
6 votes
3 answers
2154 views
How does this derived table with cross apply work?
I borrowed some code on how to compact and uncompact a string of diagnosis codes in order to build a bridge table for my data warehouse. It works well. However, I just do not understand how it is doing the uncompacting. [Here Is the SQL Fiddle for the below code][1] create table dimDiagnosisGroup (d...
I borrowed some code to compact and uncompact a string of diagnosis codes in order to build a bridge table for my data warehouse. It works well. However, I just do not understand how it is doing the uncompacting.

Here Is the SQL Fiddle for the below code

create table dimDiagnosisGroup (dxGroupKey int, dxCodeList nvarchar(1024))

insert into dimDiagnosisGroup values
 (1,'042, 070.70, 722.10'),
 (2,'042, 070.70, 780.52, 496, 716.90, 581.9'),
 (3,'042, 070.70, 782.1, V58.69'),
 (4,'042, 070.70, 782.3, V58.69')

WITH XMLTaggedList AS (
    SELECT dxGroupKey,
           CAST('<I>' + REPLACE(dxCodeList, ', ', '</I><I>') + '</I>' AS XML) AS Diagnosis_Code_List
    FROM dimDiagnosisGroup
)
SELECT dxGroupKey,
       ExtractedDiagnosisList.X.value('.', 'VARCHAR(MAX)') AS dxCodeList2
FROM XMLTaggedList
CROSS APPLY Diagnosis_Code_List.nodes('//I') AS ExtractedDiagnosisList(X)

I understand the **XMLTaggedList** part fine. What I am not understanding is the CROSS APPLY to the **ExtractedDiagnosisList(X)**, then the **ExtractedDiagnosisList.X.value('.', 'VARCHAR(MAX)')**. When I hover over **ExtractedDiagnosisList** in the select statement, SSMS says it is a derived table. However, it kind of looks like a function to me. I am not understanding how **Diagnosis_Code_List** gets the .nodes('//I') function. Finally, the **ExtractedDiagnosisList.X.value** section just looks foreign to me in SQL. It looks like syntax from a language like C#.
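For reference, a minimal standalone sketch of the same nodes()/value() pattern (hypothetical values, independent of the tables above): nodes() is a method of the xml data type that returns one row per matching element, the alias list names that rowset and its single XML-fragment column, and value() extracts each fragment's content as a SQL type:

```sql
-- nodes('//I') yields one row per <I> element; T(X) names the rowset T
-- and its xml column X; X.value('.', ...) reads each fragment's text.
DECLARE @x xml = '<I>042</I><I>070.70</I><I>722.10</I>';

SELECT T.X.value('.', 'varchar(20)') AS code
FROM @x.nodes('//I') AS T(X);
-- one row per element: 042, 070.70, 722.10
```

So ExtractedDiagnosisList really is a derived table (of XML fragments), and the method-call syntax is the xml data type's API rather than ordinary column syntax, which is why it reads like C#.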
Anthony Genovese (2067 rep)
Jun 28, 2017, 06:44 PM • Last activity: Jul 11, 2019, 03:52 PM
19 votes
1 answers
3531 views
CROSS APPLY produces outer join
In answer to https://dba.stackexchange.com/q/239788 Erik Darling posted this code to work around for the lack of `COUNT(DISTINCT) OVER ()`: ``` SELECT * FROM #MyTable AS mt CROSS APPLY ( SELECT COUNT(DISTINCT mt2.Col_B) AS dc FROM #MyTable AS mt2 WHERE mt2.Col_A = mt.Col_A -- GROUP BY mt2.Col_A ) AS...
In answer to https://dba.stackexchange.com/q/239788 Erik Darling posted this code to work around the lack of COUNT(DISTINCT) OVER ():
SELECT      *
FROM        #MyTable AS mt
CROSS APPLY (   SELECT COUNT(DISTINCT mt2.Col_B) AS dc
                FROM   #MyTable AS mt2
                WHERE  mt2.Col_A = mt.Col_A
                -- GROUP BY mt2.Col_A 
            ) AS ca;
The query uses CROSS APPLY (not OUTER APPLY), so why is there an **outer** join in the execution plan instead of an **inner** join? enter image description here Also, why does uncommenting the GROUP BY clause result in an inner join? enter image description here I don't think the data is important, but copying from that given by kevinwhat on the other question:
create table #MyTable (
Col_A varchar(5),
Col_B int
)

insert into #MyTable values ('A',1)
insert into #MyTable values ('A',1)
insert into #MyTable values ('A',2)
insert into #MyTable values ('A',2)
insert into #MyTable values ('A',2)
insert into #MyTable values ('A',3)

insert into #MyTable values ('B',4)
insert into #MyTable values ('B',4)
insert into #MyTable values ('B',5)
Paul White (95060 rep)
Jun 5, 2019, 02:08 PM • Last activity: Jun 8, 2019, 01:24 PM
1 votes
1 answers
1560 views
Refactor SQL to left join on TVF requiring input variable
I would like to join two tables and a table-valued function. The TVF take a date as input variable. The expected output is: Date, LimitName, Exposure, ProductValueEligible, ProductValueNonEligible The Tables: 1. TABLE_A: Date, LimitName, LimitValue, Limit >> main table that specify limits I want to...
I would like to join two tables and a table-valued function. The TVF takes a date as its input variable. The expected output is: Date, LimitName, Exposure, ProductValueEligible, ProductValueNonEligible.

The tables:
1. TABLE_A: Date, LimitName, LimitValue, Limit >> the main table that specifies the limits I want to check
2. TABLE_B: Date, LimitName, ProductType (one-to-many) >> a helper table that specifies which product types map into each limit
3. TABLE_C (TVF): (several identification columns, like ID value, dates) + Date, ProductType, ProductValue, IsProductEligible >> the main product data to be aggregated and mapped into the limits

I am able to get the expected result using the query below. However, because the TVF takes a date as an input variable, this query requires specifying the date three times. **I would like to refactor the query to specify the date once, and have that condition applied to the TVF.** Reading up on this, I think that CROSS APPLY and OUTER APPLY should be able to do this, but I cannot seem to get syntax that works. I end up having to put WHERE conditions in place, which drops rows (works like an inner join rather than a left join). I would appreciate some direction on how to make this work.
select
    a.Date
    , a.LimitName
    , sum(c_elig.SumProductValue) as ProductValueEligible
    , sum(c_non.SumProductValue) as ProductValueNonEligible
-- TABLE_A: Date, LimitName, LimitValue, Limit
from Limits a
-- TABLE_B: Date, LimitName, ProductType (one-to-many)
left join LimitsProductTypes b
    on b.Date = a.Date
    and b.LimitName = a.LimitName
-- TABLE_C (TVF): + Date, ProductType, ProductValue, IsProductEligible
left join (
    -- Find eligible rows, and group by Product
    select Date, ProductType, sum(ProductValue) as SumProductValue
    from Function_DataSet('2019-05-15') --<------------------------------------- TVF, takes a date input
    where IsProductEligible = 'TRUE'
    group by Date, ProductType
) c_elig
    on c_elig.Date = a.Date
    and c_elig.ProductType = b.ProductType
left join (
    -- Find non-eligible rows, and group by Product
    select Date, ProductType, sum(ProductValue) as SumProductValue
    from Function_DataSet('2019-05-15') --<------------------------------------- TVF, takes a date input
    where IsProductEligible = 'FALSE'
    group by Date, ProductType
) c_non
    on c_non.Date = a.Date
    and c_non.ProductType = b.ProductType
where a.Date = '2019-05-15' --<------------------------------------------- want to just specify date here
group by a.Date, a.LimitName
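One possible shape for the refactor (a hedged sketch, not a tested solution: it assumes Function_DataSet returns rows for the supplied date and replaces the two derived tables with conditional aggregation) is to call the TVF once per limit row with OUTER APPLY, so the date appears only once:

```sql
-- Hypothetical rewrite: OUTER APPLY preserves limit rows with no matching
-- product data (left-join semantics) while passing the date into the TVF
-- from a single variable.
declare @d date = '2019-05-15';

select
    a.Date
    , a.LimitName
    , sum(case when f.IsProductEligible = 'TRUE'  then f.ProductValue end) as ProductValueEligible
    , sum(case when f.IsProductEligible = 'FALSE' then f.ProductValue end) as ProductValueNonEligible
from Limits a
left join LimitsProductTypes b
    on b.Date = a.Date
    and b.LimitName = a.LimitName
outer apply (
    select ProductType, ProductValue, IsProductEligible
    from Function_DataSet(a.Date)            -- date passed through once
    where ProductType = b.ProductType
) f
where a.Date = @d
group by a.Date, a.LimitName;
```

Unlike a join, an APPLY may reference columns from the tables to its left (here a.Date and b.ProductType), which is what lets the outer predicate drive the TVF's input.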
Roberto (119 rep)
May 17, 2019, 06:25 PM • Last activity: May 20, 2019, 01:41 AM
2 votes
1 answers
1504 views
Refactoring subquery to JOIN and CROSS APPLY, get only one row for each record in parent table
Given the following query: SELECT p.ProductName, CASE WHEN EXISTS(SELECT 1 FROM Product WHERE ProductSuperID = p.ProductSuperID AND HasImage = 1) THEN 1 ELSE 0 END AS HasImage, (SELECT Sum(StockBalance) FROM Product WHERE ProductSuperID = p.ProductSuperID) AS StockBalance, CASE WHEN EXISTS(SELECT 1...
Given the following query:

SELECT p.ProductName,
    CASE WHEN EXISTS(SELECT 1 FROM Product
                     WHERE ProductSuperID = p.ProductSuperID AND HasImage = 1)
         THEN 1 ELSE 0 END AS HasImage,
    (SELECT Sum(StockBalance) FROM Product
     WHERE ProductSuperID = p.ProductSuperID) AS StockBalance,
    CASE WHEN EXISTS(SELECT 1 FROM Product
                     WHERE ProductSuperID = p.ProductSuperID AND Price IS NULL)
          AND EXISTS(SELECT 1 FROM Product
                     WHERE ProductSuperID = p.ProductSuperID AND DiscountPrice IS NULL)
         THEN 0 ELSE 1 END AS HasPrice
FROM ProductSuper p

-- SCHEMA
CREATE TABLE ProductSuper (
    ProductSuperID int,
    ProductName varchar(255)
)

CREATE TABLE Product (
    ProdID int,
    ProductSuperID int,
    HasImage bit,
    StockBalance int,
    Price decimal(10,2),
    DiscountPrice decimal(10,2)
)

INSERT INTO ProductSuper (ProductSuperID, ProductName) VALUES
 (1, 'Product 1'),
 (2, 'Product 2')

INSERT INTO Product (ProductSuperID, HasImage, StockBalance, Price, DiscountPrice) VALUES
 (1, 0, 10, 10.00, 9.00),
 (1, 0, 0, 10.00, 9.00),
 (2, 0, 10, 10.00, 9.00),
 (2, 0, 2, 10.00, 9.00),
 (2, 1, 5, 10.00, 9.00)
I want to learn how I could best rewrite it to use either JOIN or CROSS APPLY, if anything to avoid some code duplication. I tried writing a JOIN-based version (and one with APPLY) but I'm getting more than one result for each row in the ProductSuper table, whereas I only want one row, i.e. the expected result:

+-------------+-----+-----+-----+
|  Product 1  |  0  |  10 |  1  |
+-------------+-----+-----+-----+
|  Product 2  |  1  |  17 |  1  |
+-------------+-----+-----+-----+

(I'm aware that for this particular piece of code there is little benefit from rewriting, since the subqueries are fast. But still, this is only an example.) Thanks.
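One APPLY-based shape that keeps exactly one row per parent (a hedged sketch of the intended rewrite, using the schema above) is to make the applied subquery a scalar aggregate over the child table, since a scalar aggregate always returns exactly one row:

```sql
-- Hypothetical rewrite: one correlated pass over Product per ProductSuper row.
-- MAX over a CAST of the bit column stands in for EXISTS(HasImage = 1), and
-- conditional SUMs stand in for the two NULL-price EXISTS checks.
SELECT p.ProductName, x.HasImage, x.StockBalance, x.HasPrice
FROM ProductSuper p
CROSS APPLY (
    SELECT
        MAX(CAST(HasImage AS int)) AS HasImage,
        SUM(StockBalance)          AS StockBalance,
        CASE WHEN SUM(CASE WHEN Price IS NULL THEN 1 ELSE 0 END) > 0
              AND SUM(CASE WHEN DiscountPrice IS NULL THEN 1 ELSE 0 END) > 0
             THEN 0 ELSE 1 END     AS HasPrice
    FROM Product
    WHERE ProductSuperID = p.ProductSuperID
) x;
```

Because the inner query has aggregates and no GROUP BY, it cannot multiply rows, which avoids the duplication a plain JOIN produces.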
Marc.2377 (187 rep)
Jan 21, 2019, 03:05 PM • Last activity: Mar 26, 2019, 07:49 PM
7 votes
1 answers
2102 views
How to get cross apply to operate row by row on a view?
We have a view that is optimized for single item queries (200ms no parallelism): select * from OptimizedForSingleObjectIdView e2i where ObjectId = 3374700 It also works on small sets of static ids (~5). select * from OptimizedForSingleObjectIdView e2i where ObjectId in (3374700, 3374710, 3374720, 33...
We have a view that is optimized for single-item queries (200 ms, no parallelism):

select *
from OptimizedForSingleObjectIdView e2i
where ObjectId = 3374700

It also works on small sets of static ids (~5):

select *
from OptimizedForSingleObjectIdView e2i
where ObjectId in (3374700, 3374710, 3374720, 3374730, 3374740);

However, if the objects come from an external source, then it generates a slow plan. The execution plan shows that the execution branch for the view part is ignoring the predicate on ObjectId, whereas in the original case it uses it to perform index seeks.

select v.*
from (
    select top 1 ObjectId
    from Objects
    where ObjectId % 10 = 0
    order by ObjectId
) o
join OptimizedForSingleObjectIdView v -- (also tried inner loop join)
    on v.ObjectId = o.ObjectId;

We don't wish to invest in "dual" optimizing the view for non-singular cases. Rather, the solution we "seek" is to repetitively call the view once per object **without resorting to an SP**. Most of the time the following solution calls the view row by row; however, not this time, and not even for just 1 object:

select v.*
from (
    select top 1 ObjectId
    from Objects
    where ObjectId % 10 = 0 -- non-trivial predicate
    order by ObjectId
) o
cross apply (
    select top 2000000000 *
    from OptimizedForSingleObjectIdView v_
    where ObjectId = o.ObjectId
    order by v_.SomeField
) v;

At one time I thought there was a claim that cross apply was guaranteed row-by-row execution when it calls a UDF, but this also failed:

create function FunctionCallingView(@pObjectId bigint)
returns table
as return
    select *
    from OptimizedForSingleObjectIdView
    where ObjectId = @pObjectId;

select v.*
from (
    select top 1 ObjectId
    from Objects
    where ObjectId % 10 = 0
    order by ObjectId
) o
cross apply FunctionCallingView(o.ObjectId) v

Adding option(force order) did not help; however, there are two hash hints in the view already. Temporarily removing them did not help and slows down the single case.
Here is a snippet of the estimated plan for the function-based slow case. The estimate of 1 row is correct. Far off to the right (not shown) there is a seek predicate that does not include that top 1 result. This seems similar to other cases we have where singular probed values from table seeks are not used as seek predicates elsewhere. enter image description here
crokusek (2110 rep)
Jan 25, 2019, 01:15 AM • Last activity: Jan 26, 2019, 10:12 AM
1 votes
1 answers
76 views
is there a way to improve this query by not re calculating functions? cross apply?
when I set up replication I like to take a count of each table in the publisher, then I go to the subscriber I do the same thing in order to compare both pictures. this is the picture from the publisher: [![enter image description here][1]][1] in order to generate this picture I run the following qu...
When I set up replication I like to take a count of each table in the publisher; then I go to the subscriber and do the same thing, in order to compare both pictures. This is the picture from the publisher: enter image description here In order to generate this picture I run the following query:

USE ORCASTG_CA18_Repl
go

SELECT Publication = P.name
    --,Publication_description = p.description
    ,p.immediate_sync
    ,p.allow_anonymous
    ,p.replicate_ddl
    ,the_schema = OBJECT_SCHEMA_NAME(a.objid)
    ,TableName = a.name
    ,DestinationServer = s.srvname
    ,DestinationDB = s.dest_db
    ,DestinationSchema = a.dest_owner
    ,DestinationTable = A.dest_table
    ,Radhe = 'exec sp_count ' + '''' + QUOTENAME(OBJECT_SCHEMA_NAME(a.objid)) + '.' + QUOTENAME(a.name) + ''''
FROM dbo.syspublications P
INNER JOIN dbo.sysarticles A
    ON P.pubid = A.pubid
INNER JOIN dbo.syssubscriptions s
    ON a.artid = s.artid
WHERE 1=1
AND s.dest_db <> 'virtual'

and that produces the following script:

exec sp_count '[dbo].[repl_application_placement]'
exec sp_count '[dbo].[repl_ApplicationCalendarRequest]'
exec sp_count '[dbo].[repl_ApplicationChecklist]'
exec sp_count '[dbo].[repl_ApplicationFlightDetail]'
exec sp_count '[dbo].[repl_ApplicationFlightLegDetail]'
exec sp_count '[dbo].[repl_ApplicationFlightReference]'
exec sp_count '[dbo].[repl_ApplicationProfile]'
exec sp_count '[dbo].[repl_ApplicationRequestFlightOption]'
exec sp_count '[dbo].[repl_ApplicationSkill]'
exec sp_count '[dbo].[repl_ApplicationVisaDetail]'
exec sp_count '[dbo].[repl_camp_profile]'

My question is: is there a way I can calculate the following line without re-running those functions?

,Radhe = 'exec sp_count ' + '''' + QUOTENAME(OBJECT_SCHEMA_NAME(a.objid)) + '.' + QUOTENAME(a.name) + ''''

I thought about cross apply!
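The pattern hinted at in the closing line can be sketched as follows (a hedged sketch; cv is a hypothetical alias, the rest of the names come from the query above): CROSS APPLY (VALUES ...) evaluates an expression once per row and exposes it under a name that later expressions can reuse:

```sql
-- Hypothetical refactor: compute the quoted two-part name once per row,
-- then reuse it wherever needed instead of repeating the function calls.
SELECT Publication = P.name
    ,the_schema = OBJECT_SCHEMA_NAME(a.objid)
    ,TableName = a.name
    ,Radhe = 'exec sp_count ' + '''' + cv.qualified_name + ''''
FROM dbo.syspublications P
INNER JOIN dbo.sysarticles A ON P.pubid = A.pubid
INNER JOIN dbo.syssubscriptions s ON a.artid = s.artid
CROSS APPLY (VALUES (
    QUOTENAME(OBJECT_SCHEMA_NAME(a.objid)) + '.' + QUOTENAME(a.name)
)) AS cv(qualified_name)
WHERE s.dest_db <> 'virtual';
```

The computed column can also be referenced in WHERE or ORDER BY, which a SELECT-list alias alone cannot.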
Marcello Miorelli (17274 rep)
Sep 26, 2018, 05:52 PM • Last activity: Sep 27, 2018, 05:45 AM
6 votes
3 answers
4265 views
Using CROSS APPLY OPENJSON causes Azure to hang
I have a table with around 8 million rows with a schema of: CREATE TABLE [dbo].[Documents]( [Id] [uniqueidentifier] NOT NULL, [RemoteId] [int] NOT NULL, [Json] [nvarchar](max) NULL, [WasSuccessful] [bit] NOT NULL, [StatusCode] [int] NULL, [Created] [datetime2](7) NULL, CONSTRAINT [PK_Documents] PRIM...
I have a table with around 8 million rows with a schema of:

CREATE TABLE [dbo].[Documents](
    [Id] [uniqueidentifier] NOT NULL,
    [RemoteId] [int] NOT NULL,
    [Json] [nvarchar](max) NULL,
    [WasSuccessful] [bit] NOT NULL,
    [StatusCode] [int] NULL,
    [Created] [datetime2](7) NULL,
    CONSTRAINT [PK_Documents] PRIMARY KEY CLUSTERED
    (
        [Id] ASC
    ) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

ALTER TABLE [dbo].[Documents] ADD CONSTRAINT [DF_Documents_Id] DEFAULT (newsequentialid()) FOR [Id]
GO

ALTER TABLE [dbo].[Documents] ADD CONSTRAINT [DF_Documents_Created] DEFAULT (getdate()) FOR [Created]
GO

The JSON documents have a structure of:

{
  "Id": 1,
  "Data": [
    { "Id": 99, "Name": "Person 1" },
    { "Id": 100, "Name": "Person 2" }
  ]
}

I am attempting to extract the JSON array from within the document and insert it into a new table using the following query:

;WITH CTE (Json) AS
(
    SELECT TOP 10 JSON_QUERY([Json], '$.Data')
    FROM Documents
    WHERE ISJSON([Json]) > 0
)
INSERT INTO [dbo].[ParsedDocuments] (Id, Name)
SELECT
    JSON_VALUE([Value], '$.Id') AS [Id],
    JSON_VALUE([Value], '$.Name') AS [Name]
FROM CTE
CROSS APPLY OPENJSON([Json]) as X

What I'm finding is that if I extract and query a sample of the data, say 1000 rows, everything works as expected and the data is inserted into the destination table. However, when I query the main table the server appears to just hang and become unresponsive. I suspect that it is attempting to run the cross apply across all the rows before inserting data. Is there any way to improve performance? Or to allow the job to begin "streaming" results into the destination table rather than attempting to batch them up? Finally, as you can see, I am employing the use of "TOP 10"; however, I am still experiencing the hanging, and I am unsure as to why.
Query plans for the large table are never generated; however, when running against a sample of the data, the query plan can be found [here](https://www.brentozar.com/pastetheplan/?id=SJZS1VJmm) .
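One common mitigation worth noting here (a hedged sketch under assumptions: the table and column names come from the question, and the clustered key on [Id] is used to page through the source) is to process the table in keyset-paged batches, so each INSERT commits a bounded amount of work instead of shredding all 8 million rows in one statement:

```sql
-- Hypothetical batching loop: walk the clustered key in chunks so each
-- OPENJSON shred and INSERT touches at most @BatchSize source rows.
DECLARE @BatchSize int = 1000;
DECLARE @LastId uniqueidentifier = '00000000-0000-0000-0000-000000000000';
DECLARE @MaxId uniqueidentifier;

WHILE 1 = 1
BEGIN
    -- Upper bound of the next batch of source rows, in clustered-key order.
    SELECT @MaxId = MAX(b.[Id])
    FROM (SELECT TOP (@BatchSize) [Id]
          FROM dbo.Documents
          WHERE [Id] > @LastId
          ORDER BY [Id]) AS b;

    IF @MaxId IS NULL BREAK;   -- no source rows left

    INSERT INTO dbo.ParsedDocuments (Id, Name)
    SELECT JSON_VALUE(x.[value], '$.Id'),
           JSON_VALUE(x.[value], '$.Name')
    FROM dbo.Documents d
    CROSS APPLY OPENJSON(JSON_QUERY(d.[Json], '$.Data')) AS x
    WHERE d.[Id] > @LastId AND d.[Id] <= @MaxId
      AND ISJSON(d.[Json]) > 0;

    SET @LastId = @MaxId;
END;
```

Each iteration's INSERT is its own transaction, so rows begin appearing in the destination table early and the log and memory footprint per statement stay small.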
Darren (175 rep)
Jul 7, 2018, 09:25 PM • Last activity: Jul 8, 2018, 02:22 PM
Showing page 1 of 20 total questions