Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes

1 answers

115 views

Did performance when querying partitioned tables using min/max functions or TOP improve after SQL Server 2022?

sql-server partitioning max top sql-server-2025

With partitioned tables in SQL Server, there is a [notorious major performance issue](https://web.archive.org/web/20130412223317/https://connect.microsoft.com/SQLServer/feedback/details/240968/partition-table-using-min-max-functions-and-top-n-index-selection-and-performance) when using using min/max...

                                  With partitioned tables in SQL Server, there is a [notorious major performance issue](https://web.archive.org/web/20130412223317/https://connect.microsoft.com/SQLServer/feedback/details/240968/partition-table-using-min-max-functions-and-top-n-index-selection-and-performance)  when using using min/max functions or TOP. Microsoft document workarounds for it [here](https://learn.microsoft.com/en-us/troubleshoot/sql/database-engine/performance/decreased-performance-run-aggregating-clause) . I am confident that this was not fixed in SQL Server 2022. Microsoft surely would have updated the workaround list if giving them more money was a workaround.

However, was this changed after SQL Server 2022? I am sure that I saw a working link to [this Connect item](https://connect.microsoft.com/SQLServer/feedback/details/240968/partition-table-using-min-max-functions-and-top-n-index-selection-and-performance)  in 2024. Today, I cannot find it even on the modern [Azure suggestions thing](https://feedback.azure.com/d365community/forum/04fe6ee0-3b25-ec11-b6e6-000d3a4f0da0)  that all of the Connect items were migrated to. This suggests to me that something has happened with this decade-old bug in the last few years.

I cannot answer this myself, since I do not have access to SQL Server 2025 or any bleeding-edge Azure stuff, I hear that preview builds for SQL Server 2025 have been released.

J. Mini (1225 rep)

Feb 2, 2025, 12:47 PM • Last activity: Aug 2, 2025, 10:54 PM

17 votes

2 answers

1627 views

Unnecessary sort with TOP PERCENT?

sql-server index execution-plan top sort-operator

### Setup ``` -- Create a heap table of numbers from 1 to 100 SELECT TOP (100) i = IDENTITY(int, 1, 1) INTO #T FROM master.dbo.spt_values; -- Add a clustered primary key ALTER TABLE #T ADD PRIMARY KEY CLUSTERED (i); ``` ### Test query Display 9% of the rows in primary key order: ``` SELECT TOP (9e)...

### Setup

-- Create a heap table of numbers from 1 to 100
SELECT TOP (100)
    i = IDENTITY(int, 1, 1)
INTO #T
FROM master.dbo.spt_values;

-- Add a clustered primary key
ALTER TABLE #T
    ADD PRIMARY KEY CLUSTERED (i);

### Test query Display 9% of the rows in primary key order:

SELECT TOP (9e) PERCENT 
    i 
FROM #T 
ORDER BY 
    i ASC;

DROP TABLE #T;

dbfiddle demo ### Results

### Execution plan

--- ## Question Why does SQL Server sort the column when the clustered index provides exactly that order? --- ### More rows If I increase the number of rows in the table, I get an Eager Spool instead of a Sort and the index is scanned in order:

-- Create a heap table of numbers from 1 to 1,000
SELECT TOP (1000)
    i = IDENTITY(int, 1, 1)
INTO #T
FROM master.dbo.spt_values;

-- Add a clustered primary key
ALTER TABLE #T
    ADD PRIMARY KEY CLUSTERED (i);

-- 0.9% now
SELECT TOP (9e-1) PERCENT 
    i 
FROM #T 
ORDER BY 
    i ASC;

DROP TABLE #T;

Paul White (95060 rep)

Feb 18, 2025, 08:34 AM • Last activity: Feb 20, 2025, 08:52 AM

4 votes

2 answers

2819 views

Pulling Top queries returns NULL in query plan and sql text

sql-server dmv top

I'm using the following code to pull our top 20 queries (ordered by CPU): SELECT TOP 20 qs.sql_handle ,qs.execution_count ,qs.total_worker_time AS [Total CPU] ,qs.total_worker_time / 1000000 AS [Total CPU in Seconds] ,(qs.total_worker_time / 1000000) / qs.execution_count AS [Average CPU in Seconds]...

                                  I'm using the following code to pull our top 20 queries (ordered by CPU):

    SELECT TOP 20 qs.sql_handle
    	,qs.execution_count
    	,qs.total_worker_time AS [Total CPU]
    	,qs.total_worker_time / 1000000 AS [Total CPU in Seconds]
    	,(qs.total_worker_time / 1000000) / qs.execution_count AS [Average CPU in Seconds]
    	,qs.total_elapsed_time
    	,qs.total_elapsed_time / 1000000 AS [Total Elapsed Time in Seconds]
    	,st.TEXT
    	,qp.query_plan
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    ORDER BY qs.total_worker_time DESC

However, I'm seeing this:


Can anyone shed some light as to why the Query Plan and SQL Text are showing up NULL?  Are they some sort of system process or external application?  We are running SQL 2008 R2.

Thanks, as always, everyone!
                                

Kris Gruttemeyer (3879 rep)

Sep 10, 2014, 02:11 PM • Last activity: Jan 6, 2023, 11:04 AM

2 votes

1 answers

1132 views

Adding 'top 20' massively slows down query!

sql-server sql-server-2016 execution-plan top table-spool

I have a query: select * from Aview where field=20 order by id desc This returns 2700 rows from the view in about 1 second. Adding 'top 20' to the query makes MSSQL return in 43 seconds!! This has been a HARD to reproduce issue, and doing a rebuild of statistics fixes the issue for a couple of days,...

                                  I have a query:

    select * from Aview where field=20
    order by id desc

This returns 2700 rows from the view in about 1 second.

Adding 'top 20' to the query makes MSSQL return in 43 seconds!!

This has been a HARD to reproduce issue, and doing a rebuild of statistics fixes the issue for a couple of days, but then it come back.

I've been working with SQL for decades and I've never once seen a situation where adding a 'top' causes the time to increase.

Looking at the execution plan, it seems to be doing a lazy spool of 960 MILLION rows if you do the top 20, but not if you don't.

Traderhut Games (173 rep)

Jun 29, 2022, 03:58 PM • Last activity: Jul 5, 2022, 02:01 PM

0 votes

1 answers

179 views

Select Top item from group

oracle group-by top

I have a simple table | ID |Product |Version|Price|Notes| | -- | -- |--| --| --| | 1| Pizza| 1| 10.00| It is good| | 2| Pizza| 2| 11.00| The best yet| | 3| Soda| 1| 3:00| | | 4| Hot dog| 1| 8.00| 20% meat| | 5| Soda| 2| 3.00| Rebranded| I want to get out the ID of the latest version of the product....

                                  I have a simple table

| ID |Product |Version|Price|Notes|
| -- | -- |--| --| --|
| 1| Pizza| 1| 10.00| It is good|
| 2| Pizza| 2| 11.00| The best yet|
| 3| Soda| 1| 3:00| |
| 4| Hot dog| 1| 8.00| 20% meat|
| 5| Soda| 2| 3.00| Rebranded|


I want to get out the ID of the latest version of the product. Product is a text value (I have no control on this)

| ID |
| -- |
| 2| 
| 4|
| 5|

I will then be using this in an IN clause hence just wanting the ID

Normally I would look to use a group by product, then use max(version), however, I am not sure how to get the ID as it is not a grouped field.

I can't guarantee that the highest version will also have the highest ID (I do not control the DB)
                                

Davey (1 rep)

Jul 8, 2021, 03:08 PM • Last activity: Jul 14, 2021, 09:53 PM

0 votes

2 answers

846 views

Strange behavior of TOP 1 clause sql server

sql-server query order-by collation top

I have a SQL Server database and restored it on another server on a different machine. Now I run following same query on both DBs which produces same below result set. ``` select person_id, visit_id, verification_code, mobileNo, created_date from person order by verification_code, created_date DESC...

I have a SQL Server database and restored it on another server on a different machine. Now I run following same query on both DBs which produces same below result set.

select person_id, visit_id, verification_code, mobileNo, created_date
from person
order by verification_code, created_date DESC

But when I add TOP 1 clause to the above query, original DB shows the 2nd row (ignores first row)

select TOP 1 person_id, visit_id, verification_code, mobileNo, created_date
from person
order by verification_code, created_date DESC

while restored DB shows first row

I thought it might be due to different collation, but I checked both servers have same collation.

user1543848 (75 rep)

Apr 20, 2021, 09:03 AM • Last activity: Apr 20, 2021, 11:38 AM

3 votes

1 answers

5450 views

Select with CROSS APPLY runs slow

sql-server execution-plan sorting cross-apply top

I am trying to optimize the query to run faster. The query is the following: SELECT grp_fk_obj_id, grp_name FROM tbl_groups as g1 CROSS APPLY (SELECT TOP 1 grp_id as gid FROM tbl_groups as g2 WHERE g1.grp_fk_obj_id = g2.grp_fk_obj_id ORDER BY g2.date_from DESC, ISNULL(date_to, '4000-01-01') DESC) as...

                                  I am trying to optimize the query to run faster.
The query is the following:

    SELECT grp_fk_obj_id, grp_name
    FROM tbl_groups as g1
             CROSS APPLY (SELECT TOP 1 grp_id as gid
                          FROM tbl_groups as g2
                          WHERE g1.grp_fk_obj_id = g2.grp_fk_obj_id
                          ORDER BY g2.date_from DESC, ISNULL(date_to, '4000-01-01') DESC) as a
    WHERE g1.grp_id = gid

grp_id is a primary key. grp_fk_obj_id is a foreign key to another object. There are indexes on both of these columns (I guess it comes as default).

It takes about half a second to complete but I need it to make work faster. I took a look at the execution plan and it shows that "Top N sort" has a cost of more than 90%. Also, I have noticed that if I remove a where clause inside the cross apply then it runs at least 5x faster, but I need that where clause in one way or another.

Do you see any possibilities to improve the performance of this query?




*EDIT:
table creation DDL:*

    create table tbl_groups
    (
        grp_id        bigint identity
            constraint PK_tbl_groups
                primary key,
        grp_fk_obj_id bigint      not null
            constraint FK_grp_fk_obj_id
                references tbl_other,
        grp_name      varchar(30) not null,
        date_from     date        not null,
        date_to       date
    )
    go
    
    create index IDX_grp_fk_obj_id
        on tbl_groups (grp_fk_obj_id)
    go
    
    create index IDX_grp_name
        on tbl_groups (grp_name)
    go
                                

EtherPaul (133 rep)

Aug 13, 2020, 11:16 AM • Last activity: Aug 13, 2020, 12:38 PM

3 votes

1 answers

3133 views

Using TOP and getting different result sets

sql-server t-sql top

I am trying to write an efficient query for deleting chunks of data. To this end I hoped to avoid an index scan by using the primary key to get the oldest records. However I'm seeing some unexpected results returned. I hoped this SELECT TOP 15 OrderID FROM [Order] Would give me the oldest 15 records...

                                  I am trying to write an efficient query for deleting chunks of data. To this end I hoped to avoid an index scan by using the primary key to get the oldest records. However I'm seeing some unexpected results returned.

I hoped this

    SELECT TOP 15 OrderID FROM [Order]
Would give me the oldest 15 records because I could rely on the primary key incrementing and therefore the order of storage would be low to high in the table.

However, this returns a different result set

    SELECT TOP 15 OrderID FROM [Order] ORDER BY DateCreated ASC

Which seems to be a more accurate but more expensive way of getting the result I need.

Confusingly, this

    SELECT TOP 15 * FROM [Order]
Gives a different set of OrderID s (PK) to this

    SELECT TOP 15 OrderID FROM [Order]

I understand that http://msdn.microsoft.com/en-gb/library/ms189463.aspx  explains that order cannot be guaranteed without an ORDER BY clause but expected the PK to order for me and can't explain the differences between the final two select clauses.

reticentKoala (629 rep)

Sep 10, 2014, 03:24 PM • Last activity: May 7, 2020, 07:24 PM

1 votes

1 answers

100 views

Next 10% of rows

sybase sybase-sql-anywhere top

I know TOP can return first 10% of row results, but how would I then get the next 10%? I am trying to break up a query that crashes due to insufficient memory (java.lang.OutOfMemoryError), and want to return the first 10% of results, then edit it so it returns the next 10%, and so on. I am using SQL...

                                  I know TOP can return first 10% of row results, but how would I then get the next 10%?
I am trying to break up a query that crashes due to insufficient memory (java.lang.OutOfMemoryError), and want to return the first 10% of results, then edit it so it returns the next 10%, and so on.
I am using SQL Anywhere 12.
                                

KAE (143 rep)

Apr 6, 2020, 04:15 PM • Last activity: Apr 6, 2020, 06:26 PM

0 votes

1 answers

685 views

How to show multiple record if has top 1

sql-server join query subquery top

Hi currently having difficulty to query the result sorry a newbie here. i am trying to find the result of these two tables. i want to get all the records on table 2 with an effective date < 2019-12-05. Table 1 Key 2 Name ---------- ---------- 78 Test 1 222 Test 2 824 Test 3 1980 Test 4 3747 Test 5 P...

                                  Hi currently having difficulty to query the result sorry a newbie here. i am trying to find the result of these two tables. i want to get all the records on table 2 with an effective date < 2019-12-05. 

Table 1

Key 2          Name
---------- ---------- 
    78     Test 1 
    222    Test 2            
    824    Test 3  
    1980   Test 4
    3747   Test 5

Please see image for table 2

    select top 1 (md2.table2) from table2 md2
    where md2.EFFECTIVEDATE < '2019-12-05'
    and md2.key2 in (select key2 from table1)
    order by EFFECTIVEDATE desc

my query only getting one record i want to get all the record on key 2

Nathan (29 rep)

Jan 21, 2020, 01:38 PM • Last activity: Jan 21, 2020, 05:51 PM

4 votes

1 answers

1542 views

How to get tsql TOP PERCENT with OFFSET

sql-server t-sql top offset-fetch

select top 10 percent * from sys.databases order by database_id I want to get the same 10 percent result using offset but this query works wrong offset 0 rows fetch next (select cast((count(*) * 10/100) as int ) from sys.databases) rows only

                                      select top 10 percent *
    from sys.databases
    order by database_id
    
I want to get the same 10 percent result using offset but this query works wrong

    offset 0 rows
    fetch next (select cast((count(*) * 10/100) as int ) from sys.databases) rows only

igelr (2162 rep)

Mar 13, 2019, 08:45 AM • Last activity: Mar 14, 2019, 08:16 AM

0 votes

1 answers

155 views

Top 10 each year

sql-server top

I need to write a query to show the number of tracks sold per artist per year for the top 10 best selling artists. Take the overall top 10 for all years. This is what I have: SELECT TOP 10 a.Name, YEAR(i.InvoiceDate), count(t.TrackId) FROM Artist a INNER JOIN Album al on a.ArtistId = al.ArtistId INN...

                                  I need to write a query to show the number of tracks sold per artist per year for the top 10 best selling artists. Take the overall top 10 for all years.

This is what I have:

    SELECT  TOP 10 a.Name, YEAR(i.InvoiceDate), count(t.TrackId)
    FROM Artist a 
    INNER JOIN Album al on a.ArtistId = al.ArtistId
    INNER JOIN Track t on al.AlbumId = t.AlbumId
    INNER JOIN InvoiceLine il on t.TrackId = il.TrackId
    INNER JOIN invoice i on  il.InvoiceId = i.InvoiceId
    Group by YEAR(i.InvoiceDate), a.Name
    ORDER by  count(t.TrackId) DESC;



I don't only the top 10 of all the tracks sold by on group and not by each year the top 10.
                                

joachim soetinck (1 rep)

Feb 11, 2019, 08:17 PM • Last activity: Feb 11, 2019, 11:38 PM

1 votes

1 answers

37 views

Top N earliest punched in Employees across departments in a month

oracle top

Find two of the earliest punched in Employees across departments in a month The dataset. Id EmpName EmpPunchInTime EmpDept 1 Dharma 26-JAN-2018 08:45 Engineering 2 Gumpina 26-JAN-2018 08:46 Finance 3 Kumar 26-JAN-2018 08:47 HR 4 Kamal 26-JAN-2018 08:48 Engineering 5 Muruges 26-JAN-2018 08:49 Finance...

                                  Find two of the earliest punched in Employees across departments in a month The dataset.

    Id   EmpName         EmpPunchInTime        EmpDept
    1      Dharma        26-JAN-2018 08:45     Engineering
    2      Gumpina       26-JAN-2018 08:46     Finance
    3      Kumar         26-JAN-2018 08:47     HR
    4      Kamal         26-JAN-2018 08:48     Engineering
    5      Muruges       26-JAN-2018 08:49     Finance
    6      Gumpina       25-JAN-2018 09:01     Finance
    7      Kumar         25-JAN-2018 08:42     HR
    8      Biju          25-JAN-2018 08:40     Engineering
    9      Divya         25-JAN-2018 08:49     HR
    10     Vali          24-JAN-2018 08:52     Engineering

The expected output is,

    Id   EmpName       EmpPunchInTime        EmpDept
    8      Biju          25-JAN-2018 08:40     Engineering
    1      Dharma        26-JAN-2018 08:45     Engineering
    2      Gumpina       26-JAN-2018 08:46     Finance
    5      Muruges       26-JAN-2018 08:49     Finance
    7      Kumar         25-JAN-2018 08:42     HR
    9      Divya         25-JAN-2018 08:49     HR

FYI I the query which I used

    SELECT *
    FROM (SELECT *
                ,DENSE_RANK() OVER ( PARTITION BY EmpDept ORDER BY EmpPunchInTime) RANK
          FROM EMPLOYEE_PUNCH) emp_pun
    WHERE RANK <= 2

I have tried to solve by using dense rank(partition by EmpDept, order by EmpPunchInTime) but am not seeing desired results.

*Note: If the same employee in a department punches early in most of the days(like Kumar of HR dept here in this dataset) then pick the earliest date among his dates and the employee who punches next to him.*
                                

Lakshmana Gumpina (13 rep)

Aug 1, 2018, 01:36 PM • Last activity: Aug 1, 2018, 02:32 PM

1 votes

1 answers

1048 views

Stored Procedure With an Optional Parameter Limiting Rows Returned if Specified

sql-server stored-procedures azure-sql-database top

How would one go about having a sproc that allows a consumer to **optionally** specify the number of rows returned? If no row count is specified, then all rows returned.

                                  How would one go about having a sproc that allows a consumer to **optionally** specify the number of rows returned? 

If no row count is specified, then all rows returned.

ΩmegaMan (409 rep)

Mar 28, 2018, 12:43 AM • Last activity: Mar 28, 2018, 01:52 AM

1 votes

2 answers

11700 views

How to select TOP 1 value from each column where value is not null

sql-server coalesce top

I am trying to select many columns from a table where there can be multiple rows with the same ID. Each row contains data from columns that I want to return in one row. Any time a column has a value other than NULL, I want to show that. If all values in that column are NULL, it is OK to show NULL. E...

                                  I am trying to select many columns from a table where there can be multiple rows with the same ID.  Each row contains data from columns that I want to return in one row.  Any time a column has a value other than NULL, I want to show that.  If all values in that column are NULL, it is OK to show NULL.

Example)

    SELECT * FROM Table1 WHERE UnitID='73355'

This returns multiple rows.  Each row contains data in columns that I want to merge together.  I don't want to keep my WHERE clause.  In this example I am just using it to narrow down my results for troubleshooting purposes.  I want every row to return in the table but merge the data where the UnitID's match.  There could be 1, 2, 3, or more rows returned with matching UnitID's.  Each Row will have data in it I will want merged into 1 row.

I tried TOP 1 but it only returns the first row.  I want TOP 1 but each column I want the value that is NOT NULL.

I have this now...If i put in my WHERE clause a specific UnitID, it works.  However, I want to put other columns in my WHERE clause and remove UnitID.  I want to filter based off LRUnitStatus and Model.  As soon as I do that, it does not work.

        SELECT UnitID, 
               MAX(LRUnitStatus) as 'Status', 
               MAX([UnitNumber]) as 'Unit #', 
               MAX(Make) as Make, MAX(Model) as Model, 
               MAX([ProductionYear]) as 'Year', 
               MAX(Length) as 'Length', 
               MIN([Rear Door Type]) as 'Rear Door', 
               MAX([Suspension Type]) as 'Suspension', 
               MAX([LocationBranchCode]) as 'Location', 
               MAX(HUBO) as 'HUBO',
               MIN([Wheel Type]) as 'Wheels', 
               MAX([Tire Size]) as 'Tires', 
               MAX([Floor Type]) as 'Floor', 
               MAX([Tire Inflation]) as 'Tire Inflation', 
               MAX([Side Skirts]) as 'Skirts', 
               MAX([Side Panels]) as 'Panels', 
               MAX([Roof Type]) as 'Roof', 
               MAX([Logistic Posts]) as 'Log. Posts',
               MAX([E-Track]) as 'ETrack', 
               MAX([Axle Spread]) as 'Axle'
        FROM   [ProfitMaster].[dbo].[vwUKTS_LR_jtg] LR
        WHERE  UnitID='73316'
        GROUP BY UnitID
                                

Jordon Griffith (11 rep)

Feb 7, 2018, 04:54 PM • Last activity: Feb 7, 2018, 10:21 PM

4 votes

2 answers

195 views

Access not properly translating TOP predicate to ODBC/Oracle SQL

optimization ms-access odbc top

I have a MS Access query that is based on a linked ODBC table (Oracle). The DDL for the underlying Oracle table is: create table road_insp ( insp_id integer, road_id integer, insp_date date, length number(10,2) ); INSERT INTO road_insp (insp_id, road_id, insp_date, length) VALUES (1, 100, to_date('1...

                                  I have a MS Access query that is based on a linked ODBC table (Oracle).

The DDL for the underlying Oracle table is:

    create table road_insp
    (
        insp_id integer,
        road_id integer,
        insp_date date,
        length number(10,2)
    );
    
    INSERT INTO road_insp (insp_id, road_id, insp_date, length)  VALUES (1, 100, to_date('1/1/2017','MM/DD/YY'), 20);
    INSERT INTO road_insp (insp_id, road_id, insp_date, length)  VALUES (2, 101, to_date('2/1/2017','MM/DD/YY'), 40);
    INSERT INTO road_insp (insp_id, road_id, insp_date, length)  VALUES (3, 101, to_date('3/1/2017','MM/DD/YY'), 60);
    INSERT INTO road_insp (insp_id, road_id, insp_date, length)  VALUES (4, 102, to_date('4/1/2018','MM/DD/YY'), 80);
    INSERT INTO road_insp (insp_id, road_id, insp_date, length)  VALUES (5, 102, to_date('5/1/2018','MM/DD/YY'), 100);
    INSERT INTO road_insp (insp_id, road_id, insp_date, length)  VALUES (6, 102, to_date('5/1/2018','MM/DD/YY'), 120);
    
    select * from road_insp
    
       INSP_ID    ROAD_ID   INSP_DAT     LENGTH
    ---------- ----------   -------- ----------
             1        100   17-01-01         20
             2        101   17-02-01         40
             3        101   17-03-01         60
             4        102   18-04-01         80

             5        102   18-05-01        100
             6        102   18-05-01        120


And the local MS Access Query  is:

    SELECT ri.*
    FROM user1_road_insp AS ri
    WHERE ri.insp_id = (
                        select 
                            top 1 ri2.insp_id   
                        from 
                            user1_road_insp ri2 
                        where 
                            ri2.road_id = ri.road_id 
                            and year(insp_date) between  [Enter a START year:] and [Enter a END year:]
                        order by 
                            ri2.insp_date desc, 
                            ri2.length desc,
                            ri2.insp_id
                       );

**The Problem:**

The performance of the query is quite poor. Unlike the table in the sample DDL, the real table has about 10,000 records. Similar queries execute instantly, but this query takes several minutes to run.

I've checked the SQLOut.txt log , and the problem seems to be that it is executing way too many individual statements:


    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri" 
    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri2" WHERE ({fn year("INSP_DATE" )}BETWEEN  ? AND  ? ) 
    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri2" WHERE ({fn year("INSP_DATE" )}BETWEEN  ? AND  ? ) 
    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri2" WHERE ({fn year("INSP_DATE" )}BETWEEN  ? AND  ? ) 
    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri2" WHERE ({fn year("INSP_DATE" )}BETWEEN  ? AND  ? ) 
    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri2" WHERE ({fn year("INSP_DATE" )}BETWEEN  ? AND  ? ) 
    SQLExecDirect: SELECT "INSP_ID" ,"ROAD_ID" ,"INSP_DATE" ,"LENGTH"  FROM "USER1"."ROAD_INSP" "ri2" WHERE ({fn year("INSP_DATE" )}BETWEEN  ? AND  ? )

 

**Question:**


Unlike my other local MS Access queries, this query uses the TOP predicate. MS Access seems to be attempting to translate TOP to SQL that the ODBC driver and/or Oracle can understand. However, it seems to be doing a very poor job of it.

How can I improve the performance of this query?

                                

User1974 (1527 rep)

Oct 26, 2017, 07:56 PM • Last activity: Nov 7, 2017, 12:37 AM

21 votes

1 answers

29923 views

Why is "select " faster than "select top 500 " in SQL Server?

sql-server performance select execution-plan top query-performance

I have a view, `complicated_view` -- there are a few joins and where clauses. Now, select * from complicated_view (9000 records) is faster, *much faster*, than select top 500 * from complicated_view We're talking 19 seconds vs. 5+ minutes. The first query returns all 9000 records. How is only pickin...

                                  I have a view, complicated_view -- there are a few joins and where clauses. Now,

    select * from complicated_view (9000 records)

is faster, *much faster*, than

    select top 500 * from complicated_view

We're talking 19 seconds vs. 5+ minutes.

The first query returns all 9000 records. How is only picking up the top 500 ridiculously longer?

Obviously, I'm going to look at the execution plans here ---- but once I figure out *why* SQL Server is running the "top 500" in suboptimal fashion, how do I actually tell it to run the plan the quick way, like taking the full table?

Of course, I may have to rewrite the view entirely --- but quite odd.

Basically, I'm connecting this data table to a 3rd party software that pre-checks tables with a default select top 500 * query that cannot be modified. So other than dumping this view into an actual table (quite sloppy) - I can't get around their "top 500" addendum either.

This is SQL Server 2012.

EDIT: Disagree on the duplicate flag. The other question, the top was FASTER than all. This would be the EXPECTED behavior, returning less rows. My case is the opposite. Also, my understanding is that Top 100 is a different algorithm than Top 100+. I don't even think the duplicate question has the correct answer. Which is, the TOP X query will SORT potentially massive tables very early on, not AFTER they are aggregated/ filtered/ etc. The why is a mystery, but the how is plainly there.

user45867 (1739 rep)

Oct 17, 2017, 10:00 PM • Last activity: Oct 20, 2017, 03:48 AM

17 votes

1 answers

7803 views

SELECT TOP 1 from a very large table on an index column is very slow, but not with reverse order ("desc")

sql-server sql-server-2014 performance select top query-performance

We have a large database, about 1TB, running SQL Server 2014 on a powerful server. Everything worked fine for a few years. About 2 weeks ago, we did a full maintenance, which included: Install all software updates; rebuild all indexes and compact DB files. However, we did not expect that at certain...

                                  We have a large database, about 1TB, running SQL Server 2014 on a powerful server. Everything worked fine for a few years.  About 2 weeks ago, we did a full maintenance, which included: Install all software updates; rebuild all indexes and compact DB files. However, we did not expect that at certain stage the DB's CPU usage increased by over 100% to 150% when the actual load was the same. 

After a lot of troubleshooting, we have narrowed it down to a very simple query, but we could not find a solution. The query is extremely simple:

    select top 1 EventID from EventLog with (nolock) order by EventID

It always takes about 1.5 seconds! However, a similar query with "desc" always takes about 0 ms:

    select top 1 EventID from EventLog with (nolock) order by EventID desc

PTable has about 500 million rows; EventID is the primary clustered index column (ordered ASC) with the data type of bigint (Identity column). There are multiple threads inserting data into the table at the top (larger EventIDs), and there is 1 thread deleting data from the bottom (smaller EventIDs). 

In SMSS, we verified that the two queries always use the same execution plan:

 - Clustered index scan;

 - Estimated and actual row numbers are both 1; 

 - Estimated and actual number of executions are both 1;

 - Estimate I/O cost is 8500 (Seems to be high)

 - If run consecutively, the Query cost is the same 50% for both.

I updated index statistics with fullscan, the problem persisted; I rebuilt the index again, and the problem seemed to be gone for half a day, but came back. 

I turned on IO statistics with:

    set statistics io on

then ran the two queries consecutively and found the following info:

(For the first query, the slow one)  
> Table 'PTable'. Scan count 1, logical reads 407670, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(For the second query, the fast one)  
> Table 'PTable'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Note the huge difference in logical reads. The index is used in both cases. 

Index fragmentation could explain a little bit, but I believe the impact is very small; and the problem never happened before. Another proof is if I run a query like:

    select * from EventLog with (nolock) where EventID=xxxx   

Even if I set xxxx to the smallest EventIDs in the table, the query is always lightning fast.

We checked and there is no locking/blocking issue.

Note: I just tried to simplify the issue above. The "PTable" is actually "EventLog"; the PID is EventID.

I get the same result testing without the NOLOCK hint.

Can anybody help?

More detailed query execution plans in XML as follows:

https://www.brentozar.com/pastetheplan/?id=SJ3eiVnob 

https://www.brentozar.com/pastetheplan/?id=r1rOjVhoZ 

I don't think it matters to provide the create table statement. It is an old database and has been running perfectly fine for a long time until the maintenance.  We have done a lot of research ourselves and narrowed it down to the info provided in my question.

The table was created normally with the EventID column as the primary key, which is an identity column of type bigint. At this time, I guess the problem is with the index fragmentation. Right after index rebuild, the problem seemed to be gone for half a day; but why it came back so quickly...?

TiffanyP (171 rep)

Sep 29, 2017, 05:40 PM • Last activity: Oct 6, 2017, 05:21 PM

9 votes

3 answers

30580 views

DISTINCT on one column and return TOP rows

sql-server distinct top

How do you query for three unique customers with the largest `Purchase_Cost`? I want to apply the `DISTINCT` only on `Customer_Name`, but the query below applies the distinct on all three columns. How should I modify the query to obtain the desired output? SELECT DISTINCT TOP 3 customer_name, order_...

                                  How do you query for three unique customers with the largest Purchase_Cost?

I want to apply the DISTINCT only on Customer_Name, but the query below applies the distinct on all three columns. How should I modify the query to obtain the desired output?

    SELECT DISTINCT TOP 3 customer_name, order_no, Purchase_Cost  
    FROM PurchaseTable 
    ORDER BY Purchase_Cost

Neal (93 rep)

May 5, 2017, 03:43 PM • Last activity: Jun 29, 2017, 12:27 PM

2 votes

1 answers

2751 views

Using WITH TIES to retrieve multiple rows with same value

sqlite top

I am relatively new to SQLite. I have a database called "Clothes Catalog" that contains info on each data point's item type (shirts, pants, etc), catalog ID, and price. I want to retrieve the maximum price of an item of each type, the item type, and the catalog ID. If two items within a type have th...

                                  I am relatively new to SQLite. I have a database called "Clothes Catalog" that contains info on each data point's item type (shirts, pants, etc), catalog ID, and price. 

I want to retrieve the maximum price of an item of each type, the item type, and the catalog ID. If two items within a type have the same price, I want both catalog ID's to be included.

I tried this query but it does not run, any ideas why?


      SELECT TOP (1) WITH TIES itemtype, catalogID, price
      from clothes_catalog
      group by itemtype
      order by catalogID ASC;

Lets say my table looks like this:
       
     shirt, 20456, $16
     shirt, 23456, $18
     pant, 2222, $20
     pant, 4444, $20
     pant, 5656, $15

I want the following to show, and ALSO have catalogID in ascending order:

     pant, 2222, $20
     pant, 4444, $20
     shirt, 23456, $18

any suggestions?

                                

SRag (21 rep)

Apr 26, 2017, 09:55 PM • Last activity: Apr 27, 2017, 11:40 AM

Showing page 1 of 20 total questions