Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
5
votes
2
answers
8683
views
How to deal with large offsets in select?
Table `jtest` has 200k rows; each row contains jsonb of the form `{ "id": "<uuid>", "key": <n> }`, where `<n>` is an integer from 1 to 200k, incremented per row. There is also a btree index on `data->'key'`.
create extension if not exists pgcrypto;
create table jtest (data jsonb not null default '{}');
insert into jtest (data)
select json_build_object('id', gen_random_uuid(), 'key', i)::jsonb
FROM generate_series(1,200000) i;
create index jtest_key on jtest ((data->'key'));
First query (fast):
EXPLAIN ANALYZE
select j.data
from jtest j
order by j.data->'key'
limit 20;
-- "Limit (cost=0.42..1.43 rows=20 width=74) (actual time=0.023..0.044 rows=20 loops=1)"
-- " -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.023..0.039 rows=20 loops=1)"
-- "Planning time: 0.082 ms"
-- "Execution time: 0.066 ms"
Second query with large offset (slow):
EXPLAIN ANALYZE
select j.data
from jtest j
order by j.data->'key'
offset 100000
limit 20;
-- "Limit (cost=5075.42..5076.44 rows=20 width=74) (actual time=112.861..112.896 rows=20 loops=1)"
-- " -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.027..106.378 rows=100020 loops=1)"
-- "Planning time: 0.105 ms"
-- "Execution time: 112.931 ms"
In PG docs I found:
> The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.
But they don't say anything about how to deal with large offsets. How can I improve this query? Also, is this behaviour (inefficiency for large offsets) common to all RDBMSs or just PostgreSQL (I'm using 9.4)?
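One common way around a large OFFSET is keyset ("seek") pagination, sketched below for this schema; it assumes the client remembers the last `key` value it received (the literal 100000 is a placeholder for that value) and reuses the existing btree index on `(data->'key')`.

```sql
-- Keyset pagination sketch: filter on the last seen key instead of skipping rows.
-- '100000' stands in for the highest key returned by the previous page.
SELECT j.data
FROM jtest j
WHERE j.data->'key' > '100000'::jsonb   -- jsonb number comparison, served by the btree index
ORDER BY j.data->'key'
LIMIT 20;
```

This keeps each page at roughly the cost of the first query regardless of page depth, at the price of not being able to jump to an arbitrary page number.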
user606521
(1415 rep)
Aug 23, 2015, 06:48 PM
• Last activity: Mar 4, 2025, 06:09 AM
2
votes
2
answers
1494
views
Why is OFFSET / FETCH causing performance issues with queries in MS SQL 2019?
I have this select query:
SELECT ITEM.ID ITEM_ID,
ITEM.CODE ITEM_CODE,
ITEM.DESCRIPTION ITEM_NAME,
ITEM.CODE_CATEGORY_ID CATEGORY_ID,
ITEM_CAT.CODE CATEGORY_CODE,
ITEM_CAT.NAME CATEGORY_NAME
FROM B1.SP_ITEM_MF ITEM
INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
AND ITEM_CAT.RECORD_TYPE = 'CD'
AND ITEM_CAT.ACTIVE = 'Y'
INNER JOIN [A1].SP_REQ_CATEGORY_MAPPING MAP
ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID
AND REQ_CAT.RECORD_TYPE = 'RQ'
AND REQ_CAT.ACTIVE = 'Y'
AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY
ITEM.CODE
OFFSET 20 ROWS
FETCH NEXT 20 ROWS ONLY;
This returns 20 records and takes more than **5 minutes** to complete.
When I remove the `FETCH NEXT 20 ROWS ONLY`, it returns 2000 records and completes in effectively **0 seconds**.
Also, if I rewrite the query to insert the result set into a temp table (added below) and then apply `OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY` to that result set, it also completes in **0 seconds**.
CREATE TABLE #TempPagination (
RowNum INT,
ITEM_ID BIGINT,
ITEM_CODE VARCHAR(64),
ITEM_NAME VARCHAR(256),
CATEGORY_ID INT,
CATEGORY_CODE VARCHAR(8),
CATEGORY_NAME VARCHAR(64)
);
INSERT INTO #TempPagination ( ITEM_ID, ITEM_CODE, ITEM_NAME, CATEGORY_ID, CATEGORY_CODE, CATEGORY_NAME)
SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM B1.SP_ITEM_MF ITEM
     INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
         ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
        AND ITEM_CAT.RECORD_TYPE = 'CD'
        AND ITEM_CAT.ACTIVE = 'Y'
     INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP
         ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
     INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
         ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID
        AND REQ_CAT.RECORD_TYPE = 'RQ'
        AND REQ_CAT.ACTIVE = 'Y'
        AND REQ_CAT.CODE IN ('CRW', 'SCHF');
SELECT
ITEM_ID,
ITEM_CODE,
ITEM_NAME,
CATEGORY_ID,
CATEGORY_CODE,
CATEGORY_NAME
FROM
#TempPagination
order by ITEM_CODE
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY;
DROP TABLE #TempPagination;
Can someone explain why this is happening? I have indexes on the columns, and the version I am using is SQL Server 2019.
1. Why does the query take more time without the temp table, but not when the result set is inserted into a temp table first?
2. Why is `FETCH NEXT 20 ROWS ONLY` causing so much performance degradation?
---
Here are the plans for both queries; I changed the 20 rows to 1 row:
* https://www.brentozar.com/pastetheplan/?id=H1_VmgyAa
* https://www.brentozar.com/pastetheplan/?id=ByLhQeJAp
This is a big database, and there are too many indexes on each table (at least 20 per table, added over time for different queries) to list. Even if you can't pinpoint the issue, a general understanding of what might be causing the performance problem would be helpful.
The first query performs 17,412,610 physical reads and 52,697,525 logical reads; the rewritten query performs 7 and 10,438 respectively.
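A plausible mitigation to experiment with, assuming the regression comes from the optimizer's row goal (the small FETCH makes it prefer a plan that looks cheap for 20 rows but reads the tables in a bad order); this is a sketch, not a diagnosis of the posted plans:

```sql
SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM B1.SP_ITEM_MF ITEM
     INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
         ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
        AND ITEM_CAT.RECORD_TYPE = 'CD' AND ITEM_CAT.ACTIVE = 'Y'
     INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP
         ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
     INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
         ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID
        AND REQ_CAT.RECORD_TYPE = 'RQ' AND REQ_CAT.ACTIVE = 'Y'
        AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY ITEM.CODE
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
-- Ask the optimizer to plan as if all rows were needed (SQL Server 2016 SP1 and later):
OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL'));
```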
Subin Benny
(77 rep)
Mar 13, 2024, 08:41 AM
• Last activity: Mar 13, 2024, 03:25 PM
0
votes
1
answers
326
views
'Limit INT OFFSET @variable' throws a runtime error in MySQL Server
I'm trying to solve this problem on HackerRank.
The problem asks for the median of a specific column.
My code:
SET @VAE = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = ROUND(@VAE/2)-1;
SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT @VAE1;
But I get a syntax error.
When I write the query as follows, it runs OK, but I don't want it that way:
SET @VAE = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = ROUND(@VAE/2)-1;
SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT 5;
What is the problem?
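For what it's worth, a sketch of the usual workaround: in a plain statement MySQL only accepts literals (or stored-routine parameters) in LIMIT, but a prepared statement can take the variable through a placeholder.

```sql
SET @VAE  = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = CAST(ROUND(@VAE / 2) - 1 AS SIGNED);  -- cast so the placeholder is an integer

-- LIMIT cannot reference a user variable directly, but it can take a
-- prepared-statement placeholder:
PREPARE stmt FROM 'SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT ?';
EXECUTE stmt USING @VAE1;
DEALLOCATE PREPARE stmt;
```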
ahmad alghadban
(1 rep)
Jun 14, 2023, 09:59 AM
• Last activity: Nov 12, 2023, 09:32 AM
2
votes
1
answers
1285
views
What's the difference between FETCH FIRST and FETCH NEXT in PostgreSQL?
What's the difference between `FETCH FIRST` and `FETCH NEXT` in PostgreSQL? The docs simply say
> NEXT
> Fetch the next row
>
> FIRST
> Fetch the first row of the query
which isn't much of an explanation. I created a sample table with sample rows, executed
SELECT * from users
FETCH FIRST 2 ROWS ONLY;
SELECT * from users
FETCH NEXT 2 ROWS ONLY;
and got exactly the same output. It's almost as if `FETCH FIRST` and `FETCH NEXT` are merely synonyms for `LIMIT` (which, I read, is not part of the SQL standard, unlike `FETCH [FIRST|NEXT]`).
If the two are the same, why wouldn't the documentation say so? Or are there some differences after all?
This Stack Overflow answer implies they may function the same way in at least some RDBMSes, specifically in MS SQL Server.
Sergey Zolotarev
(243 rep)
Nov 5, 2023, 11:23 AM
• Last activity: Nov 6, 2023, 03:54 PM
5
votes
1
answers
3911
views
Difference between LIMIT N and FETCH NEXT N ROWS?
The two queries below seem equivalent. Is one better than the other? e.g. in terms of performance.
select * from some_table offset 10 limit 10;
select * from some_table offset 10 fetch next 10 rows;
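As a point of reference (a sketch, not the poster's exact statement): PostgreSQL requires the trailing `ONLY` (or, from version 13, `WITH TIES`) in the FETCH clause, and with that spelling the two forms are handled by the same limit machinery, so the plans should come out identical:

```sql
SELECT * FROM some_table OFFSET 10 LIMIT 10;
SELECT * FROM some_table OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
```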
James Hiew
(183 rep)
Mar 1, 2019, 02:19 PM
• Last activity: Jun 2, 2023, 02:24 PM
0
votes
1
answers
833
views
Why does my LIMITed SELECT fetch the last record of the table, not the first?
I read that to fetch the first table record, you may use `LIMIT 1`. I tried to apply that knowledge to my `JOIN` statement and got a diametrically opposite result. It joined the *last* record of the joined table.
SELECT * FROM users;
| user_id | username | [more columns] |
|--------:|----------:|--------:|
| 1 | mickey_m | [...] |
| [...] | [...] | [...] |
| 6 | daisy_d | [...] |
SELECT * FROM roles;
| role_id | role |
|--------:|----------:|
| 1 | USER |
| 2 | ADMIN |
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT users.id, users.username, roles.id, roles.role
FROM users
JOIN roles ON roles.role = 'ADMIN' LIMIT 1;
SELECT * FROM user_role;
| user_id | username | role_id | role |
|--------:|----------:|--------:|------:|
| 6 | daisy_d | 2 | ADMIN |
**Why did it happen?** It seems unnatural for `LIMIT` to limit the result set to the last record as opposed to the first one.
Even this
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT first_row.id, first_row.username, roles.id, roles.role
FROM (SELECT id, username FROM users LIMIT 1) AS first_row
JOIN roles ON roles.role = 'ADMIN';
SELECT * FROM user_role;
produces the same result.
In both cases, I expected to get this instead:
| user_id | username | role_id | role |
|--------:|----------:|--------:|------:|
| 1 | mickey_m | 2 | ADMIN |
Does `SELECT` select from the bottom up in some cases? It seems so, because plain `SELECT`s like this one
SELECT * FROM users LIMIT 1;
do indeed retrieve the first row. However, if you specify columns
SELECT id, username FROM users LIMIT 1;
you observe the same issue (the last row gets retrieved instead).
Sage (a chat bot) says that when it comes to `LIMIT`, no order is guaranteed, so I should use `ORDER BY` (it does help, by the way). Is it right? Does `LIMIT 1` limit the result set to a random row that satisfies the criteria? Then why is it always the last one? The likelihood of that is infinitesimally small.
> The behavior you are observing is likely due to the use of the LIMIT clause. When you use LIMIT without an explicit ORDER BY clause, the database will return an arbitrary set of rows that satisfy the query conditions, but the order of the rows is not guaranteed. In other words, the database may return the first row, the last row, or any other row that satisfies the query conditions.
>
> *–Sage*
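A minimal sketch of the deterministic form (assuming "first" means the smallest `id`): without an `ORDER BY`, `LIMIT 1` may return any qualifying row, so ordering inside the derived table pins it down.

```sql
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT u.id, u.username, r.id, r.role
FROM (SELECT id, username FROM users ORDER BY id LIMIT 1) AS u  -- deterministic "first" row
JOIN roles r ON r.role = 'ADMIN';
```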
Sergey Zolotarev
(243 rep)
Apr 18, 2023, 06:50 AM
• Last activity: Apr 19, 2023, 09:40 AM
1
votes
0
answers
1685
views
Best way to implement Pagination in SQL Server
I have a complex stored procedure that takes a customerId and some other parameters and returns thousands of rows. We want to implement pagination in this SP since the number of rows being returned is pretty huge. There is a main table that has all the product information (more than 100 million rows) and then I have to join it to other tables to get more information about the product.
`ProductId` is the primary key in the `Product` table. We have a non-clustered index on `date` with `productId` in the INCLUDE section. There are two approaches that I have in mind.
First Method:
Step 1: Get all the rows for the product id, order by date, and store them in a temp table (#temp1) that has an identity column as the primary key.
Step 2: Pick the rows based on `@first` and `@last` parameters, where `@first = (@page - 1) * @display` and `@last = (@page * @display) + 1`, from the temp table created in the previous step and store them in another temp table (#temp2).
Step 3: Join the rows from #temp2 to all the other tables and get the required data.
/* @display and @page are passed in along with other input parameters */
Declare @first Int = (@page - 1) * @display
Declare @last Int = (@page * @display) + 1
Create Table #temp1 (Id int identity(1,1) primary key, productId uniqueidentifier)
Create Table #temp2 (productId uniqueidentifier)
Insert Into #temp1
Select productId From dbo.product Order by date
Insert into #temp2
Select productId From #temp1 Where id > @first And id < @last
Second Method:
Step 1: Get the rows for the product id ordered by date and have an offset and fetch by clause and store them in a temp table (#temp1).
Step 2: Join the rows from #temp1 to all the other tables and get the required data.
/* @display and @page are passed in along with other input parameters */
Declare @first Int = (@page - 1) * @display
Create Table #temp1 (productId uniqueidentifier)
Insert Into #temp1
Select productId From dbo.product Order by date
Offset @first rows fetch next @display rows only
Select * From #temp1
Join some other product tables on product id
With method 1, I'll have to load all the data every time the procedure is called and then filter the rows that I need.
With method 2, I only fetch the rows I'm interested in, but I've read online that OFFSET and FETCH can actually slow the query down a lot. Which is usually the best approach for pagination in SQL Server? Of the two methods I described, which one would be the better option, or is there a better way? Thanks!
lifeisajourney
(751 rep)
Aug 31, 2022, 03:49 PM
1
votes
0
answers
1034
views
limit user's SELECT query to a fixed number of rows
I have read the following question:
https://dba.stackexchange.com/questions/233307/limit-postgres-results-for-a-user
and also noted the recommendation of limiting a user's statement_timeout; however, I would like something a little more consistent.
I noticed one poster on that question mentions that the 'row level security' feature would allow someone to enforce this.
I have read some tutorials that use row-level security to limit what the user can see by enforcing some column relationship, but I don't know how to extend this to the case of limiting rows, since it does not impose any column value constraint.
I also saw this about using FETCH_COUNT:
https://stackoverflow.com/questions/30369100/how-to-handle-large-result-sets-with-psql
As an _alternative_ (key word here): if row-level security does not allow us to limit the rows returned per query for a user, it seems a combination of FETCH_COUNT and statement_timeout for a specific user may achieve my goal.
Is there a way to set FETCH_COUNT for a user account without having them specify it in their client/session?
Thanks in advance.
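For the statement_timeout half, a sketch of a server-side, per-role default (the role name is hypothetical); FETCH_COUNT, by contrast, is a psql client variable, so it cannot be forced on the account from the server side:

```sql
-- Applies to every new session of this role, no client cooperation needed:
ALTER ROLE reporting_reader SET statement_timeout = '5s';
```

This caps runtime rather than row count, so it is only an approximation of a per-query row limit.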
gagan
(11 rep)
Jan 17, 2022, 01:21 AM
5
votes
2
answers
2856
views
Why doesn't an OFFSET query benefit more from the index?
I've got a simple example `books` table with an integer `id` primary key (`"books_pkey" PRIMARY KEY, btree (id)`) and 100,000,000 random rows. If I run:
EXPLAIN
SELECT *
FROM books
ORDER BY id
OFFSET 99999999
LIMIT 1;
I see a query plan like:
Limit (cost=3137296.54..3137296.57 rows=1 width=14)
-> Index Scan using books_pkey on books (cost=0.57..3137296.57 rows=100000000 width=14)
Do I understand correctly that PostgreSQL is loading 100000000 rows into memory, only for the `OFFSET` to discard all but 1? If so, why can't it do the "load and discard" step using the index and only load one row into memory?
I understand that the typical solution to this is to use keyset pagination - to say `WHERE id > x`. I'm just trying to understand why an index alone doesn't solve this. Adding another index which is explicitly sorted the same way as this query (`CREATE INDEX books_id_ordered ON books (id ASC)`) makes no difference to `EXPLAIN`.
Nathan Long
(1005 rep)
Aug 13, 2021, 05:54 PM
• Last activity: Aug 16, 2021, 07:22 PM
1
votes
1
answers
867
views
LIMIT and OFFSET a complex query
I have a query generated by an ORM (Sequelize). I ran into an issue where Sequelize fails; see these issues:
https://github.com/sequelize/sequelize/issues/7344
https://github.com/sequelize/sequelize/issues/12200
Postgres Query:
```sql
SELECT "feeds"."id",
"feeds"."title",
"feeds"."likes",
"feeds"."description",
"feeds"."files",
"feeds"."allowComments",
"feeds"."readConfirmation",
"feeds"."isDraft",
"feeds"."createdAt",
"feeds"."updatedAt",
"feeds"."companyId",
"feeds"."createdById",
"reads"."id" AS "reads.id",
"reads->feeds_reads"."createdAt" AS "reads.feeds_reads.createdAt",
"reads->feeds_reads"."updatedAt" AS "reads.feeds_reads.updatedAt",
"reads->feeds_reads"."feedId" AS "reads.feeds_reads.feedId",
"reads->feeds_reads"."userId" AS "reads.feeds_reads.userId",
"createdBy"."id" AS "createdBy.id",
"createdBy"."firstName" AS "createdBy.firstName",
"createdBy"."jobTitle" AS "createdBy.jobTitle",
"createdBy"."lastName" AS "createdBy.lastName",
"createdBy"."profilePicture" AS "createdBy.profilePicture",
"bookmarks"."id" AS "bookmarks.id",
"bookmarks->feeds_bookmarks"."createdAt" AS "bookmarks.feeds_bookmarks.createdAt",
"bookmarks->feeds_bookmarks"."updatedAt" AS "bookmarks.feeds_bookmarks.updatedAt",
"bookmarks->feeds_bookmarks"."feedId" AS "bookmarks.feeds_bookmarks.feedId",
"bookmarks->feeds_bookmarks"."userId" AS "bookmarks.feeds_bookmarks.userId",
"units"."id" AS "units.id",
"units"."parentId" AS "units.parentId",
"units->feeds_units"."createdAt" AS "units.feeds_units.createdAt",
"units->feeds_units"."updatedAt" AS "units.feeds_units.updatedAt",
"units->feeds_units"."feedId" AS "units.feeds_units.feedId",
"units->feeds_units"."unitId" AS "units.feeds_units.unitId",
"units->users"."id" AS "units.users.id",
"units->users->users_units"."createdAt" AS "units.users.users_units.createdAt",
"units->users->users_units"."updatedAt" AS "units.users.users_units.updatedAt",
"units->users->users_units"."userId" AS "units.users.users_units.userId",
"units->users->users_units"."unitId" AS "units.users.users_units.unitId",
"units->descendents"."id" AS "units.descendents.id",
"units->descendents"."parentId" AS "units.descendents.parentId",
"units->descendents->unitsancestor"."unitsId" AS "units.descendents.unitsancestor.unitsId",
"units->descendents->unitsancestor"."ancestorId" AS "units.descendents.unitsancestor.ancestorId",
"units->descendents->users"."id" AS "units.descendents.users.id",
"units->descendents->users->users_units"."createdAt" AS "units.descendents.users.users_units.createdAt",
"units->descendents->users->users_units"."updatedAt" AS "units.descendents.users.users_units.updatedAt",
"units->descendents->users->users_units"."userId" AS "units.descendents.users.users_units.userId",
"units->descendents->users->users_units"."unitId" AS "units.descendents.users.users_units.unitId",
"teams"."id" AS "teams.id",
"teams->feeds_teams"."createdAt" AS "teams.feeds_teams.createdAt",
"teams->feeds_teams"."updatedAt" AS "teams.feeds_teams.updatedAt",
"teams->feeds_teams"."feedId" AS "teams.feeds_teams.feedId",
"teams->feeds_teams"."teamId" AS "teams.feeds_teams.teamId",
"teams->peoples->teams_users"."createdAt" AS "teams.peoples.teams_users.createdAt",
"teams->peoples->teams_users"."updatedAt" AS "teams.peoples.teams_users.updatedAt",
"teams->peoples->teams_users"."userId" AS "teams.peoples.teams_users.userId",
"teams->peoples->teams_users"."teamId" AS "teams.peoples.teams_users.teamId",
"comments"."text" AS "comments.text",
"comments"."id" AS "comments.id",
"comments"."likes" AS "comments.likes",
"comments"."parentId" AS "comments.parentId",
"comments"."createdById" AS "comments.createdById",
"comments"."createdAt" AS "comments.createdAt",
"comments"."updatedAt" AS "comments.updatedAt",
"comments->createdBy"."id" AS "comments.createdBy.id",
"comments->createdBy"."firstName" AS "comments.createdBy.firstName",
"comments->createdBy"."lastName" AS "comments.createdBy.lastName",
"comments->createdBy"."jobTitle" AS "comments.createdBy.jobTitle",
"comments->createdBy"."profilePicture" AS "comments.createdBy.profilePicture",
"peoples->feeds_peoples"."createdAt" AS "peoples.feeds_peoples.createdAt",
"peoples->feeds_peoples"."updatedAt" AS "peoples.feeds_peoples.updatedAt",
"peoples->feeds_peoples"."feedId" AS "peoples.feeds_peoples.feedId",
"peoples->feeds_peoples"."userId" AS "peoples.feeds_peoples.userId"
FROM "feeds" AS "feeds"
LEFT OUTER JOIN (
"feeds_reads" AS "reads->feeds_reads"
INNER JOIN "users" AS "reads" ON "reads"."id" = "reads->feeds_reads"."userId"
) ON "feeds"."id" = "reads->feeds_reads"."feedId"
LEFT OUTER JOIN "users" AS "createdBy" ON "feeds"."createdById" = "createdBy"."id"
LEFT OUTER JOIN (
"feeds_bookmarks" AS "bookmarks->feeds_bookmarks"
INNER JOIN "users" AS "bookmarks" ON "bookmarks"."id" = "bookmarks->feeds_bookmarks"."userId"
) ON "feeds"."id" = "bookmarks->feeds_bookmarks"."feedId"
LEFT OUTER JOIN (
"feeds_units" AS "units->feeds_units"
INNER JOIN "units" AS "units" ON "units"."id" = "units->feeds_units"."unitId"
) ON "feeds"."id" = "units->feeds_units"."feedId"
LEFT OUTER JOIN (
"users_units" AS "units->users->users_units"
LEFT OUTER JOIN "users" AS "units->users" ON "units->users"."id" = "units->users->users_units"."userId"
) ON "units"."id" = "units->users->users_units"."unitId"
LEFT OUTER JOIN (
"unitsancestor" AS "units->descendents->unitsancestor"
LEFT OUTER JOIN "units" AS "units->descendents" ON "units->descendents"."id" = "units->descendents->unitsancestor"."unitsId"
) ON "units"."id" = "units->descendents->unitsancestor"."ancestorId"
LEFT OUTER JOIN (
"users_units" AS "units->descendents->users->users_units"
LEFT OUTER JOIN "users" AS "units->descendents->users" ON "units->descendents->users"."id" = "units->descendents->users->users_units"."userId"
) ON "units->descendents"."id" = "units->descendents->users->users_units"."unitId"
LEFT OUTER JOIN (
"feeds_teams" AS "teams->feeds_teams"
INNER JOIN "teams" AS "teams" ON "teams"."id" = "teams->feeds_teams"."teamId"
) ON "feeds"."id" = "teams->feeds_teams"."feedId"
LEFT OUTER JOIN (
"teams_users" AS "teams->peoples->teams_users"
INNER JOIN "users" AS "teams->peoples" ON "teams->peoples"."id" = "teams->peoples->teams_users"."userId"
) ON "teams"."id" = "teams->peoples->teams_users"."teamId"
LEFT OUTER JOIN "comments" AS "comments" ON "feeds"."id" = "comments"."feedId"
LEFT OUTER JOIN "users" AS "comments->createdBy" ON "comments"."createdById" = "comments->createdBy"."id"
LEFT OUTER JOIN (
"feeds_peoples" AS "peoples->feeds_peoples"
INNER JOIN "users" AS "peoples" ON "peoples"."id" = "peoples->feeds_peoples"."userId"
) ON "feeds"."id" = "peoples->feeds_peoples"."feedId"
WHERE (
"peoples"."id" = 11
OR "feeds"."createdById" = 11
OR "teams->peoples"."id" = 11
OR "units->users"."id" = 11
OR "units->descendents->users"."id" = 11
)
AND "feeds"."companyId" = 4
AND "feeds"."isDraft" = false
AND "feeds"."createdAt" < '2020-12-09 12:59:34.017 +00:00'
LIMIT 20;
```
Here the LIMIT is not applying to the feeds; I want 20 feeds, but it's giving me the same feed 20 times.
Raman Shekhawat
(113 rep)
Dec 10, 2020, 07:24 AM
• Last activity: Dec 10, 2020, 07:35 AM
0
votes
0
answers
21
views
Analysis of Conditions with no results and high offset number - PostgreSQL JSONB
I have been asked to record an analysis of PostgreSQL JSONB queries that use conditions with a high offset number, and of queries that return no results, considering the following.
select
*
from
sub_test df,
jsonb_array_elements(json_data) jba
where
(jba->>'name')::text = 'SubYear'
and (jba->>'value')::numeric > '2050';
1. SubYear > 2200 (has no results): what does the query plan look like, and are full table scans necessary with JSONB?
2. SubYear > 1970 (has a lot of results) and offset 30k: what does the query plan look like, and are full table scans necessary with JSONB?
Note that there are about 35k+ records in the table "sub_test", with 500+ keys in the JSONB column "json_data".
Could you help me answer these?
Lingamoorthy Bheeman Balan
(1 rep)
Oct 5, 2020, 07:20 AM
• Last activity: Oct 5, 2020, 07:26 AM
4
votes
1
answers
4670
views
MSSQL efficient paging including a join on big tables
We have a table with products and a table with the current selection/assortment of products. We now need to join them to create an export file via some .NET code.
Currently we are using a query like this:
SELECT * FROM ASSORTMENT a
INNER JOIN PRODUCTS p on a.clientID = 38 and a.productID = p.productID
ORDER BY a.ID_ROW OFFSET 100000 ROWS FETCH NEXT 1000 ROWS ONLY
The problem is that the assortment can contain discontinued products that no longer exist in the `products` table, so the join is needed to verify them. We also need exactly the same number of rows per select each time, e.g. 1000.
The query gets slower as the offset gets higher because it does a full join of the products and assortment tables each time. With 3 million products and an assortment of 200,000 articles, it takes multiple seconds to fetch the next 1000 rows once the offset goes above 50,000.
Is there a more efficient way of doing this?
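One option is keyset ("seek") pagination, sketched below under the assumption that `ID_ROW` is unique and ascending and that the exporter can carry the last `ID_ROW` it received from one batch to the next; each batch then starts at that key's index position instead of re-scanning all the skipped rows.

```sql
DECLARE @lastIdRow bigint = 0;   -- placeholder: highest ID_ROW from the previous batch

SELECT TOP (1000) a.*, p.*
FROM ASSORTMENT a
     INNER JOIN PRODUCTS p
         ON p.productID = a.productID
WHERE a.clientID = 38
  AND a.ID_ROW > @lastIdRow
ORDER BY a.ID_ROW;
```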
Svn342
(41 rep)
Dec 31, 2017, 10:38 PM
• Last activity: May 12, 2020, 03:13 PM
1
votes
1
answers
960
views
Distinct result with row_id for limit query
There are 2 tables:
CREATE TABLE sample1 (
id int,
name varchar,
score int,
mode int
)
CREATE TABLE sample2 (
id int,
mode int
...
)
I need to select distinct `name` with `MIN(score)` and `ROW_NUMBER()`. The result must be sorted by `ROW_NUMBER() ASC`. The `mode` must exist in table `sample2` for a row from `sample1`, so there is an `INNER JOIN`.
`ROW_NUMBER()` is needed for sequential LIMIT queries so I can select data in batches. I don't care about consistency of that LIMIT+ROW_NUMBER() trick.
My query:
WITH sample1 AS (
WITH sample1 AS (
SELECT a.name, MIN(a.score) s
FROM sample1 a
JOIN sample2 USING (mode)
GROUP BY a.name
)
SELECT *, ROW_NUMBER() OVER (ORDER BY s ASC) AS n
FROM sample1
) SELECT * FROM sample1 WHERE n > $1 LIMIT $2
I am concerned about the efficiency of that select; it is effectively a three-level-deep subquery. Perhaps there are less ugly and more efficient ways to achieve the same result?
The number of rows in this table will be approximately 1,000,000, and the distinct set is expected to be about 100,000. Queries are frequent; it's a hot table for selects, and only the score column will be updated, in batches of the same size as the limit select.
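A possibly simpler equivalent to consider (a sketch, not benchmarked against this schema): window functions are evaluated after `GROUP BY`, so the inner nesting can be collapsed into a single grouped query.

```sql
WITH ranked AS (
    SELECT a.name,
           MIN(a.score)                              AS s,
           ROW_NUMBER() OVER (ORDER BY MIN(a.score)) AS n   -- window runs after the aggregate
    FROM sample1 a
    JOIN sample2 USING (mode)
    GROUP BY a.name
)
SELECT *
FROM ranked
WHERE n > $1
ORDER BY n          -- makes the LIMIT deterministic
LIMIT $2;
```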
Alex
(113 rep)
Dec 14, 2019, 10:43 PM
• Last activity: Dec 16, 2019, 02:28 AM
16
votes
3
answers
4548
views
Why are there execution plan differences between OFFSET ... FETCH and the old-style ROW_NUMBER scheme?
The new `OFFSET ... FETCH` model introduced with SQL Server 2012 offers simple and faster paging. **Why are there any differences at all considering that the two forms are semantically identical and very common?**
One would assume that the optimizer recognizes both and optimizes them (trivially) to the fullest.
Here is a very simple case where `OFFSET ... FETCH` is ~2x faster according to the cost estimate.
SELECT * INTO #objects FROM sys.objects
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY object_id) r
FROM #objects
) x
WHERE r >= 30 AND r < (30 + 10)
ORDER BY object_id
SELECT *
FROM #objects
ORDER BY object_id
OFFSET 30 ROWS FETCH NEXT 10 ROWS ONLY

One can vary this test case by creating a CI on `object_id` or adding filters, but it is impossible to remove all plan differences. `OFFSET ... FETCH` is always faster because it does less work at execution time.
usr
(7390 rep)
Dec 11, 2012, 07:39 PM
• Last activity: Nov 15, 2019, 09:27 AM
5
votes
1
answers
10852
views
How to make OFFSET & LIMIT work with COUNT (OVER?) when having a subquery?
It appears I have a case I can't quite wrap my brains around. So coming here in hopes to find pointers to a query that maybe could be helpful to someone else too.
In the following, I have a query that functions correctly as far as returning results goes, but it requires a second query that is the same as the one presented here but without `OFFSET`, whose output is just a `COUNT(*)` of all of the rows.
I have two objectives:
1. Write the query so that `COUNT(*)` is returned in the same query. Indeed, I have been looking at help pieces such as the excellent [SQL SERVER – How to get total row count from OFFSET / FETCH NEXT (Paging)](https://raresql.com/2015/03/30/sql-server-how-to-get-total-row-count-from-offset-fetch-next-paging/) with different ways of solving the problem, but then there's another piece...
2. Rewrite the join with a window function (e.g. `OVER(PARTITION BY ...)`) or some more performant construct, as the current query seems to generate an INDEX SCAN and an INDEX SEEK on the table. The real query is a bit more complicated in the `WHERE` part, but it looks to me like even one scan could be enough if the query were a bit more straightforward, so that the `COUNT` and `MAX` could be had simultaneously with the outer query. Even this alone would be a win, but combined with having the overall `COUNT` it would be even bigger.
Here are the table and data
CREATE TABLE Temp
(
Id INT NOT NULL PRIMARY KEY,
Created INT NOT NULL,
ParentId INT,
SomeInfo INT NOT NULL,
GroupId INT NOT NULL
CONSTRAINT FK_Temp FOREIGN KEY(ParentId) REFERENCES Temp(Id)
);
-- Some root levels nodes.
INSERT INTO Temp VALUES(1, 1, NULL, 1, 1);
INSERT INTO Temp VALUES(2, 2, NULL, 2, 2);
INSERT INTO Temp VALUES(3, 3, NULL, 1, 3);
INSERT INTO Temp VALUES(13, 13, NULL, 1, 1);
-- First order child nodes.
INSERT INTO Temp VALUES(4, 4, 1, 2, 1);
INSERT INTO Temp VALUES(5, 5, 2, 1, 2);
INSERT INTO Temp VALUES(6, 6, 3, 2, 3);
-- Second order child nodes.
INSERT INTO Temp VALUES(7, 7, 4, 1, 1);
INSERT INTO Temp VALUES(8, 8, 5, 2, 2);
INSERT INTO Temp VALUES(9, 9, 6, 1, 3);
SELECT
Id,
newestTable.SomeInfo,
newestTable.Created,
CASE WHEN newestTable.RootCount > 1 THEN 1 ELSE 0 END AS IsMulti
FROM
Temp as originalTable
INNER JOIN
(
SELECT
SomeInfo,
Max(Created) AS Created,
Count(*) AS RootCount
FROM
Temp
WHERE ParentId IS NULL AND GroupId = 1
GROUP BY SomeInfo
) AS newestTable ON originalTable.SomeInfo = newestTable.SomeInfo AND originalTable.Created = newestTable.Created
/*WHERE
(
originalTable.SomeInfo = 1
)*/
ORDER BY newestTable.Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
1 THEN 1 ELSE 0 END AS IsMulti
FROM
Temp
WHERE
(
GroupId = 1 AND ParentId IS NULL
)
ORDER BY Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
gets close to there. The problem is, though, that there are now two result rows, and it appears to me this is because the original INNER JOIN joining back to Temp is what culls the result to one row. I wonder if there is a way to apply the conditions somehow, either before or after the windowing, to match the original query more closely. (And this isn't the same query, to be clear. There's just so little data, hence the queries look close to each other.)
P.S. Also https://dba.stackexchange.com/questions/229214/how-to-apply-outer-limit-offset-and-filters-in-the-subquery-to-avoid-grouping-ov looks interesting.
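For objective 1 alone, a sketch: a `COUNT(*) OVER ()` in the select list is evaluated over the whole joined result before `OFFSET ... FETCH` trims it, so the total rides along with every page and the second query disappears (whether it is cheaper than two queries depends on the plan).

```sql
SELECT
    originalTable.Id,
    newestTable.SomeInfo,
    newestTable.Created,
    CASE WHEN newestTable.RootCount > 1 THEN 1 ELSE 0 END AS IsMulti,
    COUNT(*) OVER () AS TotalRows          -- total qualifying rows, repeated on each row
FROM
    Temp AS originalTable
    INNER JOIN
    (
        SELECT SomeInfo, MAX(Created) AS Created, COUNT(*) AS RootCount
        FROM Temp
        WHERE ParentId IS NULL AND GroupId = 1
        GROUP BY SomeInfo
    ) AS newestTable
        ON originalTable.SomeInfo = newestTable.SomeInfo
       AND originalTable.Created = newestTable.Created
ORDER BY newestTable.Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
```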
Veksi
(269 rep)
Jun 11, 2019, 09:25 PM
• Last activity: Jun 17, 2019, 03:54 PM
4
votes
1
answers
1542
views
How to get tsql TOP PERCENT with OFFSET
select top 10 percent *
from sys.databases
order by database_id
I want to get the same 10 percent result using OFFSET, but this query gives the wrong result:
offset 0 rows
fetch next (select cast((count(*) * 10/100) as int ) from sys.databases) rows only
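A guess at the discrepancy (a sketch, not verified against your data): `TOP 10 PERCENT` rounds the row count up, while `CAST(... AS int)` truncates, so rounding up first usually makes the two forms agree.

```sql
SELECT *
FROM sys.databases
ORDER BY database_id
OFFSET 0 ROWS
FETCH NEXT (SELECT CAST(CEILING(COUNT(*) * 10.0 / 100) AS int)   -- round up like TOP ... PERCENT
            FROM sys.databases) ROWS ONLY;
```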
igelr
(2162 rep)
Mar 13, 2019, 08:45 AM
• Last activity: Mar 14, 2019, 08:16 AM
3
votes
1
answers
1170
views
Slow Query with LIMIT
We have this query:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
It runs fine under 0.1ms. But when we add **LIMIT**:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The **account_id** and **created_at** conditions narrow the 300,000 rows down to 3,000 messages. Out of the 3,000 messages in that **account_id**, there are only 3 unread messages.
So we isolated each **WHERE** condition and found out about a certain behaviour -- we tweaked the **LIMIT** to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
It runs under 0.100ms. Any explanation for the difference in performance?
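One avenue to try, sketched under the assumption that with `LIMIT 5` the planner walks the primary key backwards hoping to hit five matches quickly: an index that matches both the filter and the sort lets it stop right after the few unread rows.

```sql
-- Filter columns first, then the sort key, so "ORDER BY id DESC LIMIT n"
-- can read just the first n matching entries.
CREATE INDEX index_messages_on_account_status_id
    ON messages (account_id, status, id DESC);
```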
Ace Subido
(31 rep)
Oct 24, 2014, 01:32 AM
• Last activity: Feb 10, 2019, 07:01 AM
2
votes
1
answers
3616
views
Get total number of rows when offset is greater than max
I have a table `users` that has, say, 42 records. To get the total rows and some other details, I use a window function:
select count(*) over() as count, name, id from users
And since the consumer app is paginated, I can utilise `limit` and `offset` to get the desired data:
select count(*) over() as count, name, id from users limit 5 offset 20
This gives me a (sample) result:
count | name | id
-------+---------+----
42 | Dummy05 | 27
42 | Dummy06 | 30
42 | Dummy07 | 31
42 | Dummy08 | 32
42 | Dummy09 | 33
(5 rows)
Of course, this can be optimised as described in this answer.
However, when the offset value overshoots, I get 0 rows, which is correct.
select count(*) over() as count, name, id from users limit 5 offset 50
Is there a way that gives me the total count even when the offset returns 0 rows? Something along these lines:
count | name | id
-------+---------+----
42 | |
(0 rows)
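One way to keep the count when the page is empty (a sketch): compute the total in its own derived table and LEFT JOIN the page onto it, so an overshooting offset still yields one row carrying the count and NULLs for the row columns.

```sql
SELECT t.count, p.name, p.id
FROM (SELECT count(*) AS count FROM users) AS t
LEFT JOIN (
    SELECT name, id
    FROM users
    ORDER BY id        -- some explicit order, so the pages are stable
    LIMIT 5 OFFSET 50
) AS p ON true;
```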
Slartibartfast
(123 rep)
Jan 31, 2019, 01:26 PM
• Last activity: Jan 31, 2019, 03:02 PM
5
votes
1
answers
11993
views
Reduce query time for higher offset in sql server
Currently, I have a table `base_voter` with around 100M rows of dummy data. I have a stored procedure as follows:
CREATE Procedure [dbo].[spTestingBaseVoter]
@SortColumn NVARCHAR(128) = N'name_voter',
@SortDirection VARCHAR(4) = 'asc',
@offset INT,
@limit INT
As
Begin
SET NOCOUNT ON;
-- reject any invalid sort directions:
IF LOWER(@SortDirection) NOT IN ('asc','desc')
BEGIN
RAISERROR('Invalid parameter for @SortDirection: %s', 11, 1, @SortDirection);
RETURN -1;
END
-- reject any unexpected column names:
IF LOWER(@SortColumn) NOT IN (N'name_voter', N'home_street_address_1', N'home_address_city')
BEGIN
RAISERROR('Invalid parameter for @SortColumn: %s', 11, 1, @SortColumn);
RETURN -1;
END
--SET @SortColumn = QUOTENAME(@SortColumn);
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'SELECT id, name_voter, home_street_address_1, home_address_city
FROM dbo.base_voter
WITH(NOLOCK)
WHERE deleted_at IS NULL'
SET @sql = @sql + N' ORDER BY ' + @SortColumn + ' ' + @SortDirection +
' OFFSET @OF ROWS
FETCH NEXT @LIM ROWS ONLY ';
EXEC sp_executesql @sql,
N'@OF int,@LIM int',
@OF=@offset, @LIM=@limit
End
To make the query faster, I have also created this index:
CREATE NONCLUSTERED INDEX FIBaseVoterWithDeletedAt
ON dbo.base_voter (name_voter asc,home_street_address_1, home_address_city)
WHERE deleted_at IS NULL ;
By creating this non-clustered index I have reduced the query time drastically. However, the same does not apply for the query with a higher offset.
For example with:
Execute spTestingBaseVoter name_voter,asc,9999950,50
Is there something I am doing wrong that is causing this performance issue? Or is it necessary to create another index in descending order?
Let me know if there is a better way to tackle this situation that might decrease the query time drastically.
**Update:**
Estimated Execution Plan

Saroj Shrestha
(195 rep)
Oct 17, 2018, 07:58 AM
• Last activity: Dec 12, 2018, 11:34 AM
1
votes
0
answers
797
views
MySQL: Limit number of rows returned with a given column value
My table has a list of activities, each associated with a customer and each with a date when the activity happened. I want to group by the customer ID, the month and the year, then sort by the year, the month and the count to generate a list of the top activity generators by month.
SELECT
    original_actor_id,
    MONTH(activities.created_at) as month,
    YEAR(activities.created_at) as year,
    COUNT(*) as count
FROM activities activities
WHERE activity_type = 'product_video_completed'
GROUP BY
    original_actor_id,
    MONTH(activities.created_at),
    YEAR(activities.created_at)
ORDER BY
    YEAR(activities.created_at) DESC,
    MONTH(activities.created_at) DESC,
    count DESC
However, I am only interested in the top 5 counts for each month/year combo. What is the best way to limit this to the top 5 highest counts for each month/year combo?
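A sketch of one way to do it, assuming MySQL 8.0+ (window functions): rank the grouped counts within each month/year and keep the top five per group. On 5.x the same idea needs the classic self-join or user-variable emulation instead.

```sql
WITH monthly AS (
    SELECT original_actor_id,
           MONTH(activities.created_at) AS month,
           YEAR(activities.created_at)  AS year,
           COUNT(*)                     AS count,
           ROW_NUMBER() OVER (
               PARTITION BY YEAR(activities.created_at), MONTH(activities.created_at)
               ORDER BY COUNT(*) DESC
           ) AS rn
    FROM activities activities
    WHERE activity_type = 'product_video_completed'
    GROUP BY original_actor_id, MONTH(activities.created_at), YEAR(activities.created_at)
)
SELECT original_actor_id, month, year, count
FROM monthly
WHERE rn <= 5
ORDER BY year DESC, month DESC, count DESC;
```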
user41527
(111 rep)
Oct 31, 2017, 09:27 PM
• Last activity: Aug 8, 2018, 08:01 PM
Showing page 1 of 20 total questions