
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

5 votes
2 answers
8683 views
How to deal with large offsets in select?
Table jtest with 200k rows; each row contains jsonb { id: "<uuid>", key: <n> } (<n> is an integer 1-200k, incremented for each row). There is also a btree index on data->'key'.

```
create extension if not exists pgcrypto;
create table jtest (data jsonb not null default '{}');
insert into jtest (data)
select json_build_object('id', gen_random_uuid(), 'key', i)::jsonb
FROM generate_series(1,200000) i;
create index jtest_key on jtest ((data->'key'));
```

First query (fast):

```
EXPLAIN ANALYZE select j.data from jtest j order by j.data->'key' limit 20;
-- "Limit (cost=0.42..1.43 rows=20 width=74) (actual time=0.023..0.044 rows=20 loops=1)"
-- "  -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.023..0.039 rows=20 loops=1)"
-- "Planning time: 0.082 ms"
-- "Execution time: 0.066 ms"
```

Second query with a large offset (slow):

```
EXPLAIN ANALYZE select j.data from jtest j order by j.data->'key' offset 100000 limit 20;
-- "Limit (cost=5075.42..5076.44 rows=20 width=74) (actual time=112.861..112.896 rows=20 loops=1)"
-- "  -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.027..106.378 rows=100020 loops=1)"
-- "Planning time: 0.105 ms"
-- "Execution time: 112.931 ms"
```

In the PG docs I found:

> The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.

But they don't say anything about how to deal with large offsets. How can I improve this query? Also, is this behaviour (inefficiency for large offsets) common to all RDBMSs, or just PG (using 9.4)?
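For reference (an editor's sketch, not part of the question): the usual keyset alternative for this exact schema. The value '100000' stands in for the last key returned by the previous page; the btree index on (data->'key') can seek straight past it instead of walking and discarding 100,000 entries:

```sql
-- Keyset sketch: seek to the boundary instead of counting skipped rows.
SELECT j.data
FROM jtest j
WHERE j.data->'key' > '100000'::jsonb   -- last key from the previous page
ORDER BY j.data->'key'
LIMIT 20;
```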
user606521 (1415 rep)
Aug 23, 2015, 06:48 PM • Last activity: Mar 4, 2025, 06:09 AM
2 votes
2 answers
1494 views
Why is OFFSET / FETCH causing performance issues with queries in MS SQL 2019?
I have this select query:
SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM   B1.SP_ITEM_MF ITEM
       INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
            ON  ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
            AND ITEM_CAT.RECORD_TYPE = 'CD'
            AND ITEM_CAT.ACTIVE = 'Y'
       INNER JOIN [A1].SP_REQ_CATEGORY_MAPPING MAP
            ON  MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
       INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
            ON  REQ_CAT.ID = MAP.REQ_CATEGORY_ID
            AND REQ_CAT.RECORD_TYPE = 'RQ'
            AND REQ_CAT.ACTIVE = 'Y'
            AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY
       ITEM.CODE
OFFSET 20 ROWS
     FETCH NEXT 20 ROWS ONLY;
It returns 20 records but takes more than **5 minutes** to execute. When I remove the FETCH NEXT 20 ROWS ONLY, it returns 2,000 records and completes in just **0 seconds**. Also, if I rewrite the query to insert the result set into a temp table (added below) and then use OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY on that result set, that also completes in **0 seconds**.
CREATE TABLE #TempPagination (
    RowNum INT,
    ITEM_ID BIGINT,
    ITEM_CODE VARCHAR(64),
    ITEM_NAME VARCHAR(256),
    CATEGORY_ID INT,
    CATEGORY_CODE VARCHAR(8),
    CATEGORY_NAME VARCHAR(64)
);


INSERT INTO #TempPagination (ITEM_ID, ITEM_CODE, ITEM_NAME, CATEGORY_ID, CATEGORY_CODE, CATEGORY_NAME)
SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM   B1.SP_ITEM_MF ITEM
       INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
            ON  ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
            AND ITEM_CAT.RECORD_TYPE = 'CD'
            AND ITEM_CAT.ACTIVE = 'Y'
       INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP
            ON  MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
       INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
            ON  REQ_CAT.ID = MAP.REQ_CATEGORY_ID
            AND REQ_CAT.RECORD_TYPE = 'RQ'
            AND REQ_CAT.ACTIVE = 'Y'
            AND REQ_CAT.CODE IN ('CRW', 'SCHF');

SELECT 
    ITEM_ID,
    ITEM_CODE,
    ITEM_NAME,
    CATEGORY_ID,
    CATEGORY_CODE,
    CATEGORY_NAME
FROM 
    #TempPagination
 order by ITEM_CODE
     OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY;

DROP TABLE #TempPagination;
Can someone explain why this is happening? I have indexes on the columns, and the version I am using is SQL Server 2019.

1. Why does the query take more time without the temp table, but not when the result set is inserted into a temp table first?
2. Why is FETCH NEXT 20 ROWS ONLY causing so much performance degradation?

---

Here are the plans for both queries (I changed the 20 rows to 1 row):

* https://www.brentozar.com/pastetheplan/?id=H1_VmgyAa
* https://www.brentozar.com/pastetheplan/?id=ByLhQeJAp

This is a big database, and there are too many indexes on each table (at least 20 each, added over time for different queries) to list. Even if you can't pinpoint the issue, a general understanding of what might be causing the performance problem would be helpful.

The first query performs 17,412,610 physical and 52,697,525 logical reads; the rewritten query performs 7 and 10,438 respectively.
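An editor's aside (an interpretation of the symptoms, not a diagnosis from the question): TOP and OFFSET ... FETCH introduce a row goal, which makes the optimizer assume it will find the 40 qualifying rows quickly and pick a navigational plan; when the joins actually filter out most rows, that plan produces the enormous read counts shown. One way to test that hypothesis on SQL Server 2016 SP1 and later is to disable the row goal for the statement:

```sql
SELECT ITEM.ID ITEM_ID, ITEM.CODE ITEM_CODE, ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID, ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM   B1.SP_ITEM_MF ITEM
       INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
            ON  ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
            AND ITEM_CAT.RECORD_TYPE = 'CD' AND ITEM_CAT.ACTIVE = 'Y'
       INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP
            ON  MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
       INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
            ON  REQ_CAT.ID = MAP.REQ_CATEGORY_ID
            AND REQ_CAT.RECORD_TYPE = 'RQ' AND REQ_CAT.ACTIVE = 'Y'
            AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY ITEM.CODE
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
OPTION (USE HINT('DISABLE_OPTIMIZER_ROWGOAL'));  -- optimize as if all rows were needed
```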
Subin Benny (77 rep)
Mar 13, 2024, 08:41 AM • Last activity: Mar 13, 2024, 03:25 PM
0 votes
1 answers
326 views
'LIMIT INT OFFSET @variable' throws a runtime error in MySQL Server
I'm trying to solve this problem on HackerRank. The problem asks for the median of a specific column. My code:

```
SET @VAE = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = ROUND(@VAE/2)-1;
SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT @VAE1;
```

But I get a syntax error. When I write the query as follows, it runs OK, but I don't want it that way:

```
SET @VAE = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = ROUND(@VAE/2)-1;
SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT 5;
```

What is the problem?
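Context, not from the asker: MySQL's grammar only accepts integer constants (or, inside stored programs, declared variables and routine parameters) as LIMIT arguments, so a user variable such as @VAE1 fails at parse time. A sketch of the usual prepared-statement workaround, assuming the same STATION table:

```sql
SET @VAE  = (SELECT COUNT(*) FROM STATION);
-- The LIMIT placeholder must bind to an integer; ROUND() returns a DECIMAL.
SET @VAE1 = CAST(ROUND(@VAE/2) - 1 AS UNSIGNED);
PREPARE stmt FROM 'SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT ?';
EXECUTE stmt USING @VAE1;
DEALLOCATE PREPARE stmt;
```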
ahmad alghadban (1 rep)
Jun 14, 2023, 09:59 AM • Last activity: Nov 12, 2023, 09:32 AM
2 votes
1 answers
1285 views
What's the difference between FETCH FIRST and FETCH NEXT in PostgreSQL?
What's the difference between FETCH FIRST and FETCH NEXT in PostgreSQL? The docs simply say > NEXT > Fetch the next row > > FIRST > Fetch the first row of the query which isn't much of an explanation. I created a sample table with sample rows, executed
SELECT * from users
FETCH FIRST 2 ROWS ONLY;
SELECT * from users
FETCH NEXT 2 ROWS ONLY;
and got exactly the same output. It's almost as if FETCH FIRST and FETCH NEXT are merely synonyms for LIMIT (which, I read, is not part of the SQL standard, unlike FETCH [FIRST|NEXT]). If the two are the same, why wouldn't the documentation say so? Or are there some differences after all? This Stack Overflow answer implies they may function the same way in at least some RDBMSes, specifically in MS SQL Server.
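A note on the quoted docs (an editor's observation, not from the asker): those two lines come from the cursor FETCH command, where the keywords genuinely differ; in the SELECT limit clause, FIRST and NEXT, like ROW and ROWS, are interchangeable noise words. A cursor sketch where the difference is visible, assuming the users table has an id column:

```sql
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM users ORDER BY id;
FETCH NEXT FROM c;   -- advances one row: returns row 1
FETCH NEXT FROM c;   -- returns row 2
FETCH FIRST FROM c;  -- repositions to row 1 regardless of current position
COMMIT;
```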
Sergey Zolotarev (243 rep)
Nov 5, 2023, 11:23 AM • Last activity: Nov 6, 2023, 03:54 PM
5 votes
1 answers
3911 views
Difference between LIMIT N and FETCH NEXT N ROWS?
The two queries below seem equivalent. Is one better than the other? e.g. in terms of performance.
select * from some_table offset 10 limit 10;
select * from some_table offset 10 fetch next 10 rows only;
James Hiew (183 rep)
Mar 1, 2019, 02:19 PM • Last activity: Jun 2, 2023, 02:24 PM
0 votes
1 answers
833 views
Why does my LIMITed SELECT fetch the last record of the table, not the first?
I read that to fetch the first table record, you may use LIMIT 1. I tried to apply that knowledge to my JOIN statement and got a diametrically opposite result. It joined the *last* record of the joined table
SELECT * FROM users;
| user_id | username | [more columns] |
|--------:|---------:|---------------:|
| 1 | mickey_m | [...] |
| [...] | [...] | [...] |
| 6 | daisy_d | [...] |
SELECT * FROM roles;
| role_id | role |
|--------:|-----:|
| 1 | USER |
| 2 | ADMIN |
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT users.id, users.username, roles.id, roles.role
FROM users
JOIN roles ON roles.role = 'ADMIN' LIMIT 1;

SELECT * FROM user_role;
| user_id | username | role_id | role |
|--------:|---------:|--------:|-----:|
| 6 | daisy_d | 2 | ADMIN |

**Why did it happen?** It seems unnatural for LIMIT to limit the result set to the last record as opposed to the first one. Even this
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT first_row.id, first_row.username, roles.id, roles.role
FROM (SELECT id, username FROM users LIMIT 1) AS first_row
JOIN roles ON roles.role = 'ADMIN';

SELECT * FROM user_role;
produces the same result. In both cases, I expected to get this instead:

| user_id | username | role_id | role |
|--------:|---------:|--------:|-----:|
| 1 | mickey_m | 2 | ADMIN |

Does SELECT select from the bottom up in some cases? It seems so, because plain SELECTs like this one
SELECT * FROM users LIMIT 1;
do indeed retrieve the first row. However, if you specify columns
SELECT id, username FROM users LIMIT 1;
you observe the same issue (the last row gets retrieved instead).

Sage (a chatbot) says that when it comes to LIMIT, no order is guaranteed, so I should use ORDER BY (it does help, by the way). Is that right? Does LIMIT 1 limit the result set to a random row that satisfies the criteria? Then why is it always the last one? The likelihood of that is infinitesimally small.

> The behavior you are observing is likely due to the use of the LIMIT clause. When you use LIMIT without an explicit ORDER BY clause, the database will return an arbitrary set of rows that satisfy the query conditions, but the order of the rows is not guaranteed. In other words, the database may return the first row, the last row, or any other row that satisfies the query conditions.
>
> *–Sage*
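An editor's aside: the chatbot's point matches SQL semantics; without ORDER BY, LIMIT keeps whichever row the engine happens to produce first, which depends on the plan and storage order, not on insertion order. A deterministic sketch of the insert, pinning the row with ORDER BY:

```sql
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT u.id, u.username, r.id, r.role
FROM (SELECT id, username FROM users ORDER BY id LIMIT 1) AS u
JOIN roles r ON r.role = 'ADMIN';
```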
Sergey Zolotarev (243 rep)
Apr 18, 2023, 06:50 AM • Last activity: Apr 19, 2023, 09:40 AM
1 votes
0 answers
1685 views
Best way to implement Pagination in SQL Server
I have a complex stored procedure that takes a customerId and some other parameters and returns thousands of rows. We want to implement pagination in this SP since the number of rows being returned is pretty huge. There is a main table that has all the product information (more than 100 million rows) and then I have to join it to other tables to get more information about the product. ProductId is the primary key in the Product table. We have a non-clustered index on date with productId in the INCLUDE section. There are two approaches that I have in mind.

First Method:
Step 1: Get all the rows for the product id and order by date and stored them in a temp table (#temp1) that will have an identity column as the primary key.
Step 2: Pick the rows based on @first and @last parameters where @first = (@page - 1) * @display and @last = (@page * @display) + 1 from the temp table created in the previous step and store them in another temp table (#temp2).
Step 3: Join the rows from #temp2 to all the other tables and get the required data.

/* @display and @page are passed in along with other input parameters */
Declare @first Int = (@page - 1) * @display
Declare @last Int = (@page * @display) + 1

Create Table #temp1 (Id int identity(1,1) primary key, productId uniqueidentifier)
Create Table #temp2 (productId uniqueidentifier)

Insert Into #temp1
Select productId From dbo.product Order by date

Insert into #temp2
Select productId From #temp1
Where id > @first And id < @last

Second Method:
Step 1: Get the rows for the product id ordered by date and have an offset and fetch by clause and store them in a temp table (#temp1).
Step 2: Join the rows from #temp1 to all the other tables and get the required data.
/* @display and @page are passed in along with other input parameters */
Declare @first Int = (@page - 1) * @display

Create Table #temp1 (productId uniqueidentifier)

Insert Into #temp1
Select productId From dbo.product
Order by date
Offset @first rows fetch next @display rows only

Select * From #temp1
Join some other product tables on product id

With method 1, I'll have to load all the data every time the procedure is called and then filter the rows that I need. With method 2, I can fetch only the rows I'm interested in, but I've read online that OFFSET and FETCH can actually slow the query down a lot.

Which is usually the best approach for pagination in SQL Server? Of the two methods I described, which one would be the better option, or is there a better way? Thanks!
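An editor's sketch, not part of the question: a third option is keyset ("seek") paging, which neither materializes all rows nor skips any. It assumes the caller can pass back the last sort-key values it received (hypothetical @lastDate and @lastProductId here) and that (date, productId) is a unique, stable ordering:

```sql
-- Keyset sketch: seek past the previous page instead of OFFSET.
SELECT TOP (@display) productId, [date]
FROM dbo.product
WHERE [date] > @lastDate
   OR ([date] = @lastDate AND productId > @lastProductId)
ORDER BY [date], productId;
```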
lifeisajourney (751 rep)
Aug 31, 2022, 03:49 PM
1 votes
0 answers
1034 views
limit user's SELECT query to a fixed number of rows
I have read the following question: https://dba.stackexchange.com/questions/233307/limit-postgres-results-for-a-user and noted the recommendation of limiting a user's statement_timeout; however, I would like something a little more consistent. One poster on that question mentions that the row-level security feature would allow someone to enforce this. I have read some tutorials that use row-level security to limit what a user can see by enforcing some column relationship, but I don't know how to extend this to limiting row counts, since it does not impose any column-value constraint.

I also saw this about using FETCH_COUNT: https://stackoverflow.com/questions/30369100/how-to-handle-large-result-sets-with-psql

As an *alternative* (key word here): if row-level security does not allow us to limit the rows returned per query for a user, it seems a combination of FETCH_COUNT and statement_timeout for a specific user may achieve my goal. Is there a way to set FETCH_COUNT for a user account without having them specify it in their client/session? Thanks in advance.
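Two partial measures worth noting (an editor's aside, with assumed object names): FETCH_COUNT is a psql client variable, so it can be preset for an account by putting \set FETCH_COUNT 1000 in that user's ~/.psqlrc, but it only batches how psql retrieves rows and caps nothing in other clients. Server-side, granting the role access only to a capped view does hard-limit what a query can return; a sketch (big_table and restricted_role are placeholders):

```sql
-- Expose only a capped view to the restricted role.
CREATE VIEW big_table_capped AS
SELECT * FROM big_table LIMIT 10000;

REVOKE ALL ON big_table FROM restricted_role;
GRANT SELECT ON big_table_capped TO restricted_role;
```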
gagan (11 rep)
Jan 17, 2022, 01:21 AM
5 votes
2 answers
2856 views
Why doesn't an OFFSET query benefit more from the index?
I've got a simple example books table with an integer id primary key ("books_pkey" PRIMARY KEY, btree (id)) and 100,000,000 random rows. If I run:
EXPLAIN
SELECT *
FROM books
ORDER BY id
OFFSET 99999999
LIMIT 1;
I see a query plan like:
Limit  (cost=3137296.54..3137296.57 rows=1 width=14)
   ->  Index Scan using books_pkey on books  (cost=0.57..3137296.57 rows=100000000 width=14)
Do I understand correctly that PostgreSQL is loading 100000000 rows into memory, only for the OFFSET to discard all but 1? If so, why can't it do the "load and discard" step using the index and only load one row into memory? I understand that the typical solution to this is to use keyset pagination - to say WHERE id > x. I'm just trying to understand why an index alone doesn't solve this. Adding another index which is explicitly sorted the same way as this query (CREATE INDEX books_id_ordered ON books (id ASC)) makes no difference to EXPLAIN.
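An aside on what the plan shows (an editor's note, not from the question): the index scan itself walks the 100,000,000 entries in order; OFFSET is applied above it, and a B-tree does not store per-subtree entry counts, so PostgreSQL cannot jump directly to entry 99,999,999. The keyset form the question mentions turns the skip into a seek; a sketch:

```sql
-- Seek instead of skip: the btree descends straight to id = 99999999.
SELECT *
FROM books
WHERE id > 99999999   -- last id seen on the previous page
ORDER BY id
LIMIT 1;
```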
Nathan Long (1005 rep)
Aug 13, 2021, 05:54 PM • Last activity: Aug 16, 2021, 07:22 PM
1 votes
1 answers
867 views
LIMIT and OFFSET a complex query
I have a query generated by an ORM (Sequelize). I ran into an issue where Sequelize fails; see these issues:

https://github.com/sequelize/sequelize/issues/7344
https://github.com/sequelize/sequelize/issues/12200

Postgres query:
```sql
SELECT "feeds"."id",
    "feeds"."title",
    "feeds"."likes",
    "feeds"."description",
    "feeds"."files",
    "feeds"."allowComments",
    "feeds"."readConfirmation",
    "feeds"."isDraft",
    "feeds"."createdAt",
    "feeds"."updatedAt",
    "feeds"."companyId",
    "feeds"."createdById",
    "reads"."id" AS "reads.id",
    "reads->feeds_reads"."createdAt" AS "reads.feeds_reads.createdAt",
    "reads->feeds_reads"."updatedAt" AS "reads.feeds_reads.updatedAt",
    "reads->feeds_reads"."feedId" AS "reads.feeds_reads.feedId",
    "reads->feeds_reads"."userId" AS "reads.feeds_reads.userId",
    "createdBy"."id" AS "createdBy.id",
    "createdBy"."firstName" AS "createdBy.firstName",
    "createdBy"."jobTitle" AS "createdBy.jobTitle",
    "createdBy"."lastName" AS "createdBy.lastName",
    "createdBy"."profilePicture" AS "createdBy.profilePicture",
    "bookmarks"."id" AS "bookmarks.id",
    "bookmarks->feeds_bookmarks"."createdAt" AS "bookmarks.feeds_bookmarks.createdAt",
    "bookmarks->feeds_bookmarks"."updatedAt" AS "bookmarks.feeds_bookmarks.updatedAt",
    "bookmarks->feeds_bookmarks"."feedId" AS "bookmarks.feeds_bookmarks.feedId",
    "bookmarks->feeds_bookmarks"."userId" AS "bookmarks.feeds_bookmarks.userId",
    "units"."id" AS "units.id",
    "units"."parentId" AS "units.parentId",
    "units->feeds_units"."createdAt" AS "units.feeds_units.createdAt",
    "units->feeds_units"."updatedAt" AS "units.feeds_units.updatedAt",
    "units->feeds_units"."feedId" AS "units.feeds_units.feedId",
    "units->feeds_units"."unitId" AS "units.feeds_units.unitId",
    "units->users"."id" AS "units.users.id",
    "units->users->users_units"."createdAt" AS "units.users.users_units.createdAt",
    "units->users->users_units"."updatedAt" AS "units.users.users_units.updatedAt",
    "units->users->users_units"."userId" AS "units.users.users_units.userId",
    "units->users->users_units"."unitId" AS "units.users.users_units.unitId",
    "units->descendents"."id" AS "units.descendents.id",
    "units->descendents"."parentId" AS "units.descendents.parentId",
    "units->descendents->unitsancestor"."unitsId" AS "units.descendents.unitsancestor.unitsId",
    "units->descendents->unitsancestor"."ancestorId" AS "units.descendents.unitsancestor.ancestorId",
    "units->descendents->users"."id" AS "units.descendents.users.id",
    "units->descendents->users->users_units"."createdAt" AS "units.descendents.users.users_units.createdAt",
    "units->descendents->users->users_units"."updatedAt" AS "units.descendents.users.users_units.updatedAt",
    "units->descendents->users->users_units"."userId" AS "units.descendents.users.users_units.userId",
    "units->descendents->users->users_units"."unitId" AS "units.descendents.users.users_units.unitId",
    "teams"."id" AS "teams.id",
    "teams->feeds_teams"."createdAt" AS "teams.feeds_teams.createdAt",
    "teams->feeds_teams"."updatedAt" AS "teams.feeds_teams.updatedAt",
    "teams->feeds_teams"."feedId" AS "teams.feeds_teams.feedId",
    "teams->feeds_teams"."teamId" AS "teams.feeds_teams.teamId",
    "teams->peoples->teams_users"."createdAt" AS "teams.peoples.teams_users.createdAt",
    "teams->peoples->teams_users"."updatedAt" AS "teams.peoples.teams_users.updatedAt",
    "teams->peoples->teams_users"."userId" AS "teams.peoples.teams_users.userId",
    "teams->peoples->teams_users"."teamId" AS "teams.peoples.teams_users.teamId",
    "comments"."text" AS "comments.text",
    "comments"."id" AS "comments.id",
    "comments"."likes" AS "comments.likes",
    "comments"."parentId" AS "comments.parentId",
    "comments"."createdById" AS "comments.createdById",
    "comments"."createdAt" AS "comments.createdAt",
    "comments"."updatedAt" AS "comments.updatedAt",
    "comments->createdBy"."id" AS "comments.createdBy.id",
    "comments->createdBy"."firstName" AS "comments.createdBy.firstName",
    "comments->createdBy"."lastName" AS "comments.createdBy.lastName",
    "comments->createdBy"."jobTitle" AS "comments.createdBy.jobTitle",
    "comments->createdBy"."profilePicture" AS "comments.createdBy.profilePicture",
    "peoples->feeds_peoples"."createdAt" AS "peoples.feeds_peoples.createdAt",
    "peoples->feeds_peoples"."updatedAt" AS "peoples.feeds_peoples.updatedAt",
    "peoples->feeds_peoples"."feedId" AS "peoples.feeds_peoples.feedId",
    "peoples->feeds_peoples"."userId" AS "peoples.feeds_peoples.userId"
FROM "feeds" AS "feeds"
    LEFT OUTER JOIN (
        "feeds_reads" AS "reads->feeds_reads"
        INNER JOIN "users" AS "reads" ON "reads"."id" = "reads->feeds_reads"."userId"
    ) ON "feeds"."id" = "reads->feeds_reads"."feedId"
    LEFT OUTER JOIN "users" AS "createdBy" ON "feeds"."createdById" = "createdBy"."id"
    LEFT OUTER JOIN (
        "feeds_bookmarks" AS "bookmarks->feeds_bookmarks"
        INNER JOIN "users" AS "bookmarks" ON "bookmarks"."id" = "bookmarks->feeds_bookmarks"."userId"
    ) ON "feeds"."id" = "bookmarks->feeds_bookmarks"."feedId"
    LEFT OUTER JOIN (
        "feeds_units" AS "units->feeds_units"
        INNER JOIN "units" AS "units" ON "units"."id" = "units->feeds_units"."unitId"
    ) ON "feeds"."id" = "units->feeds_units"."feedId"
    LEFT OUTER JOIN (
        "users_units" AS "units->users->users_units"
       LEFT OUTER JOIN "users" AS "units->users" ON "units->users"."id" = "units->users->users_units"."userId"
    ) ON "units"."id" = "units->users->users_units"."unitId"
    LEFT OUTER JOIN (
        "unitsancestor" AS "units->descendents->unitsancestor"
        LEFT OUTER JOIN "units" AS "units->descendents" ON "units->descendents"."id" = "units->descendents->unitsancestor"."unitsId"
    ) ON "units"."id" = "units->descendents->unitsancestor"."ancestorId"
    LEFT OUTER JOIN (
        "users_units" AS "units->descendents->users->users_units"
        LEFT OUTER JOIN "users" AS "units->descendents->users" ON "units->descendents->users"."id" = "units->descendents->users->users_units"."userId"
    ) ON "units->descendents"."id" = "units->descendents->users->users_units"."unitId"
    LEFT OUTER JOIN (
        "feeds_teams" AS "teams->feeds_teams"
        INNER JOIN "teams" AS "teams" ON "teams"."id" = "teams->feeds_teams"."teamId"
    ) ON "feeds"."id" = "teams->feeds_teams"."feedId"
    LEFT OUTER JOIN (
        "teams_users" AS "teams->peoples->teams_users"
        INNER JOIN "users" AS "teams->peoples" ON "teams->peoples"."id" = "teams->peoples->teams_users"."userId"
    ) ON "teams"."id" = "teams->peoples->teams_users"."teamId"
    LEFT OUTER JOIN "comments" AS "comments" ON "feeds"."id" = "comments"."feedId"
    LEFT OUTER JOIN "users" AS "comments->createdBy" ON "comments"."createdById" = "comments->createdBy"."id"
    LEFT OUTER JOIN (
        "feeds_peoples" AS "peoples->feeds_peoples"
        INNER JOIN "users" AS "peoples" ON "peoples"."id" = "peoples->feeds_peoples"."userId"
    ) ON "feeds"."id" = "peoples->feeds_peoples"."feedId"
WHERE (
        "peoples"."id" = 11
        OR "feeds"."createdById" = 11
        OR "teams->peoples"."id" = 11
        OR "units->users"."id" = 11
        OR "units->descendents->users"."id" = 11
    )
    AND "feeds"."companyId" = 4
    AND "feeds"."isDraft" = false
    AND "feeds"."createdAt" < '2020-12-09 12:59:34.017 +00:00'
LIMIT 20;
```
Here the limit is not applied to the feeds: I want 20 feeds, but it's giving me the same feed 20 times.
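Context, not from the asker: LIMIT counts joined result rows, not parent rows, so one feed with 20 child rows can fill the entire page. The usual fix (what Sequelize exposes as the subQuery option) is to page over the parent table in a derived table and join afterwards. A trimmed sketch; the OR branches that reference joined tables would have to become EXISTS subqueries inside the derived table, and only one of them is shown:

```sql
SELECT f.*, c."text" AS "comments.text"  -- ...plus the remaining joined columns
FROM (
    SELECT *
    FROM "feeds"
    WHERE "companyId" = 4
      AND "isDraft" = false
      AND "createdAt" < '2020-12-09 12:59:34.017 +00:00'
      AND ("createdById" = 11
           OR EXISTS (SELECT 1
                      FROM "feeds_peoples" fp
                      WHERE fp."feedId" = "feeds"."id"
                        AND fp."userId" = 11))
    LIMIT 20
) AS f
LEFT OUTER JOIN "comments" c ON f."id" = c."feedId";
-- ...plus the rest of the original join chain
```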
Raman Shekhawat (113 rep)
Dec 10, 2020, 07:24 AM • Last activity: Dec 10, 2020, 07:35 AM
0 votes
0 answers
21 views
Analysis of Conditions with no results and high offset number - PostgreSQL JSONB
I have been asked to record an analysis of a PostgreSQL JSONB query, covering conditions with a high offset number and conditions that return no results, considering the query below.
select
    *
from
    sub_test df,
    jsonb_array_elements(json_data) jba
where
    (jba->>'name')::text = 'SubYear'
    and (jba->>'value')::numeric > '2050';
1. SubYear > 2200 (has no results): explanations about the query plan; are full scans of tables necessary with JSON?
2. SubYear > 1970 (has a lot of results) and offset 30k: explanations about the query plan; are full scans of tables necessary with JSON?

Note that there are about 35k+ records in the table "sub_test", with 500+ keys in the JSONB column "json_data". Could you help me answer these?
Lingamoorthy Bheeman Balan (1 rep)
Oct 5, 2020, 07:20 AM • Last activity: Oct 5, 2020, 07:26 AM
4 votes
1 answers
4670 views
MSSQL efficient paging including a join on big tables
We have a table with products and a table with a current selection/assortment of products. We now need to join them to create an export file from some .NET code. Currently we are using a query like this:

```
SELECT *
FROM ASSORTMENT a
INNER JOIN PRODUCTS p
    ON a.clientID = 38
    AND a.productID = p.productID
ORDER BY a.ID_ROW
OFFSET 100000 ROWS FETCH NEXT 1000 ROWS ONLY
```

The problem is that the assortment can contain discontinued products that no longer exist in the products table, so the join is needed for verification. We also need exactly the same number of rows per select each time, e.g. 1000. The query gets slower as the offset gets higher because it does a full join of the products and assortment tables each time. With 3 million products and an assortment of 200,000 articles, it takes multiple seconds to fetch the next 1000 once the offset is over 50,000.

Is there a more efficient way of doing this?
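An editor's sketch, not from the question: since the batches are read in ID_ROW order anyway, a keyset predicate lets each batch seek past the previous one instead of re-reading the skipped rows; it assumes the exporting code remembers the last ID_ROW it received (hypothetical @lastIdRow):

```sql
-- Keyset sketch: each batch starts where the previous one ended.
SELECT TOP (1000) a.*, p.*
FROM ASSORTMENT a
INNER JOIN PRODUCTS p ON p.productID = a.productID
WHERE a.clientID = 38
  AND a.ID_ROW > @lastIdRow   -- last ID_ROW from the previous batch
ORDER BY a.ID_ROW;
```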
Svn342 (41 rep)
Dec 31, 2017, 10:38 PM • Last activity: May 12, 2020, 03:13 PM
1 votes
1 answers
960 views
Distinct result with row_id for limit query
There are 2 tables:
CREATE TABLE sample1 (
  id int,
  name varchar,
  score int,
  mode int
)

CREATE TABLE sample2 (
  id int,
  mode int
  ...
)
I need to select distinct name with MIN(score) and ROW_NUMBER(). Must sort by ROW_NUMBER() ASC. The mode must exist in table sample2 for a row from sample1, so there is INNER JOIN. ROW_NUMBER() is needed for sequential LIMIT queries so I can select data in batches. I don't care about consistency of that LIMIT+ROW_NUMBER() trick. My query:
WITH sample1 AS (
    WITH sample1 AS (
        SELECT a.name, MIN(a.score) s
        FROM sample1 a
            JOIN sample2 USING (mode)
            GROUP BY a.name
    )
    SELECT *, ROW_NUMBER() OVER (ORDER BY s ASC) AS n
    FROM sample1
) SELECT * FROM sample1 WHERE n > $1 LIMIT $2
I am concerned about the efficiency of that select; in fact, it's a three-level-deep subquery. Perhaps there are less ugly, more efficient ways to achieve the same result? The number of rows in this table will be approximately 1,000,000, and the distinct set is expected to be about 100,000. Queries are frequent; it's actually a hot table for selects, and only the score column will be updated, in batches of the same size as the limit select.
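For what it's worth (an editor's observation): window functions are evaluated after GROUP BY, so the two inner levels can be merged into one, leaving a single derived table; a sketch:

```sql
SELECT name, s, n
FROM (
    SELECT a.name,
           MIN(a.score) AS s,
           ROW_NUMBER() OVER (ORDER BY MIN(a.score)) AS n
    FROM sample1 a
    JOIN sample2 USING (mode)
    GROUP BY a.name
) t
WHERE n > $1
LIMIT $2;
```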
Alex (113 rep)
Dec 14, 2019, 10:43 PM • Last activity: Dec 16, 2019, 02:28 AM
16 votes
3 answers
4548 views
Why are there execution plan differences between OFFSET ... FETCH and the old-style ROW_NUMBER scheme?
The new OFFSET ... FETCH model introduced with SQL Server 2012 offers simple and faster paging. **Why are there any differences at all, considering that the two forms are semantically identical and very common?** One would assume that the optimizer recognizes both and optimizes them (trivially) to the fullest.

Here is a very simple case where OFFSET ... FETCH is ~2x faster according to the cost estimate.

```
SELECT * INTO #objects FROM sys.objects

SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY object_id) r
    FROM #objects
) x
WHERE r >= 30 AND r < (30 + 10)
ORDER BY object_id

SELECT *
FROM #objects
ORDER BY object_id
OFFSET 30 ROWS FETCH NEXT 10 ROWS ONLY
```

[offset-fetch.png]

One can vary this test case by creating a CI on object_id or adding filters, but it is impossible to remove all plan differences. OFFSET ... FETCH is always faster because it does less work at execution time.
usr (7390 rep)
Dec 11, 2012, 07:39 PM • Last activity: Nov 15, 2019, 09:27 AM
5 votes
1 answers
10852 views
How to take make OFFSET & LIMIT with COUNT (OVER?) when having a subquery?
It appears I have a case I can't quite wrap my brains around, so I'm coming here in hopes of finding pointers to a query that maybe could be helpful to someone else too.

In the following, I have a query that functions correctly as far as returning results goes, but it requires a second query that is the same as the one presented here, except without OFFSET and with the output being just a COUNT(*) of all of the rows. I have two objectives:

1. Write the query so that COUNT(*) is returned in the same query. Indeed, I have been looking at help pieces such as the excellent [SQL SERVER – How to get total row count from OFFSET / FETCH NEXT (Paging)](https://raresql.com/2015/03/30/sql-server-how-to-get-total-row-count-from-offset-fetch-next-paging/) with different ways of solving the problem, but then there's another piece...
2. Rewrite the join with a window function (e.g. OVER(PARTITION BY)) or some more performant way, as that query seems to generate an INDEX SCAN and INDEX SEEK on the table.

The real query is a bit more complicated in the WHERE part, but it looks to me like even one scan could be enough if the query were a bit more straightforward, so that the COUNT and MAX could be had simultaneously with the outer query. Even this would be a win, but combined with having the overall COUNT it would be an even bigger one. Maybe I'm trying to chew a teeny bit more than I can currently chew, but on the other hand, maybe there is now a chance to learn something.

Here are the table and data:

```
CREATE TABLE Temp
(
    Id INT NOT NULL PRIMARY KEY,
    Created INT NOT NULL,
    ParentId INT,
    SomeInfo INT NOT NULL,
    GroupId INT NOT NULL,
    CONSTRAINT FK_Temp FOREIGN KEY(ParentId) REFERENCES Temp(Id)
);

-- Some root level nodes.
INSERT INTO Temp VALUES(1, 1, NULL, 1, 1);
INSERT INTO Temp VALUES(2, 2, NULL, 2, 2);
INSERT INTO Temp VALUES(3, 3, NULL, 1, 3);
INSERT INTO Temp VALUES(13, 13, NULL, 1, 1);

-- First order child nodes.
INSERT INTO Temp VALUES(4, 4, 1, 2, 1);
INSERT INTO Temp VALUES(5, 5, 2, 1, 2);
INSERT INTO Temp VALUES(6, 6, 3, 2, 3);

-- Second order child nodes.
INSERT INTO Temp VALUES(7, 7, 4, 1, 1);
INSERT INTO Temp VALUES(8, 8, 5, 2, 2);
INSERT INTO Temp VALUES(9, 9, 6, 1, 3);
```

And the query:

```
SELECT Id,
       newestTable.SomeInfo,
       newestTable.Created,
       CASE WHEN newestTable.RootCount > 1 THEN 1 ELSE 0 END AS IsMulti
FROM Temp AS originalTable
INNER JOIN
(
    SELECT SomeInfo, MAX(Created) AS Created, COUNT(*) AS RootCount
    FROM Temp
    WHERE ParentId IS NULL AND GroupId = 1
    GROUP BY SomeInfo
) AS newestTable
    ON  originalTable.SomeInfo = newestTable.SomeInfo
    AND originalTable.Created = newestTable.Created
/* WHERE (originalTable.SomeInfo = 1) */
ORDER BY newestTable.Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
```

P.S. Also https://dba.stackexchange.com/questions/229214/how-to-apply-outer-limit-offset-and-filters-in-the-subquery-to-avoid-grouping-ov looks interesting. A query such as

```
SELECT Id, SomeInfo, Created,
       CASE WHEN COUNT(*) OVER (PARTITION BY SomeInfo) > 1 THEN 1 ELSE 0 END AS IsMulti
FROM Temp
WHERE (GroupId = 1 AND ParentId IS NULL)
ORDER BY Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
```

gets close to there. The problem is, though, that there are now two result rows, and it appears to me this is due to the original INNER JOIN joining back to Temp, which culls it to one row. I wonder if there is a way to apply the conditions somehow, either before or after the windowing, to match the original query more closely. (And this isn't the same query, to be clear. There's just so little data, hence the queries look like being close to each other.)
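A sketch toward objective 1 (an editor's note, not the asker's final answer): COUNT(*) OVER () with an empty window is evaluated before OFFSET/FETCH in T-SQL, so it carries the pre-paging total on every returned row:

```sql
SELECT Id, SomeInfo, Created,
       CASE WHEN COUNT(*) OVER (PARTITION BY SomeInfo) > 1 THEN 1 ELSE 0 END AS IsMulti,
       COUNT(*) OVER () AS TotalRows  -- total row count before OFFSET/FETCH
FROM Temp
WHERE GroupId = 1 AND ParentId IS NULL
ORDER BY Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
```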
Veksi (269 rep)
Jun 11, 2019, 09:25 PM • Last activity: Jun 17, 2019, 03:54 PM
4 votes
1 answers
1542 views
How to get tsql TOP PERCENT with OFFSET
```
select top 10 percent * from sys.databases order by database_id
```

I want to get the same 10 percent result using OFFSET, but this query gives the wrong result:

```
offset 0 rows fetch next (select cast((count(*) * 10/100) as int) from sys.databases) rows only
```
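An editor's observation, not from the question: TOP 10 PERCENT rounds the row count up, while COUNT(*) * 10/100 is integer arithmetic and truncates, so the FETCH count comes out short (zero for small tables such as sys.databases). A sketch that should match TOP's rounding:

```sql
SELECT *
FROM sys.databases
ORDER BY database_id
OFFSET 0 ROWS
FETCH NEXT (SELECT CAST(CEILING(COUNT(*) * 10.0 / 100) AS int)
            FROM sys.databases) ROWS ONLY;
```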
igelr (2162 rep)
Mar 13, 2019, 08:45 AM • Last activity: Mar 14, 2019, 08:16 AM
3 votes
1 answers
1170 views
Slow Query with LIMIT
We have this query:

```
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
  AND "messages"."status" = 'unread'
  AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
```

It runs fine, under 0.1 ms. But when we add **LIMIT**:

```
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
  AND "messages"."status" = 'unread'
  AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
```

it runs over 20,000 ms.

Some notes:

- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The **account_id** and **created_at** conditions filter the 300,000 rows down to 3,000 messages.

Out of the 3,000 messages in that **account_id**, there are only 3 unread messages. So we isolated each **WHERE** condition and found a certain behaviour: we tweaked the **LIMIT** to correspond with the number of unread messages:

```
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
  AND "messages"."status" = 'unread'
  AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
```

It runs under 0.100 ms. Any explanation for the difference in performance?
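A plausible reading (an editor's interpretation, not from the asker): with LIMIT 5 the planner walks the id index backwards for ORDER BY id DESC, expecting to collect 5 matching rows quickly; since only 3 unread rows exist, the scan runs to the end of the index. LIMIT 3 can be satisfied and stops early. A partial index matching both the predicate and the sort is one common fix; a sketch:

```sql
-- Lets ORDER BY id DESC LIMIT n stop after n entries for this filter.
CREATE INDEX messages_unread_idx
    ON messages (account_id, id DESC)
    WHERE status = 'unread';
```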
Ace Subido (31 rep)
Oct 24, 2014, 01:32 AM • Last activity: Feb 10, 2019, 07:01 AM
2 votes
1 answers
3616 views
Get total number of rows when offset is greater than max
I have a table users that has, say, 42 records. To get the total rows and some other details, I use a window function:

```
select count(*) over() as count, name, id from users
```

And since the consumer app is paginated, I can utilise limit and offset to get the desired data:

```
select count(*) over() as count, name, id from users limit 5 offset 20
```

This gives me a (sample) result:

```
 count | name    | id
-------+---------+----
    42 | Dummy05 | 27
    42 | Dummy06 | 30
    42 | Dummy07 | 31
    42 | Dummy08 | 32
    42 | Dummy09 | 33
(5 rows)
```

Of course, this can be optimised as described in this answer. However, when the offset value overshoots, I get 0 rows, which is correct:

```
select count(*) over() as count, name, id from users limit 5 offset 50
```

Is there a way that gives me the total count despite the offset returning 0 rows? Something along these lines:

```
 count | name | id
-------+------+----
    42 |      |
(0 rows)
```
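One pattern that behaves this way (an editor's sketch, using the question's table): compute the count in its own subquery and outer-join the page to it, so the count row survives even when the page is empty:

```sql
SELECT c.count, u.name, u.id
FROM (SELECT count(*) AS count FROM users) AS c
LEFT JOIN (
    SELECT name, id
    FROM users
    ORDER BY id
    LIMIT 5 OFFSET 50
) AS u ON true;
-- Empty page: one row with the count and NULL name/id.
```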
Slartibartfast (123 rep)
Jan 31, 2019, 01:26 PM • Last activity: Jan 31, 2019, 03:02 PM
5 votes
1 answers
11993 views
Reduce query time for higher offset in sql server
Currently, I have a table base_voter with around 100M rows of dummy data. I have a stored procedure as follows:

```
CREATE Procedure [dbo].[spTestingBaseVoter]
    @SortColumn NVARCHAR(128) = N'name_voter',
    @SortDirection VARCHAR(4) = 'asc',
    @offset INT,
    @limit INT
As
Begin
    SET NOCOUNT ON;

    -- reject any invalid sort directions:
    IF LOWER(@SortDirection) NOT IN ('asc','desc')
    BEGIN
        RAISERROR('Invalid parameter for @SortDirection: %s', 11, 1, @SortDirection);
        RETURN -1;
    END

    -- reject any unexpected column names:
    IF LOWER(@SortColumn) NOT IN (N'name_voter', N'home_street_address_1', N'home_address_city')
    BEGIN
        RAISERROR('Invalid parameter for @SortColumn: %s', 11, 1, @SortColumn);
        RETURN -1;
    END

    --SET @SortColumn = QUOTENAME(@SortColumn);

    DECLARE @sql NVARCHAR(MAX);

    SET @sql = N'SELECT id, name_voter, home_street_address_1, home_address_city
                 FROM dbo.base_voter WITH(NOLOCK)
                 WHERE deleted_at IS NULL'

    SET @sql = @sql + N' ORDER BY ' + @SortColumn + ' ' + @SortDirection + '
                 OFFSET @OF ROWS FETCH NEXT @LIM ROWS ONLY ';

    EXEC sp_executesql @sql, N'@OF int,@LIM int', @OF=@offset, @LIM=@limit
End
```

To make the query faster, I have also created this index:

```
CREATE NONCLUSTERED INDEX FIBaseVoterWithDeletedAt
ON dbo.base_voter (name_voter asc, home_street_address_1, home_address_city)
WHERE deleted_at IS NULL;
```

By creating this non-clustered index I reduced the query time drastically. However, it does not help queries with a higher offset. For example:

```
Execute spTestingBaseVoter name_voter, asc, 9999950, 50
```

Is there something I am doing wrong that is causing this performance issue? Or is it necessary to create another index in descending order? Let me know if there is a better way to tackle this situation that might decrease the query time drastically.

**Update:** Estimated execution plan: [image]
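A general note (an editor's aside): OFFSET must read and discard every skipped index entry, so cost grows linearly with the offset no matter which index exists. A keyset sketch for the default sort, assuming the caller remembers the last (name_voter, id) pair it received (hypothetical @lastName and @lastId, with id as the tiebreaker):

```sql
SELECT TOP (@limit) id, name_voter, home_street_address_1, home_address_city
FROM dbo.base_voter
WHERE deleted_at IS NULL
  AND (name_voter > @lastName
       OR (name_voter = @lastName AND id > @lastId))
ORDER BY name_voter, id;
```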
Saroj Shrestha (195 rep)
Oct 17, 2018, 07:58 AM • Last activity: Dec 12, 2018, 11:34 AM
1 votes
0 answers
797 views
MySQL: Limit number of rows returned with a given column value
My table has a list of activities, each associated with a customer and each with a date when the activity happened. I want to group by the customer ID, the month, and the year, then sort by the month, the year, and the count to generate a list of top activity generators by month.

```
SELECT original_actor_id,
       MONTH(activities.created_at) as month,
       YEAR(activities.created_at) as year,
       COUNT(*) as count
FROM activities activities
WHERE activity_type = 'product_video_completed'
GROUP BY original_actor_id, MONTH(activities.created_at), YEAR(activities.created_at)
ORDER BY YEAR(activities.created_at) DESC, MONTH(activities.created_at) DESC, count DESC
```

However, I am only interested in the top 5 counts for each month/year combo. What is the best way to limit this to the top 5 highest counts for each month/year combo?
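One standard approach (an editor's sketch; requires MySQL 8.0+ for window functions): rank each month/year group's counts and keep ranks 1 through 5:

```sql
SELECT original_actor_id, month, year, count
FROM (
    SELECT original_actor_id,
           MONTH(created_at) AS month,
           YEAR(created_at)  AS year,
           COUNT(*)          AS count,
           ROW_NUMBER() OVER (PARTITION BY YEAR(created_at), MONTH(created_at)
                              ORDER BY COUNT(*) DESC) AS rn
    FROM activities
    WHERE activity_type = 'product_video_completed'
    GROUP BY original_actor_id, MONTH(created_at), YEAR(created_at)
) ranked
WHERE rn <= 5
ORDER BY year DESC, month DESC, count DESC;
```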
user41527 (111 rep)
Oct 31, 2017, 09:27 PM • Last activity: Aug 8, 2018, 08:01 PM