Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
5
votes
2
answers
8683
views
How to deal with large offsets in select?
Table `jtest` has 200k rows; each row contains jsonb of the form `{ "id": "<uuid>", "key": <n> }`, where `<n>` is an integer from 1 to 200k, incremented per row. There is also a btree index on `data->'key'`.
create extension if not exists pgcrypto;
create table jtest (data jsonb not null default '{}');
insert into jtest (data)
select json_build_object('id', gen_random_uuid(), 'key', i)::jsonb
FROM generate_series(1,200000) i;
create index jtest_key on jtest ((data->'key'));
First query (fast):
EXPLAIN ANALYZE
select j.data
from jtest j
order by j.data->'key'
limit 20;
-- "Limit (cost=0.42..1.43 rows=20 width=74) (actual time=0.023..0.044 rows=20 loops=1)"
-- " -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.023..0.039 rows=20 loops=1)"
-- "Planning time: 0.082 ms"
-- "Execution time: 0.066 ms"
Second query with large offset (slow):
EXPLAIN ANALYZE
select j.data
from jtest j
order by j.data->'key'
offset 100000
limit 20;
-- "Limit (cost=5075.42..5076.44 rows=20 width=74) (actual time=112.861..112.896 rows=20 loops=1)"
-- " -> Index Scan using jtest_key on jtest j (cost=0.42..10150.42 rows=200000 width=74) (actual time=0.027..106.378 rows=100020 loops=1)"
-- "Planning time: 0.105 ms"
-- "Execution time: 112.931 ms"
In PG docs I found:
> The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.
But they don't say anything about how to deal with large offsets. How can I improve this query? Also, is this behaviour (inefficiency for large offsets) common to all RDBMSs or just PostgreSQL (I'm using 9.4)?
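One common way around a large OFFSET is keyset ("seek") pagination, sketched below for this schema; it assumes the client remembers the last `key` value it received (the literal 100000 is a placeholder for that value) and reuses the existing btree index on `(data->'key')`.

```sql
-- Keyset pagination sketch: filter on the last seen key instead of skipping rows.
-- '100000' stands in for the highest key returned by the previous page.
SELECT j.data
FROM jtest j
WHERE j.data->'key' > '100000'::jsonb   -- jsonb number comparison, served by the btree index
ORDER BY j.data->'key'
LIMIT 20;
```

This keeps each page at roughly the cost of the first query regardless of page depth, at the price of not being able to jump to an arbitrary page number.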
user606521
(1415 rep)
Aug 23, 2015, 06:48 PM
• Last activity: Mar 4, 2025, 06:09 AM
2
votes
2
answers
1494
views
Why is OFFSET / FETCH causing performance issues with queries in MS SQL 2019?
I have this select query:
SELECT ITEM.ID ITEM_ID,
ITEM.CODE ITEM_CODE,
ITEM.DESCRIPTION ITEM_NAME,
ITEM.CODE_CATEGORY_ID CATEGORY_ID,
ITEM_CAT.CODE CATEGORY_CODE,
ITEM_CAT.NAME CATEGORY_NAME
FROM B1.SP_ITEM_MF ITEM
INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
AND ITEM_CAT.RECORD_TYPE = 'CD'
AND ITEM_CAT.ACTIVE = 'Y'
INNER JOIN [A1].SP_REQ_CATEGORY_MAPPING MAP
ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID
AND REQ_CAT.RECORD_TYPE = 'RQ'
AND REQ_CAT.ACTIVE = 'Y'
AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY
ITEM.CODE
OFFSET 20 ROWS
FETCH NEXT 20 ROWS ONLY;
This returns 20 records and takes more than **5 minutes** to complete.
When I remove the `FETCH NEXT 20 ROWS ONLY`, it returns 2000 records and completes in effectively **0 seconds**.
Also, if I rewrite the query to insert the result set into a temp table (added below) and then apply `OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY` to that result set, it also completes in **0 seconds**.
CREATE TABLE #TempPagination (
RowNum INT,
ITEM_ID BIGINT,
ITEM_CODE VARCHAR(64),
ITEM_NAME VARCHAR(256),
CATEGORY_ID INT,
CATEGORY_CODE VARCHAR(8),
CATEGORY_NAME VARCHAR(64)
);
INSERT INTO #TempPagination ( ITEM_ID, ITEM_CODE, ITEM_NAME, CATEGORY_ID, CATEGORY_CODE, CATEGORY_NAME)
SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM B1.SP_ITEM_MF ITEM
     INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
         ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
        AND ITEM_CAT.RECORD_TYPE = 'CD'
        AND ITEM_CAT.ACTIVE = 'Y'
     INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP
         ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
     INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
         ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID
        AND REQ_CAT.RECORD_TYPE = 'RQ'
        AND REQ_CAT.ACTIVE = 'Y'
        AND REQ_CAT.CODE IN ('CRW', 'SCHF');
SELECT
ITEM_ID,
ITEM_CODE,
ITEM_NAME,
CATEGORY_ID,
CATEGORY_CODE,
CATEGORY_NAME
FROM
#TempPagination
order by ITEM_CODE
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY;
DROP TABLE #TempPagination;
Can someone explain why this is happening? I have indexes on the columns, and the version I am using is SQL Server 2019.
1. Why does the query take more time without the temp table, but not when the result set is inserted into a temp table first?
2. Why is `FETCH NEXT 20 ROWS ONLY` causing so much performance degradation?
---
Here are the plans for both queries; I changed the 20 rows to 1 row:
* https://www.brentozar.com/pastetheplan/?id=H1_VmgyAa
* https://www.brentozar.com/pastetheplan/?id=ByLhQeJAp
This is a big database, and there are too many indexes on each table (at least 20 per table, added over time for different queries) to list. Even if you can't pinpoint the issue, a general understanding of what might be causing the performance problem would be helpful.
The first query performs 17,412,610 physical reads and 52,697,525 logical reads; the rewritten query performs 7 and 10,438 respectively.
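A plausible mitigation to experiment with, assuming the regression comes from the optimizer's row goal (the small FETCH makes it prefer a plan that looks cheap for 20 rows but reads the tables in a bad order); this is a sketch, not a diagnosis of the posted plans:

```sql
SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM B1.SP_ITEM_MF ITEM
     INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
         ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
        AND ITEM_CAT.RECORD_TYPE = 'CD' AND ITEM_CAT.ACTIVE = 'Y'
     INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP
         ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
     INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
         ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID
        AND REQ_CAT.RECORD_TYPE = 'RQ' AND REQ_CAT.ACTIVE = 'Y'
        AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY ITEM.CODE
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
-- Ask the optimizer to plan as if all rows were needed (SQL Server 2016 SP1 and later):
OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL'));
```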
Subin Benny
(77 rep)
Mar 13, 2024, 08:41 AM
• Last activity: Mar 13, 2024, 03:25 PM
0
votes
1
answers
326
views
'Limit INT OFFSET @variable' throws a runtime error in MySQL Server
I'm trying to solve this problem on HackerRank.
The problem asks for the median of a specific column.
My code:
SET @VAE = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = ROUND(@VAE/2)-1;
SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT @VAE1;
But I get a syntax error.
When I write the query as follows, it runs OK, but I don't want it that way:
SET @VAE = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = ROUND(@VAE/2)-1;
SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT 5;
What is the problem?
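For what it's worth, a sketch of the usual workaround: in a plain statement MySQL only accepts literals (or stored-routine parameters) in LIMIT, but a prepared statement can take the variable through a placeholder.

```sql
SET @VAE  = (SELECT COUNT(*) FROM STATION);
SET @VAE1 = CAST(ROUND(@VAE / 2) - 1 AS SIGNED);  -- cast so the placeholder is an integer

-- LIMIT cannot reference a user variable directly, but it can take a
-- prepared-statement placeholder:
PREPARE stmt FROM 'SELECT LAT_N FROM STATION ORDER BY LAT_N LIMIT ?';
EXECUTE stmt USING @VAE1;
DEALLOCATE PREPARE stmt;
```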
ahmad alghadban
(1 rep)
Jun 14, 2023, 09:59 AM
• Last activity: Nov 12, 2023, 09:32 AM
2
votes
1
answers
1285
views
What's the difference between FETCH FIRST and FETCH NEXT in PostgreSQL?
What's the difference between `FETCH FIRST` and `FETCH NEXT` in PostgreSQL? The docs simply say
> NEXT
> Fetch the next row
>
> FIRST
> Fetch the first row of the query
which isn't much of an explanation. I created a sample table with sample rows, executed
SELECT * from users
FETCH FIRST 2 ROWS ONLY;
SELECT * from users
FETCH NEXT 2 ROWS ONLY;
and got exactly the same output. It's almost as if `FETCH FIRST` and `FETCH NEXT` are merely synonyms for `LIMIT` (which, I read, is not part of the SQL standard, unlike `FETCH [FIRST|NEXT]`).
If the two are the same, why wouldn't the documentation say so? Or are there some differences after all?
This Stack Overflow answer implies they may function the same way in at least some RDBMSes, specifically in MS SQL Server.
Sergey Zolotarev
(243 rep)
Nov 5, 2023, 11:23 AM
• Last activity: Nov 6, 2023, 03:54 PM
5
votes
1
answers
3911
views
Difference between LIMIT N and FETCH NEXT N ROWS?
The two queries below seem equivalent. Is one better than the other? e.g. in terms of performance.
select * from some_table offset 10 limit 10;
select * from some_table offset 10 fetch next 10 rows;
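As a point of reference (a sketch, not the poster's exact statement): PostgreSQL requires the trailing `ONLY` (or, from version 13, `WITH TIES`) in the FETCH clause, and with that spelling the two forms are handled by the same limit machinery, so the plans should come out identical:

```sql
SELECT * FROM some_table OFFSET 10 LIMIT 10;
SELECT * FROM some_table OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
```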
James Hiew
(183 rep)
Mar 1, 2019, 02:19 PM
• Last activity: Jun 2, 2023, 02:24 PM
0
votes
1
answers
833
views
Why does my LIMITed SELECT fetch the last record of the table, not the first?
I read that to fetch the first table record, you may use `LIMIT 1`. I tried to apply that knowledge to my `JOIN` statement and got a diametrically opposite result. It joined the *last* record of the joined table.
SELECT * FROM users;
| user_id | username | [more columns] |
|--------:|----------:|--------:|
| 1 | mickey_m | [...] |
| [...] | [...] | [...] |
| 6 | daisy_d | [...] |
SELECT * FROM roles;
| role_id | role |
|--------:|----------:|
| 1 | USER |
| 2 | ADMIN |
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT users.id, users.username, roles.id, roles.role
FROM users
JOIN roles ON roles.role = 'ADMIN' LIMIT 1;
SELECT * FROM user_role;
| user_id | username | role_id | role |
|--------:|----------:|--------:|------:|
| 6 | daisy_d | 2 | ADMIN |
**Why did it happen?** It seems unnatural for `LIMIT` to limit the result set to the last record as opposed to the first one.
Even this
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT first_row.id, first_row.username, roles.id, roles.role
FROM (SELECT id, username FROM users LIMIT 1) AS first_row
JOIN roles ON roles.role = 'ADMIN';
SELECT * FROM user_role;
produces the same result.
In both cases, I expected to get this instead:
| user_id | username | role_id | role |
|--------:|----------:|--------:|------:|
| 1 | mickey_m | 2 | ADMIN |
Does `SELECT` select from the bottom up in some cases? It seems so, because plain `SELECT`s like this one
SELECT * FROM users LIMIT 1;
do indeed retrieve the first row. However, if you specify columns
SELECT id, username FROM users LIMIT 1;
you observe the same issue (the last row gets retrieved instead).
Sage (a chat bot) says that when it comes to `LIMIT`, no order is guaranteed, so I should use `ORDER BY` (it does help, by the way). Is it right? Does `LIMIT 1` limit the result set to a random row that satisfies the criteria? Then why is it always the last one? The likelihood of that is infinitesimally small.
> The behavior you are observing is likely due to the use of the LIMIT clause. When you use LIMIT without an explicit ORDER BY clause, the database will return an arbitrary set of rows that satisfy the query conditions, but the order of the rows is not guaranteed. In other words, the database may return the first row, the last row, or any other row that satisfies the query conditions.
>
> *–Sage*
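A minimal sketch of the deterministic form (assuming "first" means the smallest `id`): without an `ORDER BY`, `LIMIT 1` may return any qualifying row, so ordering inside the derived table pins it down.

```sql
INSERT IGNORE INTO user_role (user_id, username, role_id, role)
SELECT u.id, u.username, r.id, r.role
FROM (SELECT id, username FROM users ORDER BY id LIMIT 1) AS u  -- deterministic "first" row
JOIN roles r ON r.role = 'ADMIN';
```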
Sergey Zolotarev
(243 rep)
Apr 18, 2023, 06:50 AM
• Last activity: Apr 19, 2023, 09:40 AM
1
votes
0
answers
1685
views
Best way to implement Pagination in SQL Server
I have a complex stored procedure that takes a customerId and some other parameters and returns thousands of rows. We want to implement pagination in this SP since the number of rows being returned is pretty huge. There is a main table that has all the product information (more than 100 million rows) and then I have to join it to other tables to get more information about the product.
`ProductId` is the primary key in the `Product` table. We have a non-clustered index on `date` with `productId` in the INCLUDE section. There are two approaches that I have in mind.
First Method:
Step 1: Get all the rows for the product id, order by date, and store them in a temp table (#temp1) that has an identity column as the primary key.
Step 2: Pick the rows based on `@first` and `@last` parameters, where `@first = (@page - 1) * @display` and `@last = (@page * @display) + 1`, from the temp table created in the previous step and store them in another temp table (#temp2).
Step 3: Join the rows from #temp2 to all the other tables and get the required data.
/* @display and @page are passed in along with other input parameters */
Declare @first Int = (@page - 1) * @display
Declare @last Int = (@page * @display) + 1
Create Table #temp1 (Id int identity(1,1) primary key, productId uniqueidentifier)
Create Table #temp2 (productId uniqueidentifier)
Insert Into #temp1
Select productId From dbo.product Order by date
Insert into #temp2
Select productId From #temp1 Where id > @first And id < @last
Second Method:
Step 1: Get the rows for the product id ordered by date and have an offset and fetch by clause and store them in a temp table (#temp1).
Step 2: Join the rows from #temp1 to all the other tables and get the required data.
/* @display and @page are passed in along with other input parameters */
Declare @first Int = (@page - 1) * @display
Create Table #temp1 (productId uniqueidentifier)
Insert Into #temp1
Select productId From dbo.product Order by date
Offset @first rows fetch next @display rows only
Select * From #temp1
Join some other product tables on product id
With method 1, I'll have to load all the data every time the procedure is called and then filter the rows that I need.
With method 2, I only fetch the rows I'm interested in, but I've read online that OFFSET and FETCH can actually slow the query down a lot. Which is usually the best approach for pagination in SQL Server? Of the two methods I described, which one would be the better option, or is there a better way? Thanks!
lifeisajourney
(751 rep)
Aug 31, 2022, 03:49 PM
1
votes
0
answers
1034
views
limit user's SELECT query to a fixed number of rows
I have read the following question:
https://dba.stackexchange.com/questions/233307/limit-postgres-results-for-a-user
and also noted the recommendation of limiting a user's statement_timeout; however, I would like something a little more consistent.
I noticed one poster on that question mentions that the 'row level security' feature would allow someone to enforce this.
I have read some tutorials that use row-level security to limit what the user can see by enforcing some column relationship, but I don't know how to extend this to the case of limiting rows, since it does not impose any column value constraint.
I also saw this about using FETCH_COUNT:
https://stackoverflow.com/questions/30369100/how-to-handle-large-result-sets-with-psql
As an _alternative_ (key word here): if row-level security does not allow us to limit the rows returned per query for a user, it seems a combination of FETCH_COUNT and statement_timeout for a specific user may achieve my goal.
Is there a way to set FETCH_COUNT for a user account without having them specify it in their client/session?
Thanks in advance.
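For the statement_timeout half, a sketch of a server-side, per-role default (the role name is hypothetical); FETCH_COUNT, by contrast, is a psql client variable, so it cannot be forced on the account from the server side:

```sql
-- Applies to every new session of this role, no client cooperation needed:
ALTER ROLE reporting_reader SET statement_timeout = '5s';
```

This caps runtime rather than row count, so it is only an approximation of a per-query row limit.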
gagan
(11 rep)
Jan 17, 2022, 01:21 AM
5
votes
2
answers
2856
views
Why doesn't an OFFSET query benefit more from the index?
I've got a simple example `books` table with an integer `id` primary key (`"books_pkey" PRIMARY KEY, btree (id)`) and 100,000,000 random rows. If I run:
EXPLAIN
SELECT *
FROM books
ORDER BY id
OFFSET 99999999
LIMIT 1;
I see a query plan like:
Limit (cost=3137296.54..3137296.57 rows=1 width=14)
-> Index Scan using books_pkey on books (cost=0.57..3137296.57 rows=100000000 width=14)
Do I understand correctly that PostgreSQL is loading 100000000 rows into memory, only for the `OFFSET` to discard all but 1? If so, why can't it do the "load and discard" step using the index and only load one row into memory?
I understand that the typical solution to this is to use keyset pagination - to say `WHERE id > x`. I'm just trying to understand why an index alone doesn't solve this. Adding another index which is explicitly sorted the same way as this query (`CREATE INDEX books_id_ordered ON books (id ASC)`) makes no difference to `EXPLAIN`.
Nathan Long
(1005 rep)
Aug 13, 2021, 05:54 PM
• Last activity: Aug 16, 2021, 07:22 PM
1
votes
1
answers
867
views
LIMIT and OFFSET a complex query
I have a query generated by an ORM (Sequelize). I ran into an issue where Sequelize fails; see these issues:
https://github.com/sequelize/sequelize/issues/7344
https://github.com/sequelize/sequelize/issues/12200
Postgres Query:
```sql
SELECT "feeds"."id",
"feeds"."title",
"feeds"."likes",
"feeds"."description",
"feeds"."files",
"feeds"."allowComments",
"feeds"."readConfirmation",
"feeds"."isDraft",
"feeds"."createdAt",
"feeds"."updatedAt",
"feeds"."companyId",
"feeds"."createdById",
"reads"."id" AS "reads.id",
"reads->feeds_reads"."createdAt" AS "reads.feeds_reads.createdAt",
"reads->feeds_reads"."updatedAt" AS "reads.feeds_reads.updatedAt",
"reads->feeds_reads"."feedId" AS "reads.feeds_reads.feedId",
"reads->feeds_reads"."userId" AS "reads.feeds_reads.userId",
"createdBy"."id" AS "createdBy.id",
"createdBy"."firstName" AS "createdBy.firstName",
"createdBy"."jobTitle" AS "createdBy.jobTitle",
"createdBy"."lastName" AS "createdBy.lastName",
"createdBy"."profilePicture" AS "createdBy.profilePicture",
"bookmarks"."id" AS "bookmarks.id",
"bookmarks->feeds_bookmarks"."createdAt" AS "bookmarks.feeds_bookmarks.createdAt",
"bookmarks->feeds_bookmarks"."updatedAt" AS "bookmarks.feeds_bookmarks.updatedAt",
"bookmarks->feeds_bookmarks"."feedId" AS "bookmarks.feeds_bookmarks.feedId",
"bookmarks->feeds_bookmarks"."userId" AS "bookmarks.feeds_bookmarks.userId",
"units"."id" AS "units.id",
"units"."parentId" AS "units.parentId",
"units->feeds_units"."createdAt" AS "units.feeds_units.createdAt",
"units->feeds_units"."updatedAt" AS "units.feeds_units.updatedAt",
"units->feeds_units"."feedId" AS "units.feeds_units.feedId",
"units->feeds_units"."unitId" AS "units.feeds_units.unitId",
"units->users"."id" AS "units.users.id",
"units->users->users_units"."createdAt" AS "units.users.users_units.createdAt",
"units->users->users_units"."updatedAt" AS "units.users.users_units.updatedAt",
"units->users->users_units"."userId" AS "units.users.users_units.userId",
"units->users->users_units"."unitId" AS "units.users.users_units.unitId",
"units->descendents"."id" AS "units.descendents.id",
"units->descendents"."parentId" AS "units.descendents.parentId",
"units->descendents->unitsancestor"."unitsId" AS "units.descendents.unitsancestor.unitsId",
"units->descendents->unitsancestor"."ancestorId" AS "units.descendents.unitsancestor.ancestorId",
"units->descendents->users"."id" AS "units.descendents.users.id",
"units->descendents->users->users_units"."createdAt" AS "units.descendents.users.users_units.createdAt",
"units->descendents->users->users_units"."updatedAt" AS "units.descendents.users.users_units.updatedAt",
"units->descendents->users->users_units"."userId" AS "units.descendents.users.users_units.userId",
"units->descendents->users->users_units"."unitId" AS "units.descendents.users.users_units.unitId",
"teams"."id" AS "teams.id",
"teams->feeds_teams"."createdAt" AS "teams.feeds_teams.createdAt",
"teams->feeds_teams"."updatedAt" AS "teams.feeds_teams.updatedAt",
"teams->feeds_teams"."feedId" AS "teams.feeds_teams.feedId",
"teams->feeds_teams"."teamId" AS "teams.feeds_teams.teamId",
"teams->peoples->teams_users"."createdAt" AS "teams.peoples.teams_users.createdAt",
"teams->peoples->teams_users"."updatedAt" AS "teams.peoples.teams_users.updatedAt",
"teams->peoples->teams_users"."userId" AS "teams.peoples.teams_users.userId",
"teams->peoples->teams_users"."teamId" AS "teams.peoples.teams_users.teamId",
"comments"."text" AS "comments.text",
"comments"."id" AS "comments.id",
"comments"."likes" AS "comments.likes",
"comments"."parentId" AS "comments.parentId",
"comments"."createdById" AS "comments.createdById",
"comments"."createdAt" AS "comments.createdAt",
"comments"."updatedAt" AS "comments.updatedAt",
"comments->createdBy"."id" AS "comments.createdBy.id",
"comments->createdBy"."firstName" AS "comments.createdBy.firstName",
"comments->createdBy"."lastName" AS "comments.createdBy.lastName",
"comments->createdBy"."jobTitle" AS "comments.createdBy.jobTitle",
"comments->createdBy"."profilePicture" AS "comments.createdBy.profilePicture",
"peoples->feeds_peoples"."createdAt" AS "peoples.feeds_peoples.createdAt",
"peoples->feeds_peoples"."updatedAt" AS "peoples.feeds_peoples.updatedAt",
"peoples->feeds_peoples"."feedId" AS "peoples.feeds_peoples.feedId",
"peoples->feeds_peoples"."userId" AS "peoples.feeds_peoples.userId"
FROM "feeds" AS "feeds"
LEFT OUTER JOIN (
"feeds_reads" AS "reads->feeds_reads"
INNER JOIN "users" AS "reads" ON "reads"."id" = "reads->feeds_reads"."userId"
) ON "feeds"."id" = "reads->feeds_reads"."feedId"
LEFT OUTER JOIN "users" AS "createdBy" ON "feeds"."createdById" = "createdBy"."id"
LEFT OUTER JOIN (
"feeds_bookmarks" AS "bookmarks->feeds_bookmarks"
INNER JOIN "users" AS "bookmarks" ON "bookmarks"."id" = "bookmarks->feeds_bookmarks"."userId"
) ON "feeds"."id" = "bookmarks->feeds_bookmarks"."feedId"
LEFT OUTER JOIN (
"feeds_units" AS "units->feeds_units"
INNER JOIN "units" AS "units" ON "units"."id" = "units->feeds_units"."unitId"
) ON "feeds"."id" = "units->feeds_units"."feedId"
LEFT OUTER JOIN (
"users_units" AS "units->users->users_units"
LEFT OUTER JOIN "users" AS "units->users" ON "units->users"."id" = "units->users->users_units"."userId"
) ON "units"."id" = "units->users->users_units"."unitId"
LEFT OUTER JOIN (
"unitsancestor" AS "units->descendents->unitsancestor"
LEFT OUTER JOIN "units" AS "units->descendents" ON "units->descendents"."id" = "units->descendents->unitsancestor"."unitsId"
) ON "units"."id" = "units->descendents->unitsancestor"."ancestorId"
LEFT OUTER JOIN (
"users_units" AS "units->descendents->users->users_units"
LEFT OUTER JOIN "users" AS "units->descendents->users" ON "units->descendents->users"."id" = "units->descendents->users->users_units"."userId"
) ON "units->descendents"."id" = "units->descendents->users->users_units"."unitId"
LEFT OUTER JOIN (
"feeds_teams" AS "teams->feeds_teams"
INNER JOIN "teams" AS "teams" ON "teams"."id" = "teams->feeds_teams"."teamId"
) ON "feeds"."id" = "teams->feeds_teams"."feedId"
LEFT OUTER JOIN (
"teams_users" AS "teams->peoples->teams_users"
INNER JOIN "users" AS "teams->peoples" ON "teams->peoples"."id" = "teams->peoples->teams_users"."userId"
) ON "teams"."id" = "teams->peoples->teams_users"."teamId"
LEFT OUTER JOIN "comments" AS "comments" ON "feeds"."id" = "comments"."feedId"
LEFT OUTER JOIN "users" AS "comments->createdBy" ON "comments"."createdById" = "comments->createdBy"."id"
LEFT OUTER JOIN (
"feeds_peoples" AS "peoples->feeds_peoples"
INNER JOIN "users" AS "peoples" ON "peoples"."id" = "peoples->feeds_peoples"."userId"
) ON "feeds"."id" = "peoples->feeds_peoples"."feedId"
WHERE (
"peoples"."id" = 11
OR "feeds"."createdById" = 11
OR "teams->peoples"."id" = 11
OR "units->users"."id" = 11
OR "units->descendents->users"."id" = 11
)
AND "feeds"."companyId" = 4
AND "feeds"."isDraft" = false
AND "feeds"."createdAt" < '2020-12-09 12:59:34.017 +00:00'
LIMIT 20;
```
Here the LIMIT is not applying to the feeds; I want 20 feeds, but it's giving me the same feed 20 times.
Raman Shekhawat
(113 rep)
Dec 10, 2020, 07:24 AM
• Last activity: Dec 10, 2020, 07:35 AM
0
votes
0
answers
21
views
Analysis of Conditions with no results and high offset number - PostgreSQL JSONB
I have been asked to record an analysis of PostgreSQL JSONB queries that use conditions with a high offset number, and of queries that return no results, considering the following.
select
*
from
sub_test df,
jsonb_array_elements(json_data) jba
where
(jba->>'name')::text = 'SubYear'
and (jba->>'value')::numeric > '2050';
1. SubYear > 2200 (has no results): what does the query plan look like, and are full table scans necessary with JSONB?
2. SubYear > 1970 (has a lot of results) and offset 30k: what does the query plan look like, and are full table scans necessary with JSONB?
Note that there are about 35k+ records in the table "sub_test", with 500+ keys in the JSONB column "json_data".
Could you help me answer these?
Lingamoorthy Bheeman Balan
(1 rep)
Oct 5, 2020, 07:20 AM
• Last activity: Oct 5, 2020, 07:26 AM
4
votes
1
answers
4670
views
MSSQL efficient paging including a join on big tables
We have a table with products and a table with the current selection/assortment of products. We now need to join them to create an export file via some .NET code.
Currently we are using a query like this:
SELECT * FROM ASSORTMENT a
INNER JOIN PRODUCTS p on a.clientID = 38 and a.productID = p.productID
ORDER BY a.ID_ROW OFFSET 100000 ROWS FETCH NEXT 1000 ROWS ONLY
The problem is that the assortment can contain discontinued products that no longer exist in the `products` table, so the join is needed to verify them. We also need exactly the same number of rows per select each time, e.g. 1000.
The query gets slower as the offset gets higher because it does a full join of the products and assortment tables each time. With 3 million products and an assortment of 200,000 articles, it takes multiple seconds to fetch the next 1000 rows once the offset goes above 50,000.
Is there a more efficient way of doing this?
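One option is keyset ("seek") pagination, sketched below under the assumption that `ID_ROW` is unique and ascending and that the exporter can carry the last `ID_ROW` it received from one batch to the next; each batch then starts at that key's index position instead of re-scanning all the skipped rows.

```sql
DECLARE @lastIdRow bigint = 0;   -- placeholder: highest ID_ROW from the previous batch

SELECT TOP (1000) a.*, p.*
FROM ASSORTMENT a
     INNER JOIN PRODUCTS p
         ON p.productID = a.productID
WHERE a.clientID = 38
  AND a.ID_ROW > @lastIdRow
ORDER BY a.ID_ROW;
```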
Svn342
(41 rep)
Dec 31, 2017, 10:38 PM
• Last activity: May 12, 2020, 03:13 PM
1
votes
1
answers
960
views
Distinct result with row_id for limit query
There are 2 tables:
CREATE TABLE sample1 (
id int,
name varchar,
score int,
mode int
)
CREATE TABLE sample2 (
id int,
mode int
...
)
I need to select distinct `name` with `MIN(score)` and `ROW_NUMBER()`. The result must be sorted by `ROW_NUMBER() ASC`. The `mode` must exist in table `sample2` for a row from `sample1`, so there is an `INNER JOIN`.
`ROW_NUMBER()` is needed for sequential LIMIT queries so I can select data in batches. I don't care about consistency of that LIMIT+ROW_NUMBER() trick.
My query:
WITH sample1 AS (
WITH sample1 AS (
SELECT a.name, MIN(a.score) s
FROM sample1 a
JOIN sample2 USING (mode)
GROUP BY a.name
)
SELECT *, ROW_NUMBER() OVER (ORDER BY s ASC) AS n
FROM sample1
) SELECT * FROM sample1 WHERE n > $1 LIMIT $2
I am concerned about the efficiency of that select; it is effectively a three-level-deep subquery. Perhaps there are less ugly and more efficient ways to achieve the same result?
The number of rows in this table will be approximately 1,000,000, and the distinct set is expected to be about 100,000. Queries are frequent; it's a hot table for selects, and only the score column will be updated, in batches of the same size as the limit select.
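A possibly simpler equivalent to consider (a sketch, not benchmarked against this schema): window functions are evaluated after `GROUP BY`, so the inner nesting can be collapsed into a single grouped query.

```sql
WITH ranked AS (
    SELECT a.name,
           MIN(a.score)                              AS s,
           ROW_NUMBER() OVER (ORDER BY MIN(a.score)) AS n   -- window runs after the aggregate
    FROM sample1 a
    JOIN sample2 USING (mode)
    GROUP BY a.name
)
SELECT *
FROM ranked
WHERE n > $1
ORDER BY n          -- makes the LIMIT deterministic
LIMIT $2;
```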
Alex
(113 rep)
Dec 14, 2019, 10:43 PM
• Last activity: Dec 16, 2019, 02:28 AM
16
votes
3
answers
4548
views
Why are there execution plan differences between OFFSET ... FETCH and the old-style ROW_NUMBER scheme?
The new `OFFSET ... FETCH` model introduced with SQL Server 2012 offers simple and faster paging. **Why are there any differences at all considering that the two forms are semantically identical and very common?**
One would assume that the optimizer recognizes both and optimizes them (trivially) to the fullest.
Here is a very simple case where `OFFSET ... FETCH` is ~2x faster according to the cost estimate.
SELECT * INTO #objects FROM sys.objects
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (ORDER BY object_id) r
FROM #objects
) x
WHERE r >= 30 AND r < (30 + 10)
ORDER BY object_id
SELECT *
FROM #objects
ORDER BY object_id
OFFSET 30 ROWS FETCH NEXT 10 ROWS ONLY

One can vary this test case by creating a CI on `object_id` or adding filters, but it is impossible to remove all plan differences. `OFFSET ... FETCH` is always faster because it does less work at execution time.
usr
(7390 rep)
Dec 11, 2012, 07:39 PM
• Last activity: Nov 15, 2019, 09:27 AM
5
votes
1
answers
10852
views
How to make OFFSET & LIMIT work with COUNT (OVER?) when having a subquery?
It appears I have a case I can't quite wrap my brains around. So coming here in hopes to find pointers to a query that maybe could be helpful to someone else too.
In the following, I have a query that functions correctly as far as returning results goes, but it requires a second query that is the same as the one presented here but without `OFFSET`, whose output is just a `COUNT(*)` of all of the rows.
I have two objectives:
1. Write the query so that `COUNT(*)` is returned in the same query. Indeed, I have been looking at help pieces such as the excellent [SQL SERVER – How to get total row count from OFFSET / FETCH NEXT (Paging)](https://raresql.com/2015/03/30/sql-server-how-to-get-total-row-count-from-offset-fetch-next-paging/) with different ways of solving the problem, but then there's another piece...
2. Rewrite the join with a window function (e.g. `OVER(PARTITION BY ...)`) or some more performant construct, as the current query seems to generate an INDEX SCAN and an INDEX SEEK on the table. The real query is a bit more complicated in the `WHERE` part, but it looks to me like even one scan could be enough if the query were a bit more straightforward, so that the `COUNT` and `MAX` could be had simultaneously with the outer query. Even this alone would be a win, but combined with having the overall `COUNT` it would be even bigger.
Here are the table and data
CREATE TABLE Temp
(
Id INT NOT NULL PRIMARY KEY,
Created INT NOT NULL,
ParentId INT,
SomeInfo INT NOT NULL,
GroupId INT NOT NULL
CONSTRAINT FK_Temp FOREIGN KEY(ParentId) REFERENCES Temp(Id)
);
-- Some root levels nodes.
INSERT INTO Temp VALUES(1, 1, NULL, 1, 1);
INSERT INTO Temp VALUES(2, 2, NULL, 2, 2);
INSERT INTO Temp VALUES(3, 3, NULL, 1, 3);
INSERT INTO Temp VALUES(13, 13, NULL, 1, 1);
-- First order child nodes.
INSERT INTO Temp VALUES(4, 4, 1, 2, 1);
INSERT INTO Temp VALUES(5, 5, 2, 1, 2);
INSERT INTO Temp VALUES(6, 6, 3, 2, 3);
-- Second order child nodes.
INSERT INTO Temp VALUES(7, 7, 4, 1, 1);
INSERT INTO Temp VALUES(8, 8, 5, 2, 2);
INSERT INTO Temp VALUES(9, 9, 6, 1, 3);
SELECT
Id,
newestTable.SomeInfo,
newestTable.Created,
CASE WHEN newestTable.RootCount > 1 THEN 1 ELSE 0 END AS IsMulti
FROM
Temp as originalTable
INNER JOIN
(
SELECT
SomeInfo,
Max(Created) AS Created,
Count(*) AS RootCount
FROM
Temp
WHERE ParentId IS NULL AND GroupId = 1
GROUP BY SomeInfo
) AS newestTable ON originalTable.SomeInfo = newestTable.SomeInfo AND originalTable.Created = newestTable.Created
/*WHERE
(
originalTable.SomeInfo = 1
)*/
ORDER BY newestTable.Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
1 THEN 1 ELSE 0 END AS IsMulti
FROM
Temp
WHERE
(
GroupId = 1 AND ParentId IS NULL
)
ORDER BY Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
gets close to there. The problem is, though, that there are now two result rows, and it appears to me this is because the original INNER JOIN joining back to Temp is what culls the result to one row. I wonder if there is a way to apply the conditions somehow, either before or after the windowing, to match the original query more closely. (And this isn't the same query, to be clear. There's just so little data, hence the queries look close to each other.)
P.S. Also https://dba.stackexchange.com/questions/229214/how-to-apply-outer-limit-offset-and-filters-in-the-subquery-to-avoid-grouping-ov looks interesting.
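For objective 1 alone, a sketch: a `COUNT(*) OVER ()` in the select list is evaluated over the whole joined result before `OFFSET ... FETCH` trims it, so the total rides along with every page and the second query disappears (whether it is cheaper than two queries depends on the plan).

```sql
SELECT
    originalTable.Id,
    newestTable.SomeInfo,
    newestTable.Created,
    CASE WHEN newestTable.RootCount > 1 THEN 1 ELSE 0 END AS IsMulti,
    COUNT(*) OVER () AS TotalRows          -- total qualifying rows, repeated on each row
FROM
    Temp AS originalTable
    INNER JOIN
    (
        SELECT SomeInfo, MAX(Created) AS Created, COUNT(*) AS RootCount
        FROM Temp
        WHERE ParentId IS NULL AND GroupId = 1
        GROUP BY SomeInfo
    ) AS newestTable
        ON originalTable.SomeInfo = newestTable.SomeInfo
       AND originalTable.Created = newestTable.Created
ORDER BY newestTable.Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
```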
Veksi
(269 rep)
Jun 11, 2019, 09:25 PM
• Last activity: Jun 17, 2019, 03:54 PM
4
votes
1
answers
1542
views
How to get tsql TOP PERCENT with OFFSET
select top 10 percent *
from sys.databases
order by database_id
I want to get the same 10 percent result using OFFSET, but this query gives the wrong result:
offset 0 rows
fetch next (select cast((count(*) * 10/100) as int ) from sys.databases) rows only
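A guess at the discrepancy (a sketch, not verified against your data): `TOP 10 PERCENT` rounds the row count up, while `CAST(... AS int)` truncates, so rounding up first usually makes the two forms agree.

```sql
SELECT *
FROM sys.databases
ORDER BY database_id
OFFSET 0 ROWS
FETCH NEXT (SELECT CAST(CEILING(COUNT(*) * 10.0 / 100) AS int)   -- round up like TOP ... PERCENT
            FROM sys.databases) ROWS ONLY;
```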
igelr
(2162 rep)
Mar 13, 2019, 08:45 AM
• Last activity: Mar 14, 2019, 08:16 AM
3
votes
1
answers
1170
views
Slow Query with LIMIT
We have this query:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
It runs fine under 0.1ms. But when we add **LIMIT**:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5
It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The **account_id** and **created_at** conditions narrow the 300,000 rows down to 3,000 messages. Out of the 3,000 messages in that **account_id**, there are only 3 unread messages.
So we isolated each **WHERE** condition and found out about a certain behaviour -- we tweaked the **LIMIT** to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3
It runs under 0.100ms. Any explanation for the difference in performance?
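One avenue to try, sketched under the assumption that with `LIMIT 5` the planner walks the primary key backwards hoping to hit five matches quickly: an index that matches both the filter and the sort lets it stop right after the few unread rows.

```sql
-- Filter columns first, then the sort key, so "ORDER BY id DESC LIMIT n"
-- can read just the first n matching entries.
CREATE INDEX index_messages_on_account_status_id
    ON messages (account_id, status, id DESC);
```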
Ace Subido
(31 rep)
Oct 24, 2014, 01:32 AM
• Last activity: Feb 10, 2019, 07:01 AM
2
votes
1
answers
3616
views
Get total number of rows when offset is greater than max
I have a table `users` that has, say, 42 records. To get the total rows and some other details, I use a window function:
select count(*) over() as count, name, id from users
And since the consumer app is paginated, I can utilise `limit` and `offset` to get the desired data:
select count(*) over() as count, name, id from users limit 5 offset 20
This gives me a (sample) result:
count | name | id
-------+---------+----
42 | Dummy05 | 27
42 | Dummy06 | 30
42 | Dummy07 | 31
42 | Dummy08 | 32
42 | Dummy09 | 33
(5 rows)
Of course, this can be optimised as described in this answer.
However, when the offset value overshoots, I get 0 rows, which is correct.
select count(*) over() as count, name, id from users limit 5 offset 50
Is there a way that gives me the total count even when the offset returns 0 rows? Something along these lines:
count | name | id
-------+---------+----
42 | |
(0 rows)
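One way to keep the count when the page is empty (a sketch): compute the total in its own derived table and LEFT JOIN the page onto it, so an overshooting offset still yields one row carrying the count and NULLs for the row columns.

```sql
SELECT t.count, p.name, p.id
FROM (SELECT count(*) AS count FROM users) AS t
LEFT JOIN (
    SELECT name, id
    FROM users
    ORDER BY id        -- some explicit order, so the pages are stable
    LIMIT 5 OFFSET 50
) AS p ON true;
```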
Slartibartfast
(123 rep)
Jan 31, 2019, 01:26 PM
• Last activity: Jan 31, 2019, 03:02 PM
5
votes
1
answers
11993
views
Reduce query time for higher offset in sql server
Currently, I have a table `base_voter` with around 100M rows of dummy data. I have a stored procedure as follows:
CREATE Procedure [dbo].[spTestingBaseVoter]
@SortColumn NVARCHAR(128) = N'name_voter',
@SortDirection VARCHAR(4) = 'asc',
@offset INT,
@limit INT
As
Begin
SET NOCOUNT ON;
-- reject any invalid sort directions:
IF LOWER(@SortDirection) NOT IN ('asc','desc')
BEGIN
RAISERROR('Invalid parameter for @SortDirection: %s', 11, 1, @SortDirection);
RETURN -1;
END
-- reject any unexpected column names:
IF LOWER(@SortColumn) NOT IN (N'name_voter', N'home_street_address_1', N'home_address_city')
BEGIN
RAISERROR('Invalid parameter for @SortColumn: %s', 11, 1, @SortColumn);
RETURN -1;
END
--SET @SortColumn = QUOTENAME(@SortColumn);
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'SELECT id, name_voter, home_street_address_1, home_address_city
FROM dbo.base_voter
WITH(NOLOCK)
WHERE deleted_at IS NULL'
SET @sql = @sql + N' ORDER BY ' + @SortColumn + ' ' + @SortDirection +
' OFFSET @OF ROWS
FETCH NEXT @LIM ROWS ONLY ';
EXEC sp_executesql @sql,
N'@OF int,@LIM int',
@OF=@offset, @LIM=@limit
End
To make the query faster, I have also created this index:
CREATE NONCLUSTERED INDEX FIBaseVoterWithDeletedAt
ON dbo.base_voter (name_voter asc,home_street_address_1, home_address_city)
WHERE deleted_at IS NULL ;
By creating this non-clustered index I have reduced the query time drastically. However, the same does not apply for the query with a higher offset.
For example with:
Execute spTestingBaseVoter name_voter,asc,9999950,50
Is there something I am doing wrong that is causing this performance issue? Or is it necessary to create another index in descending order?
Let me know if there is a better way to tackle this situation that might decrease the query time drastically.
**Update:**
Estimated Execution Plan

Saroj Shrestha
(195 rep)
Oct 17, 2018, 07:58 AM
• Last activity: Dec 12, 2018, 11:34 AM
1
votes
0
answers
797
views
MySQL: Limit number of rows returned with a given column value
My table has a list of activities, each associated with a customer and each with a date when the activity happened. I want to group by the customer ID, the month and the year, then sort by the year, the month and the count to generate a list of the top activity generators by month.
SELECT
    original_actor_id,
    MONTH(activities.created_at) as month,
    YEAR(activities.created_at) as year,
    COUNT(*) as count
FROM activities activities
WHERE activity_type = 'product_video_completed'
GROUP BY
    original_actor_id,
    MONTH(activities.created_at),
    YEAR(activities.created_at)
ORDER BY
    YEAR(activities.created_at) DESC,
    MONTH(activities.created_at) DESC,
    count DESC
However, I am only interested in the top 5 counts for each month/year combo. What is the best way to limit this to the top 5 highest counts for each month/year combo?
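A sketch of one way to do it, assuming MySQL 8.0+ (window functions): rank the grouped counts within each month/year and keep the top five per group. On 5.x the same idea needs the classic self-join or user-variable emulation instead.

```sql
WITH monthly AS (
    SELECT original_actor_id,
           MONTH(activities.created_at) AS month,
           YEAR(activities.created_at)  AS year,
           COUNT(*)                     AS count,
           ROW_NUMBER() OVER (
               PARTITION BY YEAR(activities.created_at), MONTH(activities.created_at)
               ORDER BY COUNT(*) DESC
           ) AS rn
    FROM activities activities
    WHERE activity_type = 'product_video_completed'
    GROUP BY original_actor_id, MONTH(activities.created_at), YEAR(activities.created_at)
)
SELECT original_actor_id, month, year, count
FROM monthly
WHERE rn <= 5
ORDER BY year DESC, month DESC, count DESC;
```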
user41527
(111 rep)
Oct 31, 2017, 09:27 PM
• Last activity: Aug 8, 2018, 08:01 PM
Showing page 1 of 20 total questions