Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
2
votes
2
answers
3240
views
Optimizing a recursive CTE or replacing it with a temporary table
I have a recursive query like this:
```sql
with recursive PCte(id) as
(
    select p1.id from Product p1
    where p1.parent_id is null
      and p1.serial_number in ('123', '124', '125', 'n')
    union all
    select p2.id from Product p2 inner join PCte cte on p2.parent_id = cte.id
)
select *
from Product p
left join Product psub on psub.parent_id = p.id
where p.id in (select c.id from PCte c)
```
This query performs slowly when there is a large number of children. Are there any possible ways to optimize it? If it is possible to replace the recursive CTE with a temporary table, I would like to see an example. Thanks a lot.
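One common replacement is to materialize the tree level by level into a temporary table, expanding one generation per iteration until no new rows appear. A minimal runnable sketch, using Python's bundled SQLite as a stand-in engine and a simplified `Product` schema (the ids and serial numbers here are made up):

```python
# Sketch: replacing a recursive CTE with an iterative temporary table.
# Assumes a simplified Product(id, parent_id, serial_number) schema;
# SQLite stands in here for whichever engine the question targets.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Product (id INTEGER PRIMARY KEY, parent_id INTEGER, serial_number TEXT);
    INSERT INTO Product VALUES
        (1, NULL, '123'),   -- matching root
        (2, 1,    'x1'),    -- child of 1
        (3, 2,    'x2'),    -- grandchild of 1
        (4, NULL, '999');   -- unrelated root
""")

# Seed the temp table with the roots, then expand one level per iteration
# until the row count stops growing -- the same work the recursive CTE does,
# but against an indexable table you can inspect or ANALYZE between steps.
con.execute("CREATE TEMP TABLE pcte (id INTEGER PRIMARY KEY)")
con.execute("""INSERT INTO pcte
               SELECT id FROM Product
               WHERE parent_id IS NULL AND serial_number IN ('123', '124', '125')""")
prev = -1
while True:
    con.execute("""INSERT OR IGNORE INTO pcte
                   SELECT p.id FROM Product p JOIN pcte ON p.parent_id = pcte.id""")
    n = con.execute("SELECT COUNT(*) FROM pcte").fetchone()[0]
    if n == prev:        # no new descendants found this round
        break
    prev = n

ids = [r[0] for r in con.execute("SELECT id FROM pcte ORDER BY id")]
print(ids)  # the matching root and all of its descendants
```

The loop body is the same join as the recursive member of the CTE, so the result set is identical; the difference is that each intermediate level is a real table the optimizer can use statistics and indexes on.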
Itan Reimbergh
(21 rep)
Sep 28, 2021, 01:10 PM
• Last activity: Aug 1, 2025, 02:06 AM
1
votes
1
answers
2767
views
Insert multiple rows into a table with id from other table if not exists insert to other table
I have done a similar task where I insert a row into a table if the data doesn't already exist:
```sql
WITH
user_exists AS (
    SELECT id FROM users WHERE username = '%s'
),
user_new AS (
    INSERT INTO users (username)
    SELECT w.username FROM (VALUES ('%s')) w(username)
    WHERE NOT EXISTS
        (SELECT 1 FROM users u WHERE u.username = w.username)
    RETURNING id
)
INSERT INTO feedbacks (static_row, userid)
SELECT
    'static_data',
    (SELECT id FROM user_exists UNION ALL SELECT id FROM user_new) AS userid
```
The above works well when we insert a new row into the feedbacks table: if the user doesn't exist, it inserts the user into the users table and returns the id, which is then used for inserting into feedbacks.
But now my use case is that I have to insert multiple rows into the feedbacks table. Something like this:
```sql
-- user_variable = ['a', 'b', ...]
INSERT INTO feedbacks (static_row, userid)
VALUES
('sample_data', (SELECT id FROM users WHERE username = 'a')),
('sample_data', (SELECT id FROM users WHERE username = 'b')),
('sample_data', (SELECT id FROM users WHERE username = 'c'))
```
For the above case, how can we insert a new row into the users table if username = 'b' doesn't exist?
undefined
(151 rep)
Feb 4, 2021, 11:18 AM
• Last activity: Jul 21, 2025, 09:02 PM
0
votes
1
answers
148
views
How to have CTE query recurse parent query result one by one?
I have this query used on Postgres to return place administrative levels.
```sql
WITH RECURSIVE hierarchy(name, pinyin, level, upper_place_object_id_fk) AS (
    SELECT p.name, p.pinyin, p.level, p.upper_place_object_id_fk
    FROM place p
    WHERE p.pinyin = 'Jiulong'
    UNION ALL
    SELECT up.name, up.pinyin, up.level, up.upper_place_object_id_fk
    FROM hierarchy h
    INNER JOIN place up ON h.upper_place_object_id_fk = up.object_id
    WHERE h.upper_place_object_id_fk IS NOT NULL
)
SELECT h.name, h.pinyin, h.level FROM hierarchy h;
```
It works just fine for a single match, e.g.
```
"Jiulong"   4
"Yingde"    3
"Qingyuan"  2
"Guangdong" 1
```
But of course, there can be several places with the same name (and a different position in the admin hierarchy):
```
"Fenggang"  4
"Fenggang"  4
"Huaiji"    3
"Dongguan"  2
"Zhaoqing"  2
"Guangdong" 1
"Guangdong" 1
```
In this case the result doesn't make sense, as I can't reconstruct the proper hierarchy. That's because the recursion occurs for every result of the parent query at the same time. It would work if the recursion could run to the exit condition for each result of the parent separately. Is it possible to control how the recursion is done?
Place table script:
```sql
CREATE TABLE public.place
(
    object_id uuid NOT NULL DEFAULT uuid_generate_v4(),
    upper_place_object_id_fk uuid,
    name character varying(50) COLLATE pg_catalog."default" NOT NULL,
    level smallint NOT NULL,
    CONSTRAINT pk_place PRIMARY KEY (object_id),
    CONSTRAINT unique_place UNIQUE (name, upper_place_object_id_fk, level),
    CONSTRAINT fk_place_upper_pla_place FOREIGN KEY (upper_place_object_id_fk)
        REFERENCES public.place (object_id) MATCH SIMPLE
        ON UPDATE RESTRICT
        ON DELETE RESTRICT
        NOT VALID
)
```
and here are the relevant data from the table
```
object_id                               upper_place_object_id_fk                name        level
"540ed8e9-c456-4a46-8049-1332f4ab52c1"  NULL                                    "Guangdong" 1  false
"4f7c7071-f917-4ab8-82c9-75aff0ba44d0"  "540ed8e9-c456-4a46-8049-1332f4ab52c1"  "Dongguan"  2  false
"afce8ec3-599f-4cea-9227-29bb750d4b01"  "540ed8e9-c456-4a46-8049-1332f4ab52c1"  "Zhaoqing"  2  false
"714b8e23-c346-4370-8697-25bb09497342"  "afce8ec3-599f-4cea-9227-29bb750d4b01"  "Huaiji "   3  false
"0461963d-e886-4149-bded-ee08216dd374"  "714b8e23-c346-4370-8697-25bb09497342"  "Fenggang"  4  false
"6955f89b-b7d3-4b1e-b21e-fb584d7f165b"  "712c1f45-24ac-4711-a47f-22513afc6fdd"  "Dongguan"  4  true
"d931f691-0ec7-4fc7-9197-b15461e984e0"  "4f7c7071-f917-4ab8-82c9-75aff0ba44d0"  "Fenggang"  4  true
```
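One way to keep the branches separable is to carry the starting row's `object_id` through the recursion as an extra column, so every ancestor row is tagged with the match it belongs to. A runnable sketch against a reduced version of the `place` table (SQLite stands in for Postgres, and the short text ids are made up):

```python
# Sketch: tag each recursive branch with the object_id of its starting row so
# several same-named matches can be reassembled into separate hierarchies.
# Reduced version of the question's place table, with made-up text ids.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE place (object_id TEXT PRIMARY KEY, upper_place_object_id_fk TEXT,
                        name TEXT, level INTEGER);
    INSERT INTO place VALUES
        ('gd', NULL, 'Guangdong', 1),
        ('zq', 'gd', 'Zhaoqing',  2),
        ('dg', 'gd', 'Dongguan',  2),
        ('hj', 'zq', 'Huaiji',    3),
        ('f1', 'hj', 'Fenggang',  4),   -- first Fenggang
        ('f2', 'dg', 'Fenggang',  4);   -- second Fenggang
""")

rows = con.execute("""
    WITH RECURSIVE hierarchy(start_id, object_id, name, level, up) AS (
        SELECT object_id, object_id, name, level, upper_place_object_id_fk
        FROM place WHERE name = 'Fenggang'
        UNION ALL
        SELECT h.start_id, p.object_id, p.name, p.level, p.upper_place_object_id_fk
        FROM hierarchy h JOIN place p ON h.up = p.object_id
    )
    SELECT start_id, name, level FROM hierarchy ORDER BY start_id, level
""").fetchall()
for r in rows:
    print(r)
# Each chain is now grouped under its own start_id, so the two 'Fenggang'
# hierarchies no longer mix.
```

Grouping or ordering by `start_id` then reconstructs each hierarchy independently, which is exactly what the flat result in the question cannot do.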
coolnodje
(101 rep)
Oct 10, 2019, 04:07 PM
• Last activity: Jul 19, 2025, 07:05 PM
3
votes
1
answers
170
views
Why delete and insert in a CTE works despite UNIQUE constraint?
Given a table with a unique constraint on a column
```sql
CREATE TABLE t
(
    id int GENERATED ALWAYS AS IDENTITY,
    name text NOT NULL UNIQUE
);
```
Let's create a single record
```sql
INSERT INTO t (name) VALUES ('a');
```
When I try to delete an existing record and insert a new one with the same value for the unique column in a single statement using a CTE, it fails, as I expected:
```sql
WITH
deleted_cte AS (DELETE FROM t WHERE name = 'a' RETURNING id),
inserted_cte AS (INSERT INTO t (name) VALUES ('a') RETURNING id)
SELECT 1;
-- ERROR:  duplicate key value violates unique constraint "t_name_key"
-- DETAIL: Key (name)=(a) already exists.
```
I expect DELETE and INSERT commands to run concurrently in an unspecified order and see the same snapshot of the table.
However, if I introduce a dependency between the primary query and the deleting sub-statement, it works:
```sql
WITH
deleted_cte AS (DELETE FROM t WHERE name = 'a' RETURNING id),
inserted_cte AS (INSERT INTO t (name) VALUES ('a') RETURNING id)
SELECT id FROM deleted_cte; -- only this line modified

-- 1  Delete on t
--      ->  Seq Scan on t
--            Filter: (name = 'a'::text)
--    CTE inserted_cte
--      ->  Insert on t t_1
--            ->  Result
```
I do not understand what happens here.
Are the sub-statements ordered now, so that the INSERT is forced to run after the DELETE?
Is the UNIQUE check somehow moved to the end of the whole statement?
Where in the docs can I read about this behavior, or is it unreliable?
https://dbfiddle.uk/FYrW2n5W
----------
I realize that this exact delete-insert could be replaced with an upsert.
But in the actual code, records aren't hard deleted. They are instead soft deleted using `UPDATE t SET deleted_at = now()`, and then new records are inserted. A partial unique index filters out soft-deleted records: `CREATE UNIQUE INDEX ON t (name) WHERE (deleted_at IS NULL)`. So an upsert wouldn't work here.
metalon
(31 rep)
Nov 11, 2024, 03:43 PM
• Last activity: Jul 19, 2025, 08:11 AM
0
votes
1
answers
181
views
PostgreSQL CTE Recursion - How to Improve Performance for Parent->Child Tree (8 levels deep, 890 roots)
I am working on a query that performs PostgreSQL CTE Recursion in PostgreSQL 13. This query is intended to recurse from the root nodes to the deepest leaf node with certain properties (labeled 'begat' in the resource_dependency table, with a matching "sample type" property from another table).
https://explain.dalibo.com/plan/g2d2f5fg9c563b0d#plan
https://explain.depesz.com/s/eaRu
It seems that both the CTE table construction and, even more so, the scanning of the CTE to apply the constraints are the most time-consuming parts of the query. Most work appears to be done in memory with the given settings, yet we still seem to scan 780 GB of cache (cache hits), which looks like duplicated work.
The actual number of nodes in any given tree may be 1000-5000. In this operation, we are starting from 890 root nodes. There are both 1:many and many:1 relationships in this tree (splitting, recombining)
Doing some experiments in a blank environment, I notice that:
- To get all parent->child resource_ids in a single column, we need to do a left join so that all leaf nodes will have a `null` child row with their calculated depth.
- We track the path to ensure we don't get into an infinite cycle. While the path is important for avoiding cycles, it is not relevant for the final calculation, and some paths duplicate information for the same root -> descendant pair. This makes the CTE table much larger.
Are there any techniques that can be used here to make this recursion less painful? For each starting root, we need to find only the deepest leaf node that has the 'begat' relationship and the matching 'sample type' property.
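One technique worth measuring is to carry the root id and depth through the recursion, then keep only the deepest row per root with a window function in a follow-on CTE, instead of re-scanning the recursive result. A runnable sketch under a simplified, assumed `resource_dependency(parent_id, child_id)` shape (SQLite standing in for PostgreSQL 13; the 'begat'/'sample type' filters would be added to the joins):

```python
# Sketch: recursive walk carrying (root, node, depth), then the deepest node
# per root selected via MAX(depth) OVER (PARTITION BY root).
# Assumes a simplified resource_dependency(parent_id, child_id) edge table.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE resource_dependency (parent_id INTEGER, child_id INTEGER);
    INSERT INTO resource_dependency VALUES (1, 2), (2, 3), (1, 4), (10, 11);
""")

rows = con.execute("""
    WITH RECURSIVE walk(root, node, depth) AS (
        -- roots: nodes that never appear as a child
        SELECT DISTINCT parent_id, parent_id, 0 FROM resource_dependency
        WHERE parent_id NOT IN (SELECT child_id FROM resource_dependency)
        UNION ALL
        SELECT w.root, d.child_id, w.depth + 1
        FROM walk w JOIN resource_dependency d ON d.parent_id = w.node
    ),
    ranked AS (
        SELECT root, node, depth,
               MAX(depth) OVER (PARTITION BY root) AS max_depth
        FROM walk
    )
    SELECT DISTINCT root, node, depth FROM ranked WHERE depth = max_depth
    ORDER BY root
""").fetchall()
print(rows)  # one (root, deepest node, depth) row per tree
```

This keeps the CTE output narrow (no path column), though for graphs that can genuinely cycle a cycle guard is still needed; in PostgreSQL 14+ the built-in `CYCLE` clause could replace the hand-rolled path.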
Justin Lowen
(68 rep)
Oct 17, 2024, 05:14 PM
• Last activity: Jul 16, 2025, 06:02 PM
0
votes
1
answers
151
views
calculate or get price of parent items on bill of materials
I want to calculate or get the price of parent items in a bill-of-materials query.
Here is the fiddle: https://www.db-fiddle.com/f/o3jLgZxKNLG14mna8QGdVN/6
Please note that in the second row, betax has qty = 2, and the other parent items could behave the same. The qty of the first row will always be 1.
**What I need is to get the 'price' and 'subtotal' for every parent.**
nkne
(9 rep)
Jun 17, 2020, 05:02 AM
• Last activity: Jul 13, 2025, 12:02 PM
1
votes
1
answers
187
views
Query To Show YTD Total Summing Each Month Prior To Current
I need a query to give a YTD total that should sum each month before it. For example, January YTD would equal January Total, February YTD would equal January Total + February Total, March YTD would equal January Total + February Total + March Total. My issue is that my query is showing the SUM() of the Total for all months.
This is DDL that illustrates my issue:
```sql
Create Table empSales
(
    spendType varchar(100)
    ,spendAmt decimal(10,2)
    ,spendMonth varchar(100)
);

Insert Into empSales
    (spendType
    , spendAmt
    , spendMonth
    ) VALUES
('James', '1.00', 'January'),
('Richard', '3.28', 'January'),
('Barb', '4.13', 'January'),
('James', '3.00', 'February'),
('Richard', '3.28', 'February'),
('Barb', '4.13', 'February'),
('James', '2.00', 'March'),
('Richard', '5.28', 'March'),
('Barb', '7.13', 'March');
```
And this is the query I tried that only gives the overall total instead of just the YTD total I'm after
```sql
with MonthAgg as (
    select IFNULL(spendType,'Total') spendType
    , Sum(Case when spendMonth='January' then spendAmt else null end ) January
    , Sum(Case when spendMonth='February' then spendAmt else null end ) February
    , Sum(Case when spendMonth='March' then spendAmt else null end ) March
    , Sum(Case when spendMonth='April' then spendAmt else null end ) April
    , Sum(Case when spendMonth='May' then spendAmt else null end ) May
    , Sum(Case when spendMonth='June' then spendAmt else null end ) June
    , Sum(Case when spendMonth='July' then spendAmt else null end ) July
    , Sum(Case when spendMonth='August' then spendAmt else null end ) August
    , Sum(Case when spendMonth='September' then spendAmt else null end ) September
    , Sum(Case when spendMonth='October' then spendAmt else null end ) October
    , Sum(Case when spendMonth='November' then spendAmt else null end ) November
    , Sum(Case when spendMonth='December' then spendAmt else null end ) December
    FROM empSales
    GROUP BY spendType WITH ROLLUP)
, mycteSum as (
    select spendMonth , Sum(spendAmt) Amt
    FROM empSales
    GROUP BY spendMonth
)
, mycteYTD as (
    select spendMonth , sum(Amt) Over(order by Cast(spendMonth+ ' 01, 1900' as date) ) YTD
    from mycteSum)
Select * from MonthAgg
UNION ALL
Select 'YTD' as summarySpend,
    max(Case when spendMonth='January' then YTD else null end ) JanuaryYTD
    , max(Case when spendMonth='February' then YTD else null end ) FebruaryYTD
    , max(Case when spendMonth='March' then YTD else null end ) MarchYTD
    , max(Case when spendMonth='April' then YTD else null end ) AprilYTD
    , max(Case when spendMonth='May' then YTD else null end ) MayYTD
    , max(Case when spendMonth='June' then YTD else null end ) JuneYTD
    , max(Case when spendMonth='July' then YTD else null end ) JulyYTD
    , max(Case when spendMonth='August' then YTD else null end ) AugustYTD
    , max(Case when spendMonth='September' then YTD else null end ) SeptemberYTD
    , max(Case when spendMonth='October' then YTD else null end ) OctoberYTD
    , max(Case when spendMonth='November' then YTD else null end ) NovemberYTD
    , max(Case when spendMonth='December' then YTD else null end ) DecemberYTD
from mycteYTD
```
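The YTD column being asked for is a running total, which a window frame expresses directly. A runnable sketch in SQLite (standing in for the question's engine); a `monthNum` column is added here purely to give the window a sortable key, since it is not in the question's DDL:

```python
# Sketch: YTD as a running total -- SUM() OVER (ORDER BY ...) accumulates
# each month's total with every month before it. The monthNum column is an
# assumed addition to the question's DDL, used only as a sort key.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE empSales (spendType TEXT, spendAmt REAL,
                           spendMonth TEXT, monthNum INTEGER);
    INSERT INTO empSales VALUES
        ('James', 1.00, 'January', 1),  ('Richard', 3.28, 'January', 1),
        ('Barb',  4.13, 'January', 1),  ('James',   3.00, 'February', 2),
        ('Richard', 3.28, 'February', 2), ('Barb',  4.13, 'February', 2),
        ('James', 2.00, 'March', 3),    ('Richard', 5.28, 'March', 3),
        ('Barb',  7.13, 'March', 3);
""")

rows = con.execute("""
    WITH monthly AS (
        SELECT spendMonth, monthNum, SUM(spendAmt) AS monthTotal
        FROM empSales GROUP BY spendMonth, monthNum
    )
    SELECT spendMonth,
           ROUND(SUM(monthTotal) OVER (ORDER BY monthNum), 2) AS ytd
    FROM monthly ORDER BY monthNum
""").fetchall()
print(rows)  # January's total, then January+February, then +March
```

Each YTD value is the sum of all monthly totals up to and including that month, which is exactly the January, January+February, ... pattern described above.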
jamesMandatory
(69 rep)
Jul 19, 2020, 07:12 PM
• Last activity: Jun 28, 2025, 12:09 PM
0
votes
1
answers
201
views
List Top Daily Rentals Per Customer using a CTE
I've been trying to learn CTEs and having a bit of trouble getting the hang of them. I wrote a query against the Sakila sample database that lists information about horror movie rentals for each day.
Here is the overblown (and redundant) SQL that I came up with:
```sql
SELECT CONCAT(CU.last_name, ', ', CU.first_name) AS customer,
       A.phone,
       F.title,
       date(R.rental_date) AS rental_date
FROM sakila.rental R
LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id
LEFT JOIN sakila.film F ON I.film_id = F.film_id
LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id
LEFT JOIN sakila.category C ON FC.category_id = C.category_id
LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
LEFT JOIN sakila.address A ON CU.address_id = A.address_id
WHERE CU.customer_id in
    (SELECT CU.customer_id
     FROM rental R
     LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
     LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id
     LEFT JOIN sakila.film F ON I.film_id = F.film_id
     LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id
     LEFT JOIN sakila.category C ON FC.category_id = C.category_id
     WHERE C.name = "Horror"
     GROUP BY CU.customer_id
     HAVING COUNT(CU.customer_id) >= 3)
AND C.name = "Horror"
ORDER BY customer, title, rental_date DESC;
```
And here are some of the results in my database client (Navicat for MySQL):
**Is there a way to rewrite the query using a CTE?**
It seems like just the type of query that a CTE would be perfect for, if only I could figure it out!
Thanks!
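The usual CTE shape for a query like this is to compute the horror-rental rows once, then reference that CTE both for the "at least 3" filter and for the final listing, instead of repeating the join chain. A runnable sketch against a miniature stand-in schema (the real Sakila joins would live inside the first CTE; the table, columns, and sample values here are made up):

```python
# Sketch: factor the repeated join/filter into one CTE and reference it twice.
# A miniature stand-in table replaces the real Sakila join chain.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE rentals (customer TEXT, title TEXT, category TEXT);
    INSERT INTO rentals VALUES
        ('Gravelle', 'Film A', 'Horror'), ('Gravelle', 'Film B', 'Horror'),
        ('Gravelle', 'Film C', 'Horror'), ('Smith',    'Film A', 'Horror'),
        ('Smith',    'Film D', 'Comedy');
""")

rows = con.execute("""
    WITH horror AS (               -- all the joins/filters live here, once
        SELECT customer, title FROM rentals WHERE category = 'Horror'
    ),
    frequent AS (                  -- customers with at least 3 horror rentals
        SELECT customer FROM horror GROUP BY customer HAVING COUNT(*) >= 3
    )
    SELECT h.customer, h.title
    FROM horror h JOIN frequent f ON f.customer = h.customer
    ORDER BY h.title
""").fetchall()
print(rows)  # only the customer who rented 3+ horror titles
```

The subquery in the question's `WHERE ... IN (...)` becomes the `frequent` CTE, and the duplicated join chain collapses into the single `horror` CTE that both parts share.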

Rob Gravelle
(123 rep)
May 12, 2020, 10:53 PM
• Last activity: Jun 23, 2025, 06:03 PM
0
votes
1
answers
266
views
Query Locks Table And Takes Long Time to Run
I have a query that can take a long time to run. I tried running the same query on a test server and it runs very quickly.
However, in production it seems to hang and causes blocking of other queries.
Where do I begin to find a solution for this? Below is my code and the plan...
https://www.brentozar.com/pastetheplan/?id=ryAQWfL2X
The issue seems to be with the CTE statement. Is there a more efficient way of writing this?
```sql
-- table to store break details
CREATE TABLE #PositionReconciliationCurrentBreaks
(
    TransactionKey INT,
    BreakOriginationDate DATE,
    BreakAge DECIMAL (18,0)
);
TRUNCATE TABLE #PositionReconciliationCurrentBreaks;

WITH PreviousRecBreaks_CTE -- get breaks from previous rec
AS
(
SELECT
    MAX(i.InvestmentID) AS InvestmentID,
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    MAX(PRT.PositionReconciliationTransactionKey) AS PositionReconciliationTransactionKey,
    PRT.InvestmentKey,
    MAX(PRT.GenevaInvestmentKey) AS GenevaInvestmentKey,
    MAX(PRT.BrokerInvestmentKey) AS BrokerInvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey,
    CASE WHEN MAX(PRT.BreakOriginationDate) = '2100-01-01' THEN NULL ELSE MAX(PRT.BreakOriginationDate) END AS BreakOriginationDate,
    MAX(PRT.BreakAge) AS BreakAge,
    SUM(ISNULL(PRT.GenevaQuantity,0)) AS GenevaQuantity,
    SUM(ISNULL(PRT.BrokerQuantity,0)) AS BrokerQuantity
FROM [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationTransaction PRT -- get transactions from previous rec
INNER JOIN
    [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationDetail TRD -- check details to get Action on transaction
    ON TRD.PositionReconciliationDetailKey = PRT.PositionReconciliationDetailKey -- join on Detail Key
INNER JOIN [Fund.Accounting.Model.TradeReconciliation].Investment i -- add in InvestmentID for debugging purposes
    ON i.InvestmentKey = PRT.InvestmentKey
INNER JOIN
    [Fund.Accounting.Model].[ReconciliationActions] RA -- actions
    ON RA.ReconciliationActionID = TRD.ReconciliationActionID
    AND (TRD.PositionCommentKey IS NOT NULL) -- approved breaks must have a comment.
    AND RA.Name <> 'Suppress' -- ensure not suppressed
    AND RA.Name <> 'Auto Match' -- ensure not perfect match - we do want to include cross references with a qty diff
WHERE PRT.PositionReconciliationID = @PreviousPositionReconciliationId -- previous rec
GROUP BY -- added grouping for quantity breaks as x-refs will be 2 separate lines in PositionReconciliationTransaction tbl; We group to force this as 1 line
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    PRT.InvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey
), CurrentRecBreaks_CTE -- get breaks from current rec
AS
(
SELECT
    MAX(i.InvestmentID) AS InvestmentID,
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    MAX(PRT.PositionReconciliationTransactionKey) AS PositionReconciliationTransactionKey,
    PRT.InvestmentKey,
    MAX(PRT.GenevaInvestmentKey) AS GenevaInvestmentKey,
    MAX(PRT.BrokerInvestmentKey) AS BrokerInvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey,
    CASE WHEN MAX(PRT.BreakOriginationDate) = '2100-01-01' THEN NULL ELSE MAX(PRT.BreakOriginationDate) END AS BreakOriginationDate,
    MAX(PRT.BreakAge) AS BreakAge,
    SUM(ISNULL(PRT.GenevaQuantity,0)) AS GenevaQuantity,
    SUM(ISNULL(PRT.BrokerQuantity,0)) AS BrokerQuantity
FROM [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationTransaction PRT -- get transactions from current rec
INNER JOIN
    [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationDetail TRD -- check details to get Action on transaction
    ON TRD.PositionReconciliationDetailKey = PRT.PositionReconciliationDetailKey -- join on Detail Key
INNER JOIN [Fund.Accounting.Model.TradeReconciliation].Investment i -- add in InvestmentID for debugging purposes
    ON i.InvestmentKey = PRT.InvestmentKey
LEFT JOIN
    [Fund.Accounting.Model].[ReconciliationActions] RA -- actions
    ON RA.ReconciliationActionID = TRD.ReconciliationActionID
    AND RA.Name <> 'Suppress' -- ensure not suppressed
    AND RA.Name <> 'Auto Match' -- ensure not perfect match - we do want to include cross references with a qty diff
WHERE PRT.PositionReconciliationID = @PositionReconciliationID -- current rec
GROUP BY -- added grouping for quantity breaks as x-refs will be 2 separate lines in PositionReconciliationTransaction tbl; We group to force this as 1 line
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    PRT.InvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey
)
INSERT INTO #PositionReconciliationCurrentBreaks
(
    TransactionKey,
    BreakOriginationDate,
    BreakAge
)
SELECT -- get breaks in current rec that also appeared in previous rec
    C.PositionReconciliationTransactionKey
    ,ISNULL(P.BreakOriginationDate,@PreviousPositionReconciliationPeriod) -- set to previous date if it is first time presence.
    ,CASE DATEDIFF(DAY, P.PositionDate, C.PositionDate)
        WHEN 0 THEN 1 -- set to 1 day old, if it is 1st time
        ELSE ISNULL(P.BreakAge,0) + DATEDIFF(DAY, P.PositionDate, C.PositionDate) -- get no. of days difference between current rec and previous one
    END AS BreakAge
FROM PreviousRecBreaks_CTE P
INNER JOIN
    CurrentRecBreaks_CTE C
    ON P.AccountKey = C.AccountKey
    AND P.PaymentCurrencyKey = C.PaymentCurrencyKey
    AND P.InvestmentKey = C.InvestmentKey
    --AND (P.GenevaInvestmentKey = C.GenevaInvestmentKey
    --     OR P.BrokerInvestmentKey = C.BrokerInvestmentKey)
    AND P.GenevaQuantity = C.GenevaQuantity
    AND P.BrokerQuantity = C.BrokerQuantity
WHERE NOT EXISTS(
    -- exclude breaks that are now perfect matches (or suppressed???)
    SELECT
        D.PositionReconciliationDetailKey
    FROM
        [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationDetail D
    INNER JOIN -- join on AggregatePositionReconciliationTransaction so we can get the Qty Diff value
        [Fund.Accounting.Facade.PositionReconciliation].[AggregatePositionReconciliationTransaction] (@PositionReconciliationID,NULL,NULL) A
        ON A.PositionReconciliationDetailKey = D.PositionReconciliationDetailKey
    WHERE D.PositionReconciliationID = @PositionReconciliationID -- current rec
    AND A.QuantityDifference = 0 -- perfect matches
    AND D.PositionReconciliationDetailKey = C.PositionReconciliationDetailKey -- join details to transactions
);
```
Kevin
(533 rep)
Oct 30, 2018, 05:19 PM
• Last activity: May 18, 2025, 06:03 AM
0
votes
0
answers
31
views
is it possible to make the CTE parallel in PostgreSQL 14
I have an analytics query in PostgreSQL 14 that uses many CTEs, like this:
```sql
with
a0 as (select count(*) from ccc0) ,
a1 as (select count(*) from ccc1) ,
a2 as (select count(*) from ccc2) ,
a3 as (select count(*) from ccc3) ,
a4 as (select count(*) from ccc4) ,
a5 as (select count(*) from ccc5) ,
a6 as (select count(*) from ccc6) ,
a7 as (select count(*) from ccc7) ,
a8 as (select count(*) from ccc8) ,
a9 as (select count(*) from ccc9) ,
a10 as (select count(*) from ccc10) ,
a11 as (select count(*) from ccc11) ,
a12 as (select count(*) from ccc12) ,
a13 as (select count(*) from ccc13) ,
a14 as (select count(*) from ccc14) ,
a15 as (select count(*) from ccc15) ,
a16 as (select count(*) from ccc16) ,
a17 as (select count(*) from ccc17) ,
a18 as (select count(*) from ccc18) ,
a19 as (select count(*) from ccc19) ,
a20 as (select count(*) from ccc20) ,
a21 as (select count(*) from ccc21) ,
a22 as (select count(*) from ccc22) ,
a23 as (select count(*) from ccc23)
select * from a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23;
```
I read some docs saying that setting `max_parallel_workers_per_gather` can make CTEs parallel (https://github.com/digoal/blog/blob/master/201903/20190317_17.md), and the https://www.postgresql.org/docs/current/parallel-safety.html docs say:
> The following operations are always parallel restricted:
>
> Scans of common table expressions (CTEs).
>
> Scans of temporary tables.
>
> Scans of foreign tables, unless the foreign data wrapper has an
> IsForeignScanParallelSafe API that indicates otherwise.
>
> Plan nodes that reference a correlated SubPlan.
Is it possible to make the CTEs parallel? Each CTE now takes more than 7 s, which makes the query very slow. I have tried adding indexes, but each CTE returns more than 100 million rows (the data cannot be filtered any further). If parallelism works, the result can be fetched in about 7 s; if not, it takes minutes.
Dolphin
(939 rep)
May 10, 2025, 04:06 AM
0
votes
1
answers
630
views
Difference between a query based on CTE and a simple query
I'm doing a select starting from a table named `events` that is joined with device measurements using a date-range condition (this time window is a sort of "mobile windowing"), in order to get the measurements that fall between the event log insert time and one hour before it.
After that, because the measurement is a JSON structure, I use a cross join to unpack the `ap_info` object and then get the `ap_name`, `ap_ip` and `neigh_name` of the latest measurement present in the db right before the event occurs.
My two attempts:
**First approach** *(very slow, and also unusable when there is no match on the `CONCAT` filter expression)*:
```sql
Select
    p.neigh_name
from tv_smartdevicemeasurement_snmp
cross join jsonb_to_recordset(tv_smartdevicemeasurement_snmp.data->'ap_info') as p(ap_name text, ap_ip text, neigh_name text)
inner join tv_event on tv_event.name = p.ap_name
where smart_device_id = 3
  and (tv_smartdevicemeasurement_snmp.insert_time < tv_event.insert_time
       and tv_smartdevicemeasurement_snmp.insert_time > tv_event.insert_time - '1 hour'::interval)
  and is_event_ack = false
  and (CONCAT('Host: ', p.ap_name, ' - IP: ', p.ap_ip) = 'Host: AP-04 - IP: 10.50.2.130')
order by tv_smartdevicemeasurement_snmp.insert_time desc
limit 1
```
Explain: HERE
**CTE method** *(runs without any evident issue)*:
```sql
with cte_temp as (select * from (
    select tv_event.insert_time, tv_smartdevicemeasurement_snmp.data from tv_event
    inner join tv_smartdevicemeasurement_snmp on
        (tv_smartdevicemeasurement_snmp.insert_time < tv_event.insert_time
         and tv_smartdevicemeasurement_snmp.insert_time > tv_event.insert_time - '1 hour'::interval)
    where is_event_ack = false and tv_smartdevicemeasurement_snmp.smart_device_id = 3) as join_non_ack_evt_meas
)
select insert_time, ap_name, ap_ip, neigh_name from cte_temp
cross join jsonb_to_recordset(cte_temp.data->'ap_info') as p(ap_name text, ap_ip text, neigh_name text)
where (CONCAT('Host: ', ap_name, ' - IP: ', ap_ip) = 'Host: AP-04 - IP: 10.50.2.130')
order by insert_time desc
limit 1
```
Explain: HERE
The question is: why is there so much difference between these two queries? I mean, they work on the same tables! Please explain how they differ: the only thing I see is that the first approach works on a larger row subset.
And, in addition, do you have any suggestions to optimize my second approach?
Postgres Version: "PostgreSQL 11.13 on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.3.1_git20210424) 10.3.1 20210424, 64-bit"
VirtApp
(191 rep)
Dec 16, 2021, 09:53 AM
• Last activity: Apr 24, 2025, 03:02 PM
1
votes
0
answers
56
views
PostgreSQL 16: WITH/CTE versus Non-Deferrable Constraints
I am using PostgreSQL 16 upwards.
I am currently trying to create an example for a relationship of type `G-|o-----|<-H`, i.e. each G is referenced by >= 1 H.
```sql
    x CHAR(3), -- example for other attributes
    CONSTRAINT g_h_fk FOREIGN KEY (h, id) REFERENCES h (id, g)
);
-- To table H, we add the foreign key reference constraint towards g.
ALTER TABLE h ADD CONSTRAINT h_g_fk FOREIGN KEY (g) REFERENCES g (id);
```
Then I can insert data into the tables and read them back out as follows:
```sql
/* Insert into tables for G-|o-----|<-H relationship. */
-- Insert some rows into the table for entity type H.
-- Not specifying g leaves the references to G as NULL for now.
INSERT INTO h (y) VALUES ('AB'), ('CD'), ('EF'), ('GH'), ('IJ');
-- Insert into G and relate to H. We do this three times.
WITH g_id AS (INSERT INTO g (h, x) VALUES (1, '123') RETURNING id)
UPDATE h SET g = g_id.id FROM g_id WHERE h.id = 1;
WITH g_id AS (INSERT INTO g (h, x) VALUES (3, '456') RETURNING id)
UPDATE h SET g = g_id.id FROM g_id WHERE h.id = 3;
WITH g_id AS (INSERT INTO g (h, x) VALUES (4, '789') RETURNING id)
UPDATE h SET g = g_id.id FROM g_id WHERE h.id = 4;
-- Link one H row to another G row. (We do this twice.)
UPDATE h SET g = 3 WHERE id = 2;
UPDATE h SET g = 3 WHERE id = 5;
-- Combine the rows from G and H.
SELECT g.id AS g_id, g.x, h.id AS h_id, h.y FROM h
INNER JOIN g ON g.id = h.g;
```
This still requires the use of Common Table Expressions.
However, by now, I am fairly confident that this is OK.
Still, I am not 100% sure.
I think both approaches do work and I could not find an error with either of them.
But the two-table method is probably more efficient and more elegant.
@Akina was right.
Thomas Weise
(111 rep)
Apr 20, 2025, 09:00 AM
• Last activity: Apr 22, 2025, 05:34 AM
0
votes
1
answers
762
views
Can EXPLAIN be used to get some insights about Common Table Expression (CTE)?
I need to find the ancestors of a row in a table given an ID value, and a Google search pointed me to recursive Common Table Expressions (CTEs). I wonder whether EXPLAIN can be used to get some insight into how MySQL handles this and whether there is room for optimization.
Also, is this the best strategy to get ancestors in a table? I could do it in application code, but I don't want to, because that would lead to loops and probably multiple SELECT statements sent to the DB.
The SQL I wrote following the previous article is as follows:
```sql
WITH RECURSIVE page_revisions_path (id, page_id, parent_id) AS
(
    SELECT id, page_id, parent_id
    FROM page_revisions
    WHERE parent_id = 'some_id'
    UNION ALL
    SELECT c.id, c.page_id, c.parent_id
    FROM page_revisions_path AS cp JOIN page_revisions AS c ON cp.id = c.parent_id
)
SELECT * FROM page_revisions_path;
```
Here is also the `SHOW CREATE TABLE` query result:
```sql
CREATE TABLE `page_revisions` (
  `id` varchar(26) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `page_id` varchar(26) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `parent_id` varchar(26) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `routable` tinyint(1) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX1` (`page_id`),
  KEY `IDX2` (`parent_id`),
  CONSTRAINT `FK1` FOREIGN KEY (`parent_id`) REFERENCES `page_revisions` (`id`),
  CONSTRAINT `FK2` FOREIGN KEY (`page_id`) REFERENCES `pages` (`id`)
) ENGINE=InnoDB
```
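As a side note, the query above walks downward (it finds descendants of 'some_id'); for ancestors, the anchor should select the row itself and the recursive step should join on the parent going upward. A runnable sketch of that upward shape (SQLite standing in for MySQL; the revision ids are made up):

```python
# Sketch: an ancestor walk -- the anchor matches the starting row, and each
# recursive step follows parent_id upward (p.id = a.parent_id).
# SQLite stands in for MySQL here; r1..r3 are made-up ids.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE page_revisions (id TEXT PRIMARY KEY, page_id TEXT, parent_id TEXT);
    INSERT INTO page_revisions VALUES
        ('r1', 'p1', NULL),   -- root revision
        ('r2', 'p1', 'r1'),
        ('r3', 'p1', 'r2');   -- starting row
""")

rows = con.execute("""
    WITH RECURSIVE ancestors(id, parent_id) AS (
        SELECT id, parent_id FROM page_revisions WHERE id = 'r3'
        UNION ALL
        SELECT p.id, p.parent_id
        FROM ancestors a JOIN page_revisions p ON p.id = a.parent_id
    )
    SELECT id FROM ancestors
""").fetchall()
ids = [r[0] for r in rows]
print(ids)  # the starting row followed by its chain of ancestors
```

With the existing `IDX2 (parent_id)` and primary-key indexes, each recursive step is an index lookup, so EXPLAIN should show no full scans for either direction of walk.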
ReynierPM
(1888 rep)
Feb 7, 2022, 10:13 PM
• Last activity: Apr 18, 2025, 05:07 PM
0
votes
1
answers
632
views
How to make use of WITH in PostgreSQL to reduce query cost
I am trying to get twitter data of narendramodi using below command. SELECT b.t_id,a.profile_image,b.tweet_text,e.media_media_url,b.retweet_count,b.favorite_count as like_count,count(reply_to_status_id) as reply_count,f.imp_count,f.eng_count,f.eng_rate FROM twitter_users a LEFT JOIN twitter_tweets b...
I am trying to get twitter data of narendramodi using below command.
SELECT b.t_id,a.profile_image,b.tweet_text,e.media_media_url,b.retweet_count,b.favorite_count as like_count,count(reply_to_status_id) as reply_count,f.imp_count,f.eng_count,f.eng_rate
FROM twitter_users a
LEFT JOIN twitter_tweets b on a.user_id=b.user_id
LEFT JOIN replies c on b.t_id = c.t_id
LEFT JOIN media e on b.t_id = e.t_id
LEFT JOIN twitter_tweet_metric_aggregates f
ON f.metric_timestamp=(select max(metric_timestamp)
FROM twitter_tweet_metric_aggregates g
WHERE g.t_id=f.t_id and g.t_id=b.t_id)
WHERE a.twitter_screen_name= 'narendramodi'
GROUP BY b.t_id,a.profile_image
,b.tweet_text,b.retweet_count,b.favorite_count,
e.media_media_url,f.imp_count,f.eng_count,f.eng_rate;
The query returns correct results, but the sub-select used to get the most recent imp_count for each tweet (based on timestamp) makes the query cost huge, so execution takes more than 15 minutes. I want to reduce that so it executes within 10 seconds. For that reason I tried a WITH (CTE) expression:
WITH metric_counts AS (
SELECT max(metric_timestamp),f.t_id,f.imp_count,f.eng_count,f.eng_rate
FROM twitter_tweet_metric_aggregates f LEFT JOIN tweets b on
f.t_id=b.t_id
)
SELECT
b.t_id,a.profile_image,b.tweet_text,e.media_media_url,b.retweet_count
,b.favorite_count as like_count, count(reply_to_status_id) as
reply_count,metric_counts.imp_count
,metric_counts.eng_count,metric_counts.eng_rate
FROM twitter_users as a
LEFT JOIN tweets as b on a.twitter_user_id=b.twitter_user_id
LEFT JOIN replies c on b.t_id = c.t_id
LEFT JOIN media e on b.t_id = e.t_id
LEFT JOIN metric_counts on metric_counts.t_id = b.t_id
WHERE lower(a.twitter_screen_name)=lower('narendramodi')
GROUP BY b.t_id,a.profile_image,b.tweet_text,e.media_media_url,
b.retweet_count,b.favorite_count,
metric_counts.imp_count,metric_counts.eng_count,
metric_counts.eng_rate;
The WITH expression above also returns imp_count values for each tweet, but not the latest record/value. Can anyone help me achieve this?
Here is the query cost of the WITH query:
HashAggregate (cost=1734856.13..1735618.48 rows=76235 width=673)
So can anyone help me reduce the cost even further, so results come back within 15 seconds?
Query cost:
HashAggregate (cost=4923196.15..4923958.50 rows=76235 width=673) (actual time=51871.524..51872.333 rows=1513 loops=1)
  Group Key: imp_counts.tweet_status_id, a.profile_image, b.tweet_text, b.tweet_created_at, d.sentiment, d.emotion, b.retweet_count, b.favorite_count, e.media_media_url, imp_counts.tweet_impression_count, imp_counts.tweet_engagement_count, imp_counts.tweet_engagement_rate
  CTE imp_counts
    ->  Seq Scan on twitter_tweet_metric_aggregates f  (cost=0.00..3356389.12 rows=17676 width=47) (actual time=37516.805..41642.354 rows=60699 loops=1)
          Filter: (metric_timestamp = (SubPlan 2))
          Rows Removed by Filter: 3475596
          SubPlan 2
            ->  Result  (cost=0.90..0.91 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=3536295)
                  InitPlan 1 (returns $1)
                    ->  Limit  (cost=0.56..0.90 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=3536295)
                          ->  Index Only Scan Backward using pk_twitter_tweet_metric_aggregates on twitter_tweet_metric_aggregates g  (cost=0.56..55.57 rows=158 width=8) (actual time=0.010..0.010 rows=1 loops=3536295)
                                Index Cond: ((tweet_status_id = (f.tweet_status_id)::text) AND (metric_timestamp IS NOT NULL))
                                Heap Fetches: 3536295
  ->  Nested Loop Left Join  (cost=1202381.45..1564329.39 rows=76235 width=673) (actual time=50478.887..51854.010 rows=10672 loops=1)
        ->  Nested Loop Left Join  (cost=1202380.90..1472231.15 rows=76235 width=641) (actual time=50478.871..51781.010 rows=10649 loops=1)
              ->  Hash Right Join  (cost=1202380.34..1362479.87 rows=76235 width=626) (actual time=50478.841..51702.556 rows=10649 loops=1)
                    Hash Cond: ((c.tweet_status_id)::text = (b.tweet_status_id)::text)
                    ->  Seq Scan on twitter_tweet_replies c  (cost=0.00..150216.68 rows=2606068 width=38) (actual time=0.041..692.358 rows=2606068 loops=1)
                    ->  Hash  (cost=1201427.40..1201427.40 rows=76235 width=607) (actual time=50477.837..50477.837 rows=1499 loops=1)
                          Buckets: 131072  Batches: 1  Memory Usage: 1497kB
                          ->  Merge Left Join  (cost=1200957.10..1201427.40 rows=76235 width=607) (actual time=50446.646..50476.746 rows=1499 loops=1)
                                Merge Cond: ((b.tweet_status_id)::text = (imp_counts.tweet_status_id)::text)
                                ->  Sort  (cost=1199356.58..1199547.17 rows=76235 width=293) (actual time=8608.183..8608.530 rows=1499 loops=1)
                                      Sort Key: b.tweet_status_id
                                      Sort Method: quicksort  Memory: 597kB
                                      ->  Hash Right Join  (cost=29591.36..1193174.62 rows=76235 width=293) (actual time=3.714..8604.233 rows=1499 loops=1)
                                            Hash Cond: ((b.twitter_user_id)::text = (a.twitter_user_id)::text)
                                            ->  Seq Scan on twitter_tweets b  (cost=0.00..1095151.30 rows=18045230 width=230) (actual time=0.026..5037.377 rows=18044981 loops=1)
                                            ->  Hash  (cost=29214.36..29214.36 rows=30160 width=89) (actual time=0.025..0.025 rows=1 loops=1)
                                                  Buckets: 32768  Batches: 1  Memory Usage: 257kB
                                                  ->  Index Scan using twitter_screen_name_idx on twitter_users a  (cost=0.56..29214.36 rows=30160 width=89) (actual time=0.020..0.021 rows=1 loops=1)
                                                        Index Cond: (lower((twitter_screen_name)::text) = 'narendramodi'::text)
                                ->  Sort  (cost=1600.52..1644.71 rows=17676 width=314) (actual time=41838.456..41847.253 rows=60655 loops=1)
                                      Sort Key: imp_counts.tweet_status_id
                                      Sort Method: quicksort  Memory: 8540kB
                                      ->  CTE Scan on imp_counts  (cost=0.00..353.52 rows=17676 width=314) (actual time=37516.810..41684.563 rows=60699 loops=1)
              ->  Index Scan using tweet_sentiment_index on tweet_sentiment d  (cost=0.56..1.43 rows=1 width=34) (actual time=0.006..0.007 rows=0 loops=10649)
                    Index Cond: ((b.tweet_status_id)::text = (tweet_status_id)::text)
        ->  Index Only Scan using twitter_tweet_media_pkey on twitter_tweet_media e  (cost=0.55..1.20 rows=1 width=70) (actual time=0.006..0.006 rows=0 loops=10649)
              Index Cond: (tweet_status_id = (b.tweet_status_id)::text)
              Heap Fetches: 1074
Planning time: 2.913 ms
Execution time: 51875.165 ms
(43 rows)
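One way to express "latest metric row per tweet" without the correlated sub-select (the Seq Scan + SubPlan that dominates the plan above) is a window function; in Postgres, `SELECT DISTINCT ON (t_id) ... ORDER BY t_id, metric_timestamp DESC` is an equivalent idiom. A minimal sketch of the window-function shape, run through sqlite3 in Python for self-containment; the sample rows are invented, column names follow the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE twitter_tweet_metric_aggregates (
    t_id             TEXT,
    metric_timestamp INTEGER,
    imp_count        INTEGER
);
INSERT INTO twitter_tweet_metric_aggregates VALUES
    ('t1', 1, 10), ('t1', 2, 25),   -- latest row for t1 has imp_count 25
    ('t2', 1, 7),  ('t2', 3, 40);   -- latest row for t2 has imp_count 40
""")

# ROW_NUMBER() = 1 within each t_id, ordered newest-first, keeps exactly the
# latest metric row per tweet -- no correlated subquery per outer row.
rows = conn.execute("""
WITH metric_counts AS (
    SELECT t_id, metric_timestamp, imp_count,
           ROW_NUMBER() OVER (PARTITION BY t_id
                              ORDER BY metric_timestamp DESC) AS rn
    FROM twitter_tweet_metric_aggregates
)
SELECT t_id, imp_count FROM metric_counts WHERE rn = 1 ORDER BY t_id
""").fetchall()
print(rows)  # [('t1', 25), ('t2', 40)]
```

The `metric_counts` CTE can then be joined on `t_id` exactly as in the question's second attempt.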
bunny sunny
(113 rep)
Mar 7, 2019, 04:28 AM
• Last activity: Apr 11, 2025, 10:06 PM
0
votes
1
answers
3553
views
How to create multiple temp tables using records from a CTE that I need to call multiple times in Postgres plpgsql Procedure?
I am already using a CTE expression within a plpgsql Procedure to grab some Foreign Keys from (1) specific table, we can call it `master_table`. I created a brand new table, we can call this table `table_with_fks`, in my DDL statements so this table holds the FKs I am fetching and saving. I later ta...
I am already using a CTE expression within a plpgsql Procedure to grab some Foreign Keys from (1) specific table, we can call it `master_table`. I created a brand new table, we can call this table `table_with_fks`, in my DDL statements so this table holds the FKs I am fetching and saving.
I later take these FKs from my `table_with_fks` and JOIN on my other tables in my database to get the entire original record (the full record with all columns from its corresponding table) and insert it into an archive table.
I have an awesome lucid chart I drew that might make what I say down below make much more sense:

My CTE example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date <= now() - interval '1 year'
LIMIT 10000
)
INSERT INTO table_with_fks (SELECT * FROM fk_list_cte);
commit;
END LOOP;
Now, I have (4) other Procedures that JOIN on each FK in this `table_with_fks` with its **parent** table that it references. I do this because as I said, I only got the FK at first, and I don't have all the original columns for the record. So I will do something like
LOOP
EXIT WHEN some_condition;
WITH full_record_cte AS (
SELECT *
FROM table_with_fks fks
JOIN parent_table1 pt1
ON fks.fk1 = pt1.id
LIMIT 10000),
INSERT INTO (select * from full_record_cte);
commit;
END LOOP;
***NOW***, what I want to do is, instead of having to RE-JOIN 4 times later on these FKs that are found in my `table_with_fks`, use the first CTE `fk_list_cte` to JOIN on the parent tables right away and grab the full record from each of the (4) tables and put it in some TEMP postgres table. I think I will need (4) unique TEMP tables, as I don't know how it would work if I combine all their data into one BIG table, because each table has different data/different columns.
Is there a way to use the original CTE `fk_list_cte` and call it multiple times in succession and CREATE 4 TEMP tables right after, that all use the original CTE? Example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date = 1 year old
LIMIT 10000
),
WITH fetch_fk1_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table1 pt1
ON cte.fk1 = pt1.id
),
WITH fetch_fk2_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table2 pt2
ON cte.fk2 = pt2.id
),
WITH fetch_fk3_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table3 pt3
ON cte.fk3 = pt3.id
),
WITH fetch_fk4_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table4 pt4
ON cte.fk4 = pt4.id
),
CREATE TEMPORARY TABLE fk1_tmp_tbl AS (
SELECT *
FROM fetch_fk1_original_record_from_parent
)
CREATE TEMPORARY TABLE fk2_tmp_tbl AS (
SELECT *
FROM fetch_fk2_original_record_from_parent
)
CREATE TEMPORARY TABLE fk3_tmp_tbl AS (
SELECT *
FROM fetch_fk3_original_record_from_parent
)
CREATE TEMPORARY TABLE fk4_tmp_tbl AS (
SELECT *
FROM fetch_fk4_original_record_from_parent
);
END LOOP;
I know the 4 `CREATE TEMPORARY TABLE` statements definitely won't work (can I create 4 temp tables simultaneously/at once?). Does anyone see the logic of what I am trying to do here and can help me?
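For what it's worth, a CTE is scoped to a single statement, so one `fk_list_cte` cannot feed four separate `CREATE TEMPORARY TABLE` statements. A common workaround is to materialize the FK list once into its own temp table and then build each per-parent temp table from it, one statement at a time. A reduced sketch (sqlite3 standing in for Postgres, two parent tables instead of four, invented sample rows):

```python
import sqlite3

# A CTE only lives for one statement, so the FK-collecting query is run once
# into a temp table; each CREATE TEMP TABLE ... AS then joins against it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table  (fk1 INTEGER, fk2 INTEGER);
CREATE TABLE parent_table1 (id INTEGER, payload TEXT);
CREATE TABLE parent_table2 (id INTEGER, payload TEXT);
INSERT INTO master_table  VALUES (1, 10);
INSERT INTO parent_table1 VALUES (1, 'a');
INSERT INTO parent_table2 VALUES (10, 'b');

-- Step 1: materialize the FK list exactly once.
CREATE TEMP TABLE fk_list AS
    SELECT fk1, fk2 FROM master_table;

-- Step 2: each per-parent temp table joins the materialized list (sequential
-- statements, not simultaneous).
CREATE TEMP TABLE fk1_tmp_tbl AS
    SELECT pt1.* FROM fk_list JOIN parent_table1 pt1 ON fk_list.fk1 = pt1.id;
CREATE TEMP TABLE fk2_tmp_tbl AS
    SELECT pt2.* FROM fk_list JOIN parent_table2 pt2 ON fk_list.fk2 = pt2.id;
""")
fk1_rows = conn.execute("SELECT * FROM fk1_tmp_tbl").fetchall()
fk2_rows = conn.execute("SELECT * FROM fk2_tmp_tbl").fetchall()
print(fk1_rows, fk2_rows)  # [(1, 'a')] [(10, 'b')]
```

In PostgreSQL the same shape works with `CREATE TEMPORARY TABLE ... AS`, and the per-batch LIMIT/date filter from the question's loop would go into the `fk_list` query.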
PainIsAMaster
(131 rep)
Oct 9, 2021, 04:01 AM
• Last activity: Apr 7, 2025, 12:15 AM
0
votes
1
answers
5330
views
Unrecognized configuration parameter when using set_config(text, text, boolean) in CTE
I'm trying to implement a simple ABAC system using row level security, with main policy defined as following: CREATE policy resource_access ON resource USING ( ( org_id::varchar = current_setting('scope.org.id', true) ) AND ( acl_read && regexp_split_to_array(current_setting('scope.acl'), ',')::varc...
I'm trying to implement a simple ABAC system using row level security, with the main policy defined as follows:
CREATE policy resource_access ON resource
USING (
(
org_id::varchar = current_setting('scope.org.id', true)
)
AND
(
acl_read && regexp_split_to_array(current_setting('scope.acl'), ',')::varchar[]
)
)
Issuing queries akin to:
WITH
acl AS (SELECT set_config('scope.acl', 'ACL', true) "__acl"),
result AS ( ... )
SELECT * FROM acl, result
The main reason to use **WITH** is to avoid multiple statements when the queries are later **PREPARE**d and **EXECUTE**d by the Postgres driver I'm using.
The **result** in the example above can contain any arbitrary queries required by the application. To ensure that **set_config** is executed in the query, it is also added to the final **SELECT**.
However, I still do consistently encounter the following error:
QueryFailedError: unrecognized configuration parameter "scope.acl"
Which appears to be caused by executing the subquery from WITH in isolation from the result query.
So the main questions are:
- Is there any elegant way to ensure running set_config before the main query (the one in result) is executed?
- Is there any better way to construct queries for the application side, to avoid using WITH, but keeping them as a single SQL statement?
Thank you!
Ivan C.
(1 rep)
Nov 7, 2020, 09:49 AM
• Last activity: Mar 21, 2025, 09:06 PM
1
votes
1
answers
82
views
How to adapt a recursive CTE over multiple tables?
I'm using PostgreSQL 17 --- I am modelling a package index for the Haskell ecosystem, and a feature that is useful is to determine transitive dependencies. Haskell packages can be normalised as: ``` Package (name + package-specific metadata) \-> Releases (version + release-specific metadata like syn...
I'm using PostgreSQL 17
---
I am modelling a package index for the Haskell ecosystem, and a feature that is useful is to determine transitive dependencies. Haskell packages can be normalised as:
Package
(name + package-specific metadata)
\-> Releases
(version + release-specific metadata like synopsis, attached data files)
\-> Components
(library, executable, test suite, benchmark suite)
\-> Dependencies
(Each component declares a dependency of a package name and version expression).
(Each of these sections are a table, and they are linked together by one-to-many relationships. One package links to many releases, each release links to many components, each component links to many dependencies)
For the purpose of my own enlightenment, I have first reduced the complexity of the model to create a CTE that does what I expect.
In particular, my codebase uses UUIDs rather than bigints as the tables' PKs.
(Full dbfiddle is available at https://dbfiddle.uk/hVOmMdYQ)
-- Data model where packages and versions are combined,
-- and dependencies refer to packages
create table packages (
package_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text unique not null,
version int[] not null
);
create unique index on packages(name, version);
create table dependencies (
dependency_id bigint generated always as identity PRIMARY KEY,
dependent_id bigint references packages(package_id),
depended_id bigint references packages(package_id)
);
create unique index on dependencies(dependent_id, depended_id);
And here is the data:
insert into packages (name, version) values ('base', '{1,0,0,0}');
insert into packages (name, version) values ('vector', '{0,0,7,0}');
insert into packages (name, version) values ('random', '{0,1,5,8}');
insert into packages (name, version) values ('unix', '{1,2,1,0}');
insert into packages (name, version) values ('time', '{3,14,1,2}');
insert into dependencies (dependent_id, depended_id) values (2, 1);
insert into dependencies (dependent_id, depended_id) values (3, 1);
insert into dependencies (dependent_id, depended_id) values (3, 2);
insert into dependencies (dependent_id, depended_id) values (4, 1);
insert into dependencies (dependent_id, depended_id) values (5, 1);
insert into dependencies (dependent_id, depended_id) values (5, 3);
insert into dependencies (dependent_id, depended_id) values (5, 4);
Here is a preliminary result:
select dependent.package_id, dependent.name as dependent, depended.name as depended
from dependencies as d1
inner join packages as dependent on d1.dependent_id = dependent.package_id
inner join packages as depended on d1.depended_id = depended.package_id;
| package_id | dependent | depended |
|-----------:|-----------|----------|
| 2 | vector | base |
| 3 | random | base |
| 3 | random | vector |
| 4 | unix | base |
| 5 | time | base |
| 5 | time | random |
| 5 | time | unix |
Up to now, everything looks good. I then made this recursive CTE to create a view of transitive dependencies, with breadcrumbs:
with recursive transitive_dependencies ( dependent_id, dependent, depended_id, breadcrumbs) as
( select dependent.package_id as dependent_id
, dependent.name as dependent
, depended.package_id as depended_id
, concat_ws(' > ', dependent.name, depended.name) as breadcrumbs
from dependencies as d1
inner join packages as dependent on d1.dependent_id = dependent.package_id
inner join packages as depended on d1.depended_id = depended.package_id
where dependent_id = 5
union all
select dependent.package_id as dependent_id
, dependent.name as dependent
, depended.package_id as depended_id
, concat_ws(' > ', t2.breadcrumbs, depended.name) as breadcrumbs
from dependencies as d1
inner join packages as dependent on d1.dependent_id = dependent.package_id
inner join packages as depended on d1.depended_id = depended.package_id
inner join transitive_dependencies as t2 on t2.depended_id = dependent.package_id -- ← This is where we refer to the CTE
)
cycle dependent_id set is_cycle using path
select t3.dependent_id
, t3.dependent
, t3.depended_id
, t3.breadcrumbs
from transitive_dependencies as t3;
| dependent_id | dependent | depended_id | breadcrumbs |
|-------------:|-----------|------------:|-------------------------------|
| 5 | time | 1 | time > base |
| 5 | time | 3 | time > random |
| 5 | time | 4 | time > unix |
| 3 | random | 1 | time > random > base |
| 3 | random | 2 | time > random > vector |
| 4 | unix | 1 | time > unix > base |
| 2 | vector | 1 | time > random > vector > base |
Behold, it works!
---
Now, I am looking into splitting things a bit further. Namely, package and release will be separated. This is due to the fact that there is some metadata specific to the Haskell ecosystem that targets the notion of "package" and some that is only relevant to "releases", and they are not interchangeable.

-- Data model where packages and releases are separated
create table packages2 (
package_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text unique not null
);
create table releases2 (
release_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
package_id bigint references packages2,
version text not null
);
create unique index on releases2(package_id, version);
create table dependencies2 (
dependency_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
release_id bigint references releases2 not null,
package_id bigint references packages2 not null,
requirement int[] not null
);
And here is the data
insert into packages2 (name) values ('base'); -- 1
insert into packages2 (name) values ('vector'); -- 2
insert into packages2 (name) values ('random'); -- 3
insert into packages2 (name) values ('unix'); -- 4
insert into packages2 (name) values ('time'); -- 5
insert into releases2 (package_id, version) values (1, '{1,0,0,0}');
insert into releases2 (package_id, version) values (2, '{0,0,7,0}');
insert into releases2 (package_id, version) values (3, '{0,1,5,8}');
insert into releases2 (package_id, version) values (4, '{1,2,1,0}');
insert into releases2 (package_id, version) values (5, '{3,14,1,2}');
insert into dependencies2 (release_id, package_id, requirement) values ( 2, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 3, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 3, 2, '>= 0.0.7.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 4, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 5, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 5, 3, … );
insert into dependencies2 (release_id, package_id, requirement) values ( 5, 4, … );
And here is the CTE adapted to the new schema:
with recursive transitive_dependencies2 ( dependent_id, dependent, dependency_id, breadcrumbs) as
( select p2.package_id as dependent_id
, p2.name as dependent
, p3.package_id as dependency_id
, concat_ws(' > ', p2.name, p3.name) as breadcrumbs
from dependencies2 as d0
-- Dependent releases
inner join releases2 as r1 on d0.release_id = r1.release_id
-- Dependent packages
inner join packages2 as p2 on r1.package_id = p2.package_id
-- Dependencies packages
inner join packages2 as p3 on d0.package_id = p3.package_id
where r1.release_id = 5
union
select p2.package_id as dependent_id
, p2.name as dependent
, p3.package_id as dependency_id
, concat_ws(' > ', p2.name, p3.name) as breadcrumbs
from dependencies2 as d0
-- Dependent releases
inner join releases2 as r1 on d0.release_id = r1.release_id
-- Dependent packages
inner join packages2 as p2 on r1.package_id = p2.package_id
-- Dependencies packages
inner join packages2 as p3 on d0.package_id = p3.package_id
inner join transitive_dependencies2 as t2 on t2.dependency_id = p2.package_id -- ← This is where we refer to the CTE
)
cycle dependent_id set is_cycle using path
select t3.dependent_id
, t3.dependent
, t3.dependency_id
, t3.breadcrumbs
from transitive_dependencies2 as t3;
Quite unfortunately, this does not give the expected result:
| dependent_id | dependent | dependency_id | breadcrumbs |
|-------------:|-----------|--------------:|-----------------|
| 5 | time | 1 | time > base |
| 5 | time | 3 | time > random |
| 5 | time | 4 | time > unix |
| 3 | random | 1 | random > base |
| 3 | random | 2 | random > vector |
| 4 | unix | 1 | unix > base |
| 2 | vector | 1 | vector > base |
My question is as follows: how can I build my intuition to further split a CTE that works over more granular tables? I'm still very new to all of this, and this is my first "real-world" use case of CTEs.
Happy to clarify or disambiguate things.
Through this, I'm also interested in best practices when it comes to data modelling. I was warned in the past against storing arrays of foreign keys, for instance, and to strive and reach normal forms and splitting entities that have different life cycles.
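One observable difference between the two queries: in the working first version the recursive member extends the parent row's breadcrumbs, while the second version rebuilds `concat_ws(' > ', p2.name, p3.name)` from scratch, so every path restarts at its immediate dependent. A sketch of the split schema with the breadcrumbs chained through the CTE row (`t2.breadcrumbs`) instead, run through sqlite3 as a stand-in for Postgres (with `||` in place of `concat_ws`, and `release_id` matching `package_id` in the invented sample data):

```python
import sqlite3

# Sample data mirrors the question: time(5) -> {base(1), random(3), unix(4)},
# random -> {base, vector(2)}, vector -> base, unix -> base.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packages2     (package_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE releases2     (release_id INTEGER PRIMARY KEY, package_id INTEGER);
CREATE TABLE dependencies2 (release_id INTEGER, package_id INTEGER);
INSERT INTO packages2 VALUES (1,'base'),(2,'vector'),(3,'random'),(4,'unix'),(5,'time');
INSERT INTO releases2 VALUES (1,1),(2,2),(3,3),(4,4),(5,5);
INSERT INTO dependencies2 VALUES (2,1),(3,1),(3,2),(4,1),(5,1),(5,3),(5,4);
""")

rows = conn.execute("""
WITH RECURSIVE transitive_dependencies2 (dependent_id, dependency_id, breadcrumbs) AS (
    SELECT p2.package_id, p3.package_id, p2.name || ' > ' || p3.name
    FROM dependencies2 d0
    JOIN releases2 r1 ON d0.release_id = r1.release_id
    JOIN packages2 p2 ON r1.package_id = p2.package_id
    JOIN packages2 p3 ON d0.package_id = p3.package_id
    WHERE r1.release_id = 5
    UNION
    -- Key difference: breadcrumbs grow from t2.breadcrumbs, not from p2.name.
    SELECT p2.package_id, p3.package_id, t2.breadcrumbs || ' > ' || p3.name
    FROM dependencies2 d0
    JOIN releases2 r1 ON d0.release_id = r1.release_id
    JOIN packages2 p2 ON r1.package_id = p2.package_id
    JOIN packages2 p3 ON d0.package_id = p3.package_id
    JOIN transitive_dependencies2 t2 ON t2.dependency_id = p2.package_id
)
SELECT breadcrumbs FROM transitive_dependencies2
""").fetchall()
crumbs = sorted(b for (b,) in rows)
for b in crumbs:
    print(b)
```

This reproduces the seven full paths from the single-table model (e.g. `time > random > vector > base`); the cycle-protection clause from the Postgres version is omitted since the sample graph is acyclic.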
Théophile Choutri de Tarlé
(84 rep)
Jan 27, 2025, 01:15 PM
• Last activity: Mar 21, 2025, 09:30 AM
6
votes
1
answers
666
views
CTE Error when installing sp_WhoIsActive
I have several identical (near as I can tell) SQL Servers where I've recently added sp_WhoIsActive (showing some folks how much I like this tool) but one of them will not let me create the stored procedure. I get an error I'm quite familiar with about a CTE needing the previous statement to end with...
I have several identical (near as I can tell) SQL Servers where I've recently added sp_WhoIsActive (showing some folks how much I like this tool) but one of them will not let me create the stored procedure. I get an error I'm quite familiar with about a CTE needing the previous statement to end with a semicolon. None of the other servers get this error.
I've boiled down the script to this to recreate the issue:
SET QUOTED_IDENTIFIER ON;
SET ANSI_PADDING ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET ANSI_WARNINGS ON;
SET NUMERIC_ROUNDABORT OFF;
SET ARITHABORT ON;
GO
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME = 'sp_WhoIsActive')
EXEC ('CREATE PROC dbo.sp_WhoIsActive AS SELECT ''stub version, to be replaced''')
GO
ALTER PROC dbo.sp_WhoIsActive
(
@filter sysname = ''
)
AS
BEGIN;
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET QUOTED_IDENTIFIER ON;
SET ANSI_PADDING ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET ANSI_WARNINGS ON;
SET NUMERIC_ROUNDABORT OFF;
SET ARITHABORT ON;
IF @filter IS NULL
BEGIN;
RAISERROR('Input parameters cannot be NULL', 16, 1);
RETURN;
END;
--SELECT 'FIZZ' AS BUZZ;
WITH
a0 AS
(SELECT 'FOO' AS BAR)
SELECT * FROM a0;
END;
GO
If you un-comment the `SELECT 'FIZZ' AS BUZZ;` line, the procedure gets created.
Anyone have an idea what is different about this server?
Things I've checked:
* database compatibility levels for all dbs are > 100 (master, model, etc are 140)
* SQL Servers are all 14.0.344.5
* No availability Groups
* SSMS 18.12.1
* I'm SA on all servers
* version 12 of sp_whoIsActive https://github.com/amachanic/sp_whoisactive/releases/tag/v12.00
Full text of actual error below:
Msg 319, Level 15, State 1, Procedure sp_WhoIsActive, Line 206 [Batch Start Line 11]
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 209 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 212 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 215 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 218 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 221 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 156, Level 15, State 1, Procedure sp_WhoIsActive, Line 225 [Batch Start Line 11]
Incorrect syntax near the keyword 'ORDER'.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 230 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 319, Level 15, State 1, Procedure sp_WhoIsActive, Line 443 [Batch Start Line 11]
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 446 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 449 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 452 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 455 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 458 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 156, Level 15, State 1, Procedure sp_WhoIsActive, Line 462 [Batch Start Line 11]
Incorrect syntax near the keyword 'ORDER'.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 468 [Batch Start Line 11]
Incorrect syntax near ','.
Completion time: 2023-01-06T15:04:08.0850888-05:00
RandomCorpSQLDev
(63 rep)
Jan 6, 2023, 07:26 PM
• Last activity: Mar 15, 2025, 05:21 PM
0
votes
1
answers
470
views
How do I calculate a streak of data and show the data that corresponds to it on MySQL?
I'm currently working on a project with NBA stats and I want to check a player's longest streak of his highest amount of points on MySQL. I'm open to a 5-10 game streak to simplify it. For example: Kevin Durant scores 30, 32, 35, 36, 41 in 5 consecutive games and that's his highest one, how would I...
I'm currently working on a project with NBA stats and I want to check a player's longest streak of his highest amount of points on MySQL. I'm open to a 5-10 game streak to simplify it.
For example: Kevin Durant scores 30, 32, 35, 36, 41 in 5 consecutive games and that's his highest streak. How would I go about extracting the game stats and points, the length of the streak, AND the total amount of points during this streak, all together?
I've tried using multiple CTEs with window functions, but I confuse myself as I'm doing it. Is there an easy or simple way to do this at all?
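One common shape for this is the "gaps and islands" trick: keep only games meeting a threshold, then note that `game_no - ROW_NUMBER()` is constant within each consecutive run, so it labels the streak. A hedged sketch via Python's sqlite3 (the `games` table, player name, and 30-point threshold are all made up for illustration; the same SQL runs on MySQL 8+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (game_no INTEGER, player TEXT, points INTEGER);
INSERT INTO games VALUES
    (1,'Durant',28),(2,'Durant',30),(3,'Durant',32),(4,'Durant',35),
    (5,'Durant',36),(6,'Durant',41),(7,'Durant',22),(8,'Durant',31);
""")

# For qualifying games, (game_no - ROW_NUMBER()) stays constant across a
# consecutive run, so GROUP BY grp aggregates each streak as one island.
row = conn.execute("""
WITH qualified AS (
    SELECT game_no, points,
           game_no - ROW_NUMBER() OVER (ORDER BY game_no) AS grp
    FROM games
    WHERE player = 'Durant' AND points >= 30
)
SELECT COUNT(*)     AS streak_len,
       SUM(points)  AS streak_points,
       MIN(game_no) AS first_game,
       MAX(game_no) AS last_game
FROM qualified
GROUP BY grp
ORDER BY streak_len DESC, streak_points DESC
LIMIT 1
""").fetchone()
print(row)  # (5, 174, 2, 6) -- games 2..6, 30+32+35+36+41 points
```

Joining the winning `grp` back to `games` would recover the per-game stats for that streak.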
thisisforfun
(1 rep)
Jul 18, 2023, 05:49 PM
• Last activity: Feb 21, 2025, 03:04 PM
3
votes
2
answers
3644
views
What's the (Big O) computational complexity of a PostgreSQL recursive common table expression?
For example, taking this StackOverflow #44620695 question, [recursive path aggregation and CTE query for top-down tree postgres](https://stackoverflow.com/questions/44620695/recursive-path-aggregation-and-cte-query-for-top-down-tree-postgres/50685815#50685815) as an example, which uses a recursive C...
For example, taking this StackOverflow #44620695 question, [recursive path aggregation and CTE query for top-down tree postgres](https://stackoverflow.com/questions/44620695/recursive-path-aggregation-and-cte-query-for-top-down-tree-postgres/50685815#50685815) as an example, which uses a recursive CTE to traverse a tree structure to determine the paths from a starting node.
The screenshot above shows the input data, the recursive CTE result, and a visualization of the source data.
Recursive CTEs are iterative over the preceding result -- *right?* (as suggested in the accepted answer [here](https://stackoverflow.com/questions/35979198/understanding-steps-of-recursive-cte)) -- so would the time complexity be something like O(log n)?
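One way to check the intuition empirically: each iteration re-runs the recursive member on the rows the previous iteration produced, so the total work is roughly proportional to the rows emitted overall (the reachable nodes/edges), i.e. linear in the traversed subgraph rather than logarithmic. A small sketch using Python's sqlite3, counting rows a descendant walk emits over an invented chain of n nodes:

```python
import sqlite3

def reachable(n: int) -> int:
    """Rows emitted by a recursive walk down a chain 1 -> 2 -> ... -> n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (parent INTEGER, child INTEGER)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)",
                     [(i, i + 1) for i in range(1, n)])
    (count,) = conn.execute("""
        WITH RECURSIVE walk(node) AS (
            SELECT child FROM edges WHERE parent = 1
            UNION ALL
            SELECT e.child FROM edges e JOIN walk w ON e.parent = w.node
        )
        SELECT COUNT(*) FROM walk
    """).fetchone()
    return count

# Emitted rows grow linearly with the chain length: n - 1 each time.
print([reachable(n) for n in (10, 100, 1000)])  # [9, 99, 999]
```

With indexed lookups per step, that suggests something like O(n) for n reachable rows (or O(V + E) on a general graph), with the per-step join cost depending on the available indexes.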
Victoria Stuart
(191 rep)
Jun 8, 2018, 06:58 PM
• Last activity: Feb 16, 2025, 09:05 PM
Showing page 1 of 20 total questions