Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
2
votes
2
answers
3240
views
Optimizing a recursive CTE or replacing it with a temporary table
I have a recursive query like this:
```sql
with recursive PCte(id) as
(
    select p1.id from Product p1
    where p1.parent_id is null
      and p1.serial_number in ('123', '124', '125', 'n')
    union all
    select p2.id from Product p2 inner join PCte cte on p2.parent_id = cte.id
)
select *
from Product p
left join Product psub on psub.parent_id = p.id
where p.id in (select c.id from PCte c)
```
This query performs slowly when there is a large number of children. Are there any possible ways to optimize it? If it is possible to replace the recursive CTE with a temporary table, I would like to see an example. Thanks a lot.
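One common replacement is to materialize the tree level by level into a temporary table, expanding one generation per iteration until no new rows appear. A minimal runnable sketch, using Python's bundled SQLite as a stand-in engine and a simplified `Product` schema (the ids and serial numbers here are made up):

```python
# Sketch: replacing a recursive CTE with an iterative temporary table.
# Assumes a simplified Product(id, parent_id, serial_number) schema;
# SQLite stands in here for whichever engine the question targets.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Product (id INTEGER PRIMARY KEY, parent_id INTEGER, serial_number TEXT);
    INSERT INTO Product VALUES
        (1, NULL, '123'),   -- matching root
        (2, 1,    'x1'),    -- child of 1
        (3, 2,    'x2'),    -- grandchild of 1
        (4, NULL, '999');   -- unrelated root
""")

# Seed the temp table with the roots, then expand one level per iteration
# until the row count stops growing -- the same work the recursive CTE does,
# but against an indexable table you can inspect or ANALYZE between steps.
con.execute("CREATE TEMP TABLE pcte (id INTEGER PRIMARY KEY)")
con.execute("""INSERT INTO pcte
               SELECT id FROM Product
               WHERE parent_id IS NULL AND serial_number IN ('123', '124', '125')""")
prev = -1
while True:
    con.execute("""INSERT OR IGNORE INTO pcte
                   SELECT p.id FROM Product p JOIN pcte ON p.parent_id = pcte.id""")
    n = con.execute("SELECT COUNT(*) FROM pcte").fetchone()[0]
    if n == prev:        # no new descendants found this round
        break
    prev = n

ids = [r[0] for r in con.execute("SELECT id FROM pcte ORDER BY id")]
print(ids)  # the matching root and all of its descendants
```

The loop body is the same join as the recursive member of the CTE, so the result set is identical; the difference is that each intermediate level is a real table the optimizer can use statistics and indexes on.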
Itan Reimbergh
(21 rep)
Sep 28, 2021, 01:10 PM
• Last activity: Aug 1, 2025, 02:06 AM
1
votes
1
answers
2767
views
Insert multiple rows into a table with id from other table if not exists insert to other table
I have done a similar task where I insert a row into a table if the data doesn't already exist:
```sql
WITH
user_exists AS (
    SELECT id FROM users WHERE username = '%s'
),
user_new AS (
    INSERT INTO users (username)
    SELECT w.username FROM (VALUES ('%s')) w(username)
    WHERE NOT EXISTS
        (SELECT 1 FROM users u WHERE u.username = w.username)
    RETURNING id
)
INSERT INTO feedbacks (static_row, userid)
SELECT
    'static_data',
    (SELECT id FROM user_exists UNION ALL SELECT id FROM user_new) AS userid
```
The above works well when we insert a new row into the feedbacks table: if the user doesn't exist, it inserts the user into the users table and returns the id, which is then used for inserting into feedbacks.
But now my use case is that I have to insert multiple rows into the feedbacks table. Something like this:
```sql
-- user_variable = ['a', 'b', ...]
INSERT INTO feedbacks (static_row, userid)
VALUES
('sample_data', (SELECT id FROM users WHERE username = 'a')),
('sample_data', (SELECT id FROM users WHERE username = 'b')),
('sample_data', (SELECT id FROM users WHERE username = 'c'))
```
For the above case, how can we insert a new row into the users table if username = 'b' doesn't exist?
undefined
(151 rep)
Feb 4, 2021, 11:18 AM
• Last activity: Jul 21, 2025, 09:02 PM
0
votes
1
answers
148
views
How to have CTE query recurse parent query result one by one?
I have this query used on Postgres to return place administrative levels.
```sql
WITH RECURSIVE hierarchy(name, pinyin, level, upper_place_object_id_fk) AS (
    SELECT p.name, p.pinyin, p.level, p.upper_place_object_id_fk
    FROM place p
    WHERE p.pinyin = 'Jiulong'
    UNION ALL
    SELECT up.name, up.pinyin, up.level, up.upper_place_object_id_fk
    FROM hierarchy h
    INNER JOIN place up ON h.upper_place_object_id_fk = up.object_id
    WHERE h.upper_place_object_id_fk IS NOT NULL
)
SELECT h.name, h.pinyin, h.level FROM hierarchy h;
```
It works just fine for a single match, e.g.
```
"Jiulong"   4
"Yingde"    3
"Qingyuan"  2
"Guangdong" 1
```
But of course, there can be several places with the same name (and a different position in the admin hierarchy):
```
"Fenggang"  4
"Fenggang"  4
"Huaiji"    3
"Dongguan"  2
"Zhaoqing"  2
"Guangdong" 1
"Guangdong" 1
```
In this case the result doesn't make sense, as I can't reconstruct the proper hierarchy. That's because the recursion occurs for every result of the parent query at the same time. It would work if the recursion could run to the exit condition for each result of the parent separately. Is it possible to control how the recursion is done?
Place table script:
```sql
CREATE TABLE public.place
(
    object_id uuid NOT NULL DEFAULT uuid_generate_v4(),
    upper_place_object_id_fk uuid,
    name character varying(50) COLLATE pg_catalog."default" NOT NULL,
    level smallint NOT NULL,
    CONSTRAINT pk_place PRIMARY KEY (object_id),
    CONSTRAINT unique_place UNIQUE (name, upper_place_object_id_fk, level),
    CONSTRAINT fk_place_upper_pla_place FOREIGN KEY (upper_place_object_id_fk)
        REFERENCES public.place (object_id) MATCH SIMPLE
        ON UPDATE RESTRICT
        ON DELETE RESTRICT
        NOT VALID
)
```
and here are the relevant data from the table
```
object_id                               upper_place_object_id_fk                name        level
"540ed8e9-c456-4a46-8049-1332f4ab52c1"  NULL                                    "Guangdong" 1  false
"4f7c7071-f917-4ab8-82c9-75aff0ba44d0"  "540ed8e9-c456-4a46-8049-1332f4ab52c1"  "Dongguan"  2  false
"afce8ec3-599f-4cea-9227-29bb750d4b01"  "540ed8e9-c456-4a46-8049-1332f4ab52c1"  "Zhaoqing"  2  false
"714b8e23-c346-4370-8697-25bb09497342"  "afce8ec3-599f-4cea-9227-29bb750d4b01"  "Huaiji "   3  false
"0461963d-e886-4149-bded-ee08216dd374"  "714b8e23-c346-4370-8697-25bb09497342"  "Fenggang"  4  false
"6955f89b-b7d3-4b1e-b21e-fb584d7f165b"  "712c1f45-24ac-4711-a47f-22513afc6fdd"  "Dongguan"  4  true
"d931f691-0ec7-4fc7-9197-b15461e984e0"  "4f7c7071-f917-4ab8-82c9-75aff0ba44d0"  "Fenggang"  4  true
```
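One way to keep the branches separable is to carry the starting row's `object_id` through the recursion as an extra column, so every ancestor row is tagged with the match it belongs to. A runnable sketch against a reduced version of the `place` table (SQLite stands in for Postgres, and the short text ids are made up):

```python
# Sketch: tag each recursive branch with the object_id of its starting row so
# several same-named matches can be reassembled into separate hierarchies.
# Reduced version of the question's place table, with made-up text ids.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE place (object_id TEXT PRIMARY KEY, upper_place_object_id_fk TEXT,
                        name TEXT, level INTEGER);
    INSERT INTO place VALUES
        ('gd', NULL, 'Guangdong', 1),
        ('zq', 'gd', 'Zhaoqing',  2),
        ('dg', 'gd', 'Dongguan',  2),
        ('hj', 'zq', 'Huaiji',    3),
        ('f1', 'hj', 'Fenggang',  4),   -- first Fenggang
        ('f2', 'dg', 'Fenggang',  4);   -- second Fenggang
""")

rows = con.execute("""
    WITH RECURSIVE hierarchy(start_id, object_id, name, level, up) AS (
        SELECT object_id, object_id, name, level, upper_place_object_id_fk
        FROM place WHERE name = 'Fenggang'
        UNION ALL
        SELECT h.start_id, p.object_id, p.name, p.level, p.upper_place_object_id_fk
        FROM hierarchy h JOIN place p ON h.up = p.object_id
    )
    SELECT start_id, name, level FROM hierarchy ORDER BY start_id, level
""").fetchall()
for r in rows:
    print(r)
# Each chain is now grouped under its own start_id, so the two 'Fenggang'
# hierarchies no longer mix.
```

Grouping or ordering by `start_id` then reconstructs each hierarchy independently, which is exactly what the flat result in the question cannot do.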
coolnodje
(101 rep)
Oct 10, 2019, 04:07 PM
• Last activity: Jul 19, 2025, 07:05 PM
3
votes
1
answers
170
views
Why delete and insert in a CTE works despite UNIQUE constraint?
Given a table with a unique constraint on a column
```sql
CREATE TABLE t
(
    id int GENERATED ALWAYS AS IDENTITY,
    name text NOT NULL UNIQUE
);
```
Let's create a single record
```sql
INSERT INTO t (name) VALUES ('a');
```
When I try to delete an existing record and insert a new one with the same value for the unique column in a single statement using a CTE, it fails, as I expected:
```sql
WITH
deleted_cte AS (DELETE FROM t WHERE name = 'a' RETURNING id),
inserted_cte AS (INSERT INTO t (name) VALUES ('a') RETURNING id)
SELECT 1;
-- ERROR:  duplicate key value violates unique constraint "t_name_key"
-- DETAIL: Key (name)=(a) already exists.
```
I expect DELETE and INSERT commands to run concurrently in an unspecified order and see the same snapshot of the table.
However, if I introduce a dependency between the primary query and the deleting sub-statement, it works:
```sql
WITH
deleted_cte AS (DELETE FROM t WHERE name = 'a' RETURNING id),
inserted_cte AS (INSERT INTO t (name) VALUES ('a') RETURNING id)
SELECT id FROM deleted_cte; -- only this line modified

-- 1  Delete on t
--      ->  Seq Scan on t
--            Filter: (name = 'a'::text)
--    CTE inserted_cte
--      ->  Insert on t t_1
--            ->  Result
```
I do not understand what happens here.
Are the sub-statements ordered now, so that the INSERT is forced to run after the DELETE?
Is the UNIQUE check somehow moved to the end of the whole statement?
Where in the docs can I read about this behavior, or is it unreliable?
https://dbfiddle.uk/FYrW2n5W
----------
I realize that this exact delete-insert could be replaced with an upsert.
But in the actual code, records aren't hard deleted. They are instead soft deleted using `UPDATE t SET deleted_at = now()`, and then new records are inserted. A partial unique index filters out soft-deleted records: `CREATE UNIQUE INDEX ON t (name) WHERE (deleted_at IS NULL)`. So an upsert wouldn't work here.
metalon
(31 rep)
Nov 11, 2024, 03:43 PM
• Last activity: Jul 19, 2025, 08:11 AM
0
votes
1
answers
181
views
PostgreSQL CTE Recursion - How to Improve Performance for Parent->Child Tree (8 levels deep, 890 roots)
I am working on a query that performs PostgreSQL CTE Recursion in PostgreSQL 13. This query is intended to recurse from the root nodes to the deepest leaf node with certain properties (labeled 'begat' in the resource_dependency table, with a matching "sample type" property from another table).
https://explain.dalibo.com/plan/g2d2f5fg9c563b0d#plan
https://explain.depesz.com/s/eaRu
It seems that both the CTE table construction and, even more so, the scanning of the CTE to apply the constraints are the most time-consuming parts of the query. Most work appears to be done in memory with the given settings, yet we still seem to scan 780 GB of cache (cache hits), which looks like duplicated work.
The actual number of nodes in any given tree may be 1000-5000. In this operation, we are starting from 890 root nodes. There are both 1:many and many:1 relationships in this tree (splitting, recombining)
Doing some experiments in a blank environment, I notice that:
- To get all parent->child resource_ids in a single column, we need to do a left join so that all leaf nodes will have a `null` child row with their calculated depth.
- We track the path to ensure we don't get into an infinite cycle. While the path is important for avoiding cycles, it is not relevant for the final calculation, and some paths duplicate information for the same root -> descendant pair. This makes the CTE table much larger.
Are there any techniques that can be used here to make this recursion less painful? For each starting root, we need to find only the deepest leaf node that has the 'begat' relationship and the matching 'sample type' property.
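One technique worth measuring is to carry the root id and depth through the recursion, then keep only the deepest row per root with a window function in a follow-on CTE, instead of re-scanning the recursive result. A runnable sketch under a simplified, assumed `resource_dependency(parent_id, child_id)` shape (SQLite standing in for PostgreSQL 13; the 'begat'/'sample type' filters would be added to the joins):

```python
# Sketch: recursive walk carrying (root, node, depth), then the deepest node
# per root selected via MAX(depth) OVER (PARTITION BY root).
# Assumes a simplified resource_dependency(parent_id, child_id) edge table.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE resource_dependency (parent_id INTEGER, child_id INTEGER);
    INSERT INTO resource_dependency VALUES (1, 2), (2, 3), (1, 4), (10, 11);
""")

rows = con.execute("""
    WITH RECURSIVE walk(root, node, depth) AS (
        -- roots: nodes that never appear as a child
        SELECT DISTINCT parent_id, parent_id, 0 FROM resource_dependency
        WHERE parent_id NOT IN (SELECT child_id FROM resource_dependency)
        UNION ALL
        SELECT w.root, d.child_id, w.depth + 1
        FROM walk w JOIN resource_dependency d ON d.parent_id = w.node
    ),
    ranked AS (
        SELECT root, node, depth,
               MAX(depth) OVER (PARTITION BY root) AS max_depth
        FROM walk
    )
    SELECT DISTINCT root, node, depth FROM ranked WHERE depth = max_depth
    ORDER BY root
""").fetchall()
print(rows)  # one (root, deepest node, depth) row per tree
```

This keeps the CTE output narrow (no path column), though for graphs that can genuinely cycle a cycle guard is still needed; in PostgreSQL 14+ the built-in `CYCLE` clause could replace the hand-rolled path.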
Justin Lowen
(68 rep)
Oct 17, 2024, 05:14 PM
• Last activity: Jul 16, 2025, 06:02 PM
0
votes
1
answers
151
views
calculate or get price of parent items on bill of materials
I want to calculate or get the price of parent items in a bill-of-materials query.
Here is the fiddle: https://www.db-fiddle.com/f/o3jLgZxKNLG14mna8QGdVN/6
Please note that in the second row, betax has qty = 2, and the other parent items could behave the same. The qty of the first row will always be 1.
**What I need is to get the 'price' and 'subtotal' for every parent.**
nkne
(9 rep)
Jun 17, 2020, 05:02 AM
• Last activity: Jul 13, 2025, 12:02 PM
1
votes
1
answers
187
views
Query To Show YTD Total Summing Each Month Prior To Current
I need a query to give a YTD total that should sum each month before it. For example, January YTD would equal January Total, February YTD would equal January Total + February Total, March YTD would equal January Total + February Total + March Total. My issue is that my query is showing the SUM() of the Total for all months.
This is DDL that illustrates my issue:
```sql
Create Table empSales
(
    spendType varchar(100)
    ,spendAmt decimal(10,2)
    ,spendMonth varchar(100)
);

Insert Into empSales
    (spendType
    , spendAmt
    , spendMonth
    ) VALUES
('James', '1.00', 'January'),
('Richard', '3.28', 'January'),
('Barb', '4.13', 'January'),
('James', '3.00', 'February'),
('Richard', '3.28', 'February'),
('Barb', '4.13', 'February'),
('James', '2.00', 'March'),
('Richard', '5.28', 'March'),
('Barb', '7.13', 'March');
```
And this is the query I tried that only gives the overall total instead of just the YTD total I'm after
```sql
with MonthAgg as (
    select IFNULL(spendType,'Total') spendType
    , Sum(Case when spendMonth='January' then spendAmt else null end ) January
    , Sum(Case when spendMonth='February' then spendAmt else null end ) February
    , Sum(Case when spendMonth='March' then spendAmt else null end ) March
    , Sum(Case when spendMonth='April' then spendAmt else null end ) April
    , Sum(Case when spendMonth='May' then spendAmt else null end ) May
    , Sum(Case when spendMonth='June' then spendAmt else null end ) June
    , Sum(Case when spendMonth='July' then spendAmt else null end ) July
    , Sum(Case when spendMonth='August' then spendAmt else null end ) August
    , Sum(Case when spendMonth='September' then spendAmt else null end ) September
    , Sum(Case when spendMonth='October' then spendAmt else null end ) October
    , Sum(Case when spendMonth='November' then spendAmt else null end ) November
    , Sum(Case when spendMonth='December' then spendAmt else null end ) December
    FROM empSales
    GROUP BY spendType WITH ROLLUP)
, mycteSum as (
    select spendMonth , Sum(spendAmt) Amt
    FROM empSales
    GROUP BY spendMonth
)
, mycteYTD as (
    select spendMonth , sum(Amt) Over(order by Cast(spendMonth+ ' 01, 1900' as date) ) YTD
    from mycteSum)
Select * from MonthAgg
UNION ALL
Select 'YTD' as summarySpend,
    max(Case when spendMonth='January' then YTD else null end ) JanuaryYTD
    , max(Case when spendMonth='February' then YTD else null end ) FebruaryYTD
    , max(Case when spendMonth='March' then YTD else null end ) MarchYTD
    , max(Case when spendMonth='April' then YTD else null end ) AprilYTD
    , max(Case when spendMonth='May' then YTD else null end ) MayYTD
    , max(Case when spendMonth='June' then YTD else null end ) JuneYTD
    , max(Case when spendMonth='July' then YTD else null end ) JulyYTD
    , max(Case when spendMonth='August' then YTD else null end ) AugustYTD
    , max(Case when spendMonth='September' then YTD else null end ) SeptemberYTD
    , max(Case when spendMonth='October' then YTD else null end ) OctoberYTD
    , max(Case when spendMonth='November' then YTD else null end ) NovemberYTD
    , max(Case when spendMonth='December' then YTD else null end ) DecemberYTD
from mycteYTD
```
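The YTD column being asked for is a running total, which a window frame expresses directly. A runnable sketch in SQLite (standing in for the question's engine); a `monthNum` column is added here purely to give the window a sortable key, since it is not in the question's DDL:

```python
# Sketch: YTD as a running total -- SUM() OVER (ORDER BY ...) accumulates
# each month's total with every month before it. The monthNum column is an
# assumed addition to the question's DDL, used only as a sort key.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE empSales (spendType TEXT, spendAmt REAL,
                           spendMonth TEXT, monthNum INTEGER);
    INSERT INTO empSales VALUES
        ('James', 1.00, 'January', 1),  ('Richard', 3.28, 'January', 1),
        ('Barb',  4.13, 'January', 1),  ('James',   3.00, 'February', 2),
        ('Richard', 3.28, 'February', 2), ('Barb',  4.13, 'February', 2),
        ('James', 2.00, 'March', 3),    ('Richard', 5.28, 'March', 3),
        ('Barb',  7.13, 'March', 3);
""")

rows = con.execute("""
    WITH monthly AS (
        SELECT spendMonth, monthNum, SUM(spendAmt) AS monthTotal
        FROM empSales GROUP BY spendMonth, monthNum
    )
    SELECT spendMonth,
           ROUND(SUM(monthTotal) OVER (ORDER BY monthNum), 2) AS ytd
    FROM monthly ORDER BY monthNum
""").fetchall()
print(rows)  # January's total, then January+February, then +March
```

Each YTD value is the sum of all monthly totals up to and including that month, which is exactly the January, January+February, ... pattern described above.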
jamesMandatory
(69 rep)
Jul 19, 2020, 07:12 PM
• Last activity: Jun 28, 2025, 12:09 PM
0
votes
1
answers
201
views
List Top Daily Rentals Per Customer using a CTE
I've been trying to learn CTEs and having a bit of trouble getting the hang of them. I wrote a query against the Sakila sample database that lists information about horror movie rentals for each day.
Here is the overblown (and redundant) SQL that I came up with:
```sql
SELECT CONCAT(CU.last_name, ', ', CU.first_name) AS customer,
       A.phone,
       F.title,
       date(R.rental_date) AS rental_date
FROM sakila.rental R
LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id
LEFT JOIN sakila.film F ON I.film_id = F.film_id
LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id
LEFT JOIN sakila.category C ON FC.category_id = C.category_id
LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
LEFT JOIN sakila.address A ON CU.address_id = A.address_id
WHERE CU.customer_id in
    (SELECT CU.customer_id
     FROM rental R
     LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
     LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id
     LEFT JOIN sakila.film F ON I.film_id = F.film_id
     LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id
     LEFT JOIN sakila.category C ON FC.category_id = C.category_id
     WHERE C.name = "Horror"
     GROUP BY CU.customer_id
     HAVING COUNT(CU.customer_id) >= 3)
AND C.name = "Horror"
ORDER BY customer, title, rental_date DESC;
```
And here are some of the results in my database client (Navicat for MySQL):
**Is there a way to rewrite the query using a CTE?**
It seems like just the type of query that a CTE would be perfect for, if only I could figure it out!
Thanks!
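The usual CTE shape for a query like this is to compute the horror-rental rows once, then reference that CTE both for the "at least 3" filter and for the final listing, instead of repeating the join chain. A runnable sketch against a miniature stand-in schema (the real Sakila joins would live inside the first CTE; the table, columns, and sample values here are made up):

```python
# Sketch: factor the repeated join/filter into one CTE and reference it twice.
# A miniature stand-in table replaces the real Sakila join chain.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE rentals (customer TEXT, title TEXT, category TEXT);
    INSERT INTO rentals VALUES
        ('Gravelle', 'Film A', 'Horror'), ('Gravelle', 'Film B', 'Horror'),
        ('Gravelle', 'Film C', 'Horror'), ('Smith',    'Film A', 'Horror'),
        ('Smith',    'Film D', 'Comedy');
""")

rows = con.execute("""
    WITH horror AS (               -- all the joins/filters live here, once
        SELECT customer, title FROM rentals WHERE category = 'Horror'
    ),
    frequent AS (                  -- customers with at least 3 horror rentals
        SELECT customer FROM horror GROUP BY customer HAVING COUNT(*) >= 3
    )
    SELECT h.customer, h.title
    FROM horror h JOIN frequent f ON f.customer = h.customer
    ORDER BY h.title
""").fetchall()
print(rows)  # only the customer who rented 3+ horror titles
```

The subquery in the question's `WHERE ... IN (...)` becomes the `frequent` CTE, and the duplicated join chain collapses into the single `horror` CTE that both parts share.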

Rob Gravelle
(123 rep)
May 12, 2020, 10:53 PM
• Last activity: Jun 23, 2025, 06:03 PM
0
votes
1
answers
266
views
Query Locks Table And Takes Long Time to Run
I have a query that can take a long time to run. I tried running the same query on a test server and it runs very quickly.
However, in production it seems to hang and causes blocking of other queries.
Where do I begin to find a solution for this? Below is my code and the plan...
https://www.brentozar.com/pastetheplan/?id=ryAQWfL2X
The issue seems to be with the CTE statement. Is there a more efficient way of writing this?
```sql
-- table to store break details
CREATE TABLE #PositionReconciliationCurrentBreaks
(
    TransactionKey INT,
    BreakOriginationDate DATE,
    BreakAge DECIMAL (18,0)
);
TRUNCATE TABLE #PositionReconciliationCurrentBreaks;

WITH PreviousRecBreaks_CTE -- get breaks from previous rec
AS
(
SELECT
    MAX(i.InvestmentID) AS InvestmentID,
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    MAX(PRT.PositionReconciliationTransactionKey) AS PositionReconciliationTransactionKey,
    PRT.InvestmentKey,
    MAX(PRT.GenevaInvestmentKey) AS GenevaInvestmentKey,
    MAX(PRT.BrokerInvestmentKey) AS BrokerInvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey,
    CASE WHEN MAX(PRT.BreakOriginationDate) = '2100-01-01' THEN NULL ELSE MAX(PRT.BreakOriginationDate) END AS BreakOriginationDate,
    MAX(PRT.BreakAge) AS BreakAge,
    SUM(ISNULL(PRT.GenevaQuantity,0)) AS GenevaQuantity,
    SUM(ISNULL(PRT.BrokerQuantity,0)) AS BrokerQuantity
FROM [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationTransaction PRT -- get transactions from previous rec
INNER JOIN
    [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationDetail TRD -- check details to get Action on transaction
    ON TRD.PositionReconciliationDetailKey = PRT.PositionReconciliationDetailKey -- join on Detail Key
INNER JOIN [Fund.Accounting.Model.TradeReconciliation].Investment i -- add in InvestmentID for debugging purposes
    ON i.InvestmentKey = PRT.InvestmentKey
INNER JOIN
    [Fund.Accounting.Model].[ReconciliationActions] RA -- actions
    ON RA.ReconciliationActionID = TRD.ReconciliationActionID
    AND (TRD.PositionCommentKey IS NOT NULL) -- approved breaks must have a comment.
    AND RA.Name <> 'Suppress' -- ensure not suppressed
    AND RA.Name <> 'Auto Match' -- ensure not perfect match - we do want to include cross references with a qty diff
WHERE PRT.PositionReconciliationID = @PreviousPositionReconciliationId -- previous rec
GROUP BY -- added grouping for quantity breaks as x-refs will be 2 separate lines in PositionReconciliationTransaction tbl; We group to force this as 1 line
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    PRT.InvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey
), CurrentRecBreaks_CTE -- get breaks from current rec
AS
(
SELECT
    MAX(i.InvestmentID) AS InvestmentID,
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    MAX(PRT.PositionReconciliationTransactionKey) AS PositionReconciliationTransactionKey,
    PRT.InvestmentKey,
    MAX(PRT.GenevaInvestmentKey) AS GenevaInvestmentKey,
    MAX(PRT.BrokerInvestmentKey) AS BrokerInvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey,
    CASE WHEN MAX(PRT.BreakOriginationDate) = '2100-01-01' THEN NULL ELSE MAX(PRT.BreakOriginationDate) END AS BreakOriginationDate,
    MAX(PRT.BreakAge) AS BreakAge,
    SUM(ISNULL(PRT.GenevaQuantity,0)) AS GenevaQuantity,
    SUM(ISNULL(PRT.BrokerQuantity,0)) AS BrokerQuantity
FROM [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationTransaction PRT -- get transactions from current rec
INNER JOIN
    [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationDetail TRD -- check details to get Action on transaction
    ON TRD.PositionReconciliationDetailKey = PRT.PositionReconciliationDetailKey -- join on Detail Key
INNER JOIN [Fund.Accounting.Model.TradeReconciliation].Investment i -- add in InvestmentID for debugging purposes
    ON i.InvestmentKey = PRT.InvestmentKey
LEFT JOIN
    [Fund.Accounting.Model].[ReconciliationActions] RA -- actions
    ON RA.ReconciliationActionID = TRD.ReconciliationActionID
    AND RA.Name <> 'Suppress' -- ensure not suppressed
    AND RA.Name <> 'Auto Match' -- ensure not perfect match - we do want to include cross references with a qty diff
WHERE PRT.PositionReconciliationID = @PositionReconciliationID -- current rec
GROUP BY -- added grouping for quantity breaks as x-refs will be 2 separate lines in PositionReconciliationTransaction tbl; We group to force this as 1 line
    PRT.PositionDate,
    PRT.PositionReconciliationDetailKey,
    PRT.InvestmentKey,
    PRT.PaymentCurrencyKey,
    PRT.AccountKey
)
INSERT INTO #PositionReconciliationCurrentBreaks
(
    TransactionKey,
    BreakOriginationDate,
    BreakAge
)
SELECT -- get breaks in current rec that also appeared in previous rec
    C.PositionReconciliationTransactionKey
    ,ISNULL(P.BreakOriginationDate,@PreviousPositionReconciliationPeriod) -- set to previous date if it is first time presence.
    ,CASE DATEDIFF(DAY, P.PositionDate, C.PositionDate)
        WHEN 0 THEN 1 -- set to 1 day old, if it is 1st time
        ELSE ISNULL(P.BreakAge,0) + DATEDIFF(DAY, P.PositionDate, C.PositionDate) -- get no. of days difference between current rec and previous one
    END AS BreakAge
FROM PreviousRecBreaks_CTE P
INNER JOIN
    CurrentRecBreaks_CTE C
    ON P.AccountKey = C.AccountKey
    AND P.PaymentCurrencyKey = C.PaymentCurrencyKey
    AND P.InvestmentKey = C.InvestmentKey
    --AND (P.GenevaInvestmentKey = C.GenevaInvestmentKey
    --     OR P.BrokerInvestmentKey = C.BrokerInvestmentKey)
    AND P.GenevaQuantity = C.GenevaQuantity
    AND P.BrokerQuantity = C.BrokerQuantity
WHERE NOT EXISTS(
    -- exclude breaks that are now perfect matches (or suppressed???)
    SELECT
        D.PositionReconciliationDetailKey
    FROM
        [Fund.Accounting.Model.PositionReconciliation].PositionReconciliationDetail D
    INNER JOIN -- join on AggregatePositionReconciliationTransaction so we can get the Qty Diff value
        [Fund.Accounting.Facade.PositionReconciliation].[AggregatePositionReconciliationTransaction] (@PositionReconciliationID,NULL,NULL) A
        ON A.PositionReconciliationDetailKey = D.PositionReconciliationDetailKey
    WHERE D.PositionReconciliationID = @PositionReconciliationID -- current rec
    AND A.QuantityDifference = 0 -- perfect matches
    AND D.PositionReconciliationDetailKey = C.PositionReconciliationDetailKey -- join details to transactions
);
```
Kevin
(533 rep)
Oct 30, 2018, 05:19 PM
• Last activity: May 18, 2025, 06:03 AM
0
votes
0
answers
31
views
is it possible to make the CTE parallel in PostgreSQL 14
I have an analytics query in PostgreSQL 14 that uses many CTEs, like this:
```sql
with
a0 as (select count(*) from ccc0) ,
a1 as (select count(*) from ccc1) ,
a2 as (select count(*) from ccc2) ,
a3 as (select count(*) from ccc3) ,
a4 as (select count(*) from ccc4) ,
a5 as (select count(*) from ccc5) ,
a6 as (select count(*) from ccc6) ,
a7 as (select count(*) from ccc7) ,
a8 as (select count(*) from ccc8) ,
a9 as (select count(*) from ccc9) ,
a10 as (select count(*) from ccc10) ,
a11 as (select count(*) from ccc11) ,
a12 as (select count(*) from ccc12) ,
a13 as (select count(*) from ccc13) ,
a14 as (select count(*) from ccc14) ,
a15 as (select count(*) from ccc15) ,
a16 as (select count(*) from ccc16) ,
a17 as (select count(*) from ccc17) ,
a18 as (select count(*) from ccc18) ,
a19 as (select count(*) from ccc19) ,
a20 as (select count(*) from ccc20) ,
a21 as (select count(*) from ccc21) ,
a22 as (select count(*) from ccc22) ,
a23 as (select count(*) from ccc23)
select * from a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23;
```
I read some docs saying that setting `max_parallel_workers_per_gather` can make CTEs parallel (https://github.com/digoal/blog/blob/master/201903/20190317_17.md), and the https://www.postgresql.org/docs/current/parallel-safety.html docs say:
> The following operations are always parallel restricted:
>
> Scans of common table expressions (CTEs).
>
> Scans of temporary tables.
>
> Scans of foreign tables, unless the foreign data wrapper has an
> IsForeignScanParallelSafe API that indicates otherwise.
>
> Plan nodes that reference a correlated SubPlan.
Is it possible to make the CTEs parallel? Each CTE now takes more than 7 s, which makes the query very slow. I have tried adding indexes, but each CTE returns more than 100 million rows (the data cannot be filtered any further). If parallelism works, the result can be fetched in about 7 s; if not, it takes minutes.
Dolphin
(939 rep)
May 10, 2025, 04:06 AM
0
votes
1
answers
630
views
Difference between a query based on CTE and a simple query
I'm doing a select starting from a table named `events` that is joined with device measurements using a date-range condition (this time window is a sort of "mobile windowing"), in order to get the measurements that fall between the event log insert time and one hour before it.
After that, because the measurement is a JSON structure, I use a cross join to unpack the `ap_info` object and then get the `ap_name`, `ap_ip` and `neigh_name` of the latest measurement present in the db right before the event occurs.
My two attempts:
**First approach** *(very slow, and also unusable when there is no match on the `CONCAT` filter expression)*:
```sql
Select
    p.neigh_name
from tv_smartdevicemeasurement_snmp
cross join jsonb_to_recordset(tv_smartdevicemeasurement_snmp.data->'ap_info') as p(ap_name text, ap_ip text, neigh_name text)
inner join tv_event on tv_event.name = p.ap_name
where smart_device_id = 3
  and (tv_smartdevicemeasurement_snmp.insert_time < tv_event.insert_time
       and tv_smartdevicemeasurement_snmp.insert_time > tv_event.insert_time - '1 hour'::interval)
  and is_event_ack = false
  and (CONCAT('Host: ', p.ap_name, ' - IP: ', p.ap_ip) = 'Host: AP-04 - IP: 10.50.2.130')
order by tv_smartdevicemeasurement_snmp.insert_time desc
limit 1
```
Explain: HERE
**CTE method** *(runs without any evident issue)*:
```sql
with cte_temp as (select * from (
    select tv_event.insert_time, tv_smartdevicemeasurement_snmp.data from tv_event
    inner join tv_smartdevicemeasurement_snmp on
        (tv_smartdevicemeasurement_snmp.insert_time < tv_event.insert_time
         and tv_smartdevicemeasurement_snmp.insert_time > tv_event.insert_time - '1 hour'::interval)
    where is_event_ack = false and tv_smartdevicemeasurement_snmp.smart_device_id = 3) as join_non_ack_evt_meas
)
select insert_time, ap_name, ap_ip, neigh_name from cte_temp
cross join jsonb_to_recordset(cte_temp.data->'ap_info') as p(ap_name text, ap_ip text, neigh_name text)
where (CONCAT('Host: ', ap_name, ' - IP: ', ap_ip) = 'Host: AP-04 - IP: 10.50.2.130')
order by insert_time desc
limit 1
```
Explain: HERE
The question is: why is there so much difference between these two queries? I mean, they work on the same tables! Please explain how they differ: the only thing I see is that the first approach works on a larger row subset.
And, in addition, do you have any suggestions to optimize my second approach?
Postgres Version: "PostgreSQL 11.13 on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.3.1_git20210424) 10.3.1 20210424, 64-bit"
VirtApp
(191 rep)
Dec 16, 2021, 09:53 AM
• Last activity: Apr 24, 2025, 03:02 PM
1
votes
0
answers
56
views
PostgreSQL 16: WITH/CTE versus Non-Deferrable Constraints
I am using PostgreSQL 16 upwards.
I am currently trying to create an example for a relationship of type `G-|o-----|<-H`, i.e. each G is referenced by >= 1 H.
```sql
    x CHAR(3), -- example for other attributes
    CONSTRAINT g_h_fk FOREIGN KEY (h, id) REFERENCES h (id, g)
);
-- To table H, we add the foreign key reference constraint towards g.
ALTER TABLE h ADD CONSTRAINT h_g_fk FOREIGN KEY (g) REFERENCES g (id);
```
Then I can insert data into the tables and read them back out as follows:
```sql
/* Insert into tables for G-|o-----|<-H relationship. */
-- Insert some rows into the table for entity type H.
-- Not specifying g leaves the references to G as NULL for now.
INSERT INTO h (y) VALUES ('AB'), ('CD'), ('EF'), ('GH'), ('IJ');
-- Insert into G and relate to H. We do this three times.
WITH g_id AS (INSERT INTO g (h, x) VALUES (1, '123') RETURNING id)
UPDATE h SET g = g_id.id FROM g_id WHERE h.id = 1;
WITH g_id AS (INSERT INTO g (h, x) VALUES (3, '456') RETURNING id)
UPDATE h SET g = g_id.id FROM g_id WHERE h.id = 3;
WITH g_id AS (INSERT INTO g (h, x) VALUES (4, '789') RETURNING id)
UPDATE h SET g = g_id.id FROM g_id WHERE h.id = 4;
-- Link one H row to another G row. (We do this twice.)
UPDATE h SET g = 3 WHERE id = 2;
UPDATE h SET g = 3 WHERE id = 5;
-- Combine the rows from G and H.
SELECT g.id AS g_id, g.x, h.id AS h_id, h.y FROM h
INNER JOIN g ON g.id = h.g;
```
This still requires the use of Common Table Expressions.
However, by now, I am fairly confident that this is OK.
Still, I am not 100% sure.
I think both approaches do work and I could not find an error with either of them.
But the two-table method is probably more efficient and more elegant.
@Akina was right.
Thomas Weise
(111 rep)
Apr 20, 2025, 09:00 AM
• Last activity: Apr 22, 2025, 05:34 AM
0
votes
1
answers
762
views
Can EXPLAIN be used to get some insights about Common Table Expression (CTE)?
I need to find the ancestors of a row in a table given an ID value, and a Google search pointed me to recursive Common Table Expressions (CTEs). I wonder whether EXPLAIN can be used to get some insight into how MySQL handles this and whether there is room for optimization.
Also, is this the best strategy to get ancestors in a table? I could do it in application code, but I don't want to, because that would lead to loops and probably multiple SELECT statements sent to the DB.
The SQL I wrote following the previous article is as follows:
```sql
WITH RECURSIVE page_revisions_path (id, page_id, parent_id) AS
(
    SELECT id, page_id, parent_id
    FROM page_revisions
    WHERE parent_id = 'some_id'
    UNION ALL
    SELECT c.id, c.page_id, c.parent_id
    FROM page_revisions_path AS cp JOIN page_revisions AS c ON cp.id = c.parent_id
)
SELECT * FROM page_revisions_path;
```
Here is also the `SHOW CREATE TABLE` query result:
```sql
CREATE TABLE `page_revisions` (
  `id` varchar(26) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `page_id` varchar(26) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `parent_id` varchar(26) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `routable` tinyint(1) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX1` (`page_id`),
  KEY `IDX2` (`parent_id`),
  CONSTRAINT `FK1` FOREIGN KEY (`parent_id`) REFERENCES `page_revisions` (`id`),
  CONSTRAINT `FK2` FOREIGN KEY (`page_id`) REFERENCES `pages` (`id`)
) ENGINE=InnoDB
```
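As a side note, the query above walks downward (it finds descendants of 'some_id'); for ancestors, the anchor should select the row itself and the recursive step should join on the parent going upward. A runnable sketch of that upward shape (SQLite standing in for MySQL; the revision ids are made up):

```python
# Sketch: an ancestor walk -- the anchor matches the starting row, and each
# recursive step follows parent_id upward (p.id = a.parent_id).
# SQLite stands in for MySQL here; r1..r3 are made-up ids.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE page_revisions (id TEXT PRIMARY KEY, page_id TEXT, parent_id TEXT);
    INSERT INTO page_revisions VALUES
        ('r1', 'p1', NULL),   -- root revision
        ('r2', 'p1', 'r1'),
        ('r3', 'p1', 'r2');   -- starting row
""")

rows = con.execute("""
    WITH RECURSIVE ancestors(id, parent_id) AS (
        SELECT id, parent_id FROM page_revisions WHERE id = 'r3'
        UNION ALL
        SELECT p.id, p.parent_id
        FROM ancestors a JOIN page_revisions p ON p.id = a.parent_id
    )
    SELECT id FROM ancestors
""").fetchall()
ids = [r[0] for r in rows]
print(ids)  # the starting row followed by its chain of ancestors
```

With the existing `IDX2 (parent_id)` and primary-key indexes, each recursive step is an index lookup, so EXPLAIN should show no full scans for either direction of walk.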
ReynierPM
(1888 rep)
Feb 7, 2022, 10:13 PM
• Last activity: Apr 18, 2025, 05:07 PM
0
votes
1
answers
632
views
How to make use of WITH in PostgreSQL to reduce query cost
I am trying to get twitter data of narendramodi using below command. SELECT b.t_id,a.profile_image,b.tweet_text,e.media_media_url,b.retweet_count,b.favorite_count as like_count,count(reply_to_status_id) as reply_count,f.imp_count,f.eng_count,f.eng_rate FROM twitter_users a LEFT JOIN twitter_tweets b...
I am trying to get twitter data of narendramodi using below command.
SELECT b.t_id,a.profile_image,b.tweet_text,e.media_media_url,b.retweet_count,b.favorite_count as like_count,count(reply_to_status_id) as reply_count,f.imp_count,f.eng_count,f.eng_rate
FROM twitter_users a
LEFT JOIN twitter_tweets b on a.user_id=b.user_id
LEFT JOIN replies c on b.t_id = c.t_id
LEFT JOIN media e on b.t_id = e.t_id
LEFT JOIN twitter_tweet_metric_aggregates f
ON f.metric_timestamp=(select max(metric_timestamp)
FROM twitter_tweet_metric_aggregates g
WHERE g.t_id=f.t_id and g.t_id=b.t_id)
WHERE a.twitter_screen_name= 'narendramodi'
GROUP BY b.t_id,a.profile_image
,b.tweet_text,b.retweet_count,b.favorite_count,
e.media_media_url,f.imp_count,f.eng_count,f.eng_rate;
The query returns correct results, but the sub-select used to get the most recent imp_count for each tweet (based on timestamp) makes the query cost huge, so execution takes more than 15 minutes. I want to reduce that so it executes within 10 seconds. For that reason I tried a WITH (CTE) expression:
WITH metric_counts AS (
SELECT max(metric_timestamp),f.t_id,f.imp_count,f.eng_count,f.eng_rate
FROM twitter_tweet_metric_aggregates f LEFT JOIN tweets b on
f.t_id=b.t_id
)
SELECT
b.t_id,a.profile_image,b.tweet_text,e.media_media_url,b.retweet_count
,b.favorite_count as like_count, count(reply_to_status_id) as
reply_count,metric_counts.imp_count
,metric_counts.eng_count,metric_counts.eng_rate
FROM twitter_users as a
LEFT JOIN tweets as b on a.twitter_user_id=b.twitter_user_id
LEFT JOIN replies c on b.t_id = c.t_id
LEFT JOIN media e on b.t_id = e.t_id
LEFT JOIN metric_counts on metric_counts.t_id = b.t_id
WHERE lower(a.twitter_screen_name)=lower('narendramodi')
GROUP BY b.t_id,a.profile_image,b.tweet_text,e.media_media_url,
b.retweet_count,b.favorite_count,
metric_counts.imp_count,metric_counts.eng_count,
metric_counts.eng_rate;
The WITH expression above also returns imp_count values for each tweet, but not the latest record/value. Can anyone help me achieve this?
Here is the query cost of the WITH query:
HashAggregate (cost=1734856.13..1735618.48 rows=76235 width=673)
So can anyone help me reduce the cost even further, so results come back within 15 seconds?
Query cost:
HashAggregate (cost=4923196.15..4923958.50 rows=76235 width=673) (actual time=51871.524..51872.333 rows=1513 loops=1)
  Group Key: imp_counts.tweet_status_id, a.profile_image, b.tweet_text, b.tweet_created_at, d.sentiment, d.emotion, b.retweet_count, b.favorite_count, e.media_media_url, imp_counts.tweet_impression_count, imp_counts.tweet_engagement_count, imp_counts.tweet_engagement_rate
  CTE imp_counts
    ->  Seq Scan on twitter_tweet_metric_aggregates f  (cost=0.00..3356389.12 rows=17676 width=47) (actual time=37516.805..41642.354 rows=60699 loops=1)
          Filter: (metric_timestamp = (SubPlan 2))
          Rows Removed by Filter: 3475596
          SubPlan 2
            ->  Result  (cost=0.90..0.91 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=3536295)
                  InitPlan 1 (returns $1)
                    ->  Limit  (cost=0.56..0.90 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=3536295)
                          ->  Index Only Scan Backward using pk_twitter_tweet_metric_aggregates on twitter_tweet_metric_aggregates g  (cost=0.56..55.57 rows=158 width=8) (actual time=0.010..0.010 rows=1 loops=3536295)
                                Index Cond: ((tweet_status_id = (f.tweet_status_id)::text) AND (metric_timestamp IS NOT NULL))
                                Heap Fetches: 3536295
  ->  Nested Loop Left Join  (cost=1202381.45..1564329.39 rows=76235 width=673) (actual time=50478.887..51854.010 rows=10672 loops=1)
        ->  Nested Loop Left Join  (cost=1202380.90..1472231.15 rows=76235 width=641) (actual time=50478.871..51781.010 rows=10649 loops=1)
              ->  Hash Right Join  (cost=1202380.34..1362479.87 rows=76235 width=626) (actual time=50478.841..51702.556 rows=10649 loops=1)
                    Hash Cond: ((c.tweet_status_id)::text = (b.tweet_status_id)::text)
                    ->  Seq Scan on twitter_tweet_replies c  (cost=0.00..150216.68 rows=2606068 width=38) (actual time=0.041..692.358 rows=2606068 loops=1)
                    ->  Hash  (cost=1201427.40..1201427.40 rows=76235 width=607) (actual time=50477.837..50477.837 rows=1499 loops=1)
                          Buckets: 131072  Batches: 1  Memory Usage: 1497kB
                          ->  Merge Left Join  (cost=1200957.10..1201427.40 rows=76235 width=607) (actual time=50446.646..50476.746 rows=1499 loops=1)
                                Merge Cond: ((b.tweet_status_id)::text = (imp_counts.tweet_status_id)::text)
                                ->  Sort  (cost=1199356.58..1199547.17 rows=76235 width=293) (actual time=8608.183..8608.530 rows=1499 loops=1)
                                      Sort Key: b.tweet_status_id
                                      Sort Method: quicksort  Memory: 597kB
                                      ->  Hash Right Join  (cost=29591.36..1193174.62 rows=76235 width=293) (actual time=3.714..8604.233 rows=1499 loops=1)
                                            Hash Cond: ((b.twitter_user_id)::text = (a.twitter_user_id)::text)
                                            ->  Seq Scan on twitter_tweets b  (cost=0.00..1095151.30 rows=18045230 width=230) (actual time=0.026..5037.377 rows=18044981 loops=1)
                                            ->  Hash  (cost=29214.36..29214.36 rows=30160 width=89) (actual time=0.025..0.025 rows=1 loops=1)
                                                  Buckets: 32768  Batches: 1  Memory Usage: 257kB
                                                  ->  Index Scan using twitter_screen_name_idx on twitter_users a  (cost=0.56..29214.36 rows=30160 width=89) (actual time=0.020..0.021 rows=1 loops=1)
                                                        Index Cond: (lower((twitter_screen_name)::text) = 'narendramodi'::text)
                                ->  Sort  (cost=1600.52..1644.71 rows=17676 width=314) (actual time=41838.456..41847.253 rows=60655 loops=1)
                                      Sort Key: imp_counts.tweet_status_id
                                      Sort Method: quicksort  Memory: 8540kB
                                      ->  CTE Scan on imp_counts  (cost=0.00..353.52 rows=17676 width=314) (actual time=37516.810..41684.563 rows=60699 loops=1)
              ->  Index Scan using tweet_sentiment_index on tweet_sentiment d  (cost=0.56..1.43 rows=1 width=34) (actual time=0.006..0.007 rows=0 loops=10649)
                    Index Cond: ((b.tweet_status_id)::text = (tweet_status_id)::text)
        ->  Index Only Scan using twitter_tweet_media_pkey on twitter_tweet_media e  (cost=0.55..1.20 rows=1 width=70) (actual time=0.006..0.006 rows=0 loops=10649)
              Index Cond: (tweet_status_id = (b.tweet_status_id)::text)
              Heap Fetches: 1074
Planning time: 2.913 ms
Execution time: 51875.165 ms
(43 rows)
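One way to express "latest metric row per tweet" without the correlated sub-select (the Seq Scan + SubPlan that dominates the plan above) is a window function; in Postgres, `SELECT DISTINCT ON (t_id) ... ORDER BY t_id, metric_timestamp DESC` is an equivalent idiom. A minimal sketch of the window-function shape, run through sqlite3 in Python for self-containment; the sample rows are invented, column names follow the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE twitter_tweet_metric_aggregates (
    t_id             TEXT,
    metric_timestamp INTEGER,
    imp_count        INTEGER
);
INSERT INTO twitter_tweet_metric_aggregates VALUES
    ('t1', 1, 10), ('t1', 2, 25),   -- latest row for t1 has imp_count 25
    ('t2', 1, 7),  ('t2', 3, 40);   -- latest row for t2 has imp_count 40
""")

# ROW_NUMBER() = 1 within each t_id, ordered newest-first, keeps exactly the
# latest metric row per tweet -- no correlated subquery per outer row.
rows = conn.execute("""
WITH metric_counts AS (
    SELECT t_id, metric_timestamp, imp_count,
           ROW_NUMBER() OVER (PARTITION BY t_id
                              ORDER BY metric_timestamp DESC) AS rn
    FROM twitter_tweet_metric_aggregates
)
SELECT t_id, imp_count FROM metric_counts WHERE rn = 1 ORDER BY t_id
""").fetchall()
print(rows)  # [('t1', 25), ('t2', 40)]
```

The `metric_counts` CTE can then be joined on `t_id` exactly as in the question's second attempt.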
bunny sunny
(113 rep)
Mar 7, 2019, 04:28 AM
• Last activity: Apr 11, 2025, 10:06 PM
0
votes
1
answers
3553
views
How to create multiple temp tables using records from a CTE that I need to call multiple times in Postgres plpgsql Procedure?
I am already using a CTE expression within a plpgsql Procedure to grab some Foreign Keys from (1) specific table, we can call it `master_table`. I created a brand new table, we can call this table `table_with_fks`, in my DDL statements so this table holds the FKs I am fetching and saving. I later ta...
I am already using a CTE expression within a plpgsql Procedure to grab some Foreign Keys from (1) specific table, we can call it `master_table`. I created a brand new table, we can call this table `table_with_fks`, in my DDL statements so this table holds the FKs I am fetching and saving.
I later take these FKs from my `table_with_fks` and JOIN on my other tables in my database to get the entire original record (the full record with all columns from its corresponding table) and insert it into an archive table.
I have an awesome lucid chart I drew that might make what I say down below make much more sense:

My CTE example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date <= now() - interval '1 year'
LIMIT 10000
)
INSERT INTO table_with_fks (SELECT * FROM fk_list_cte);
commit;
END LOOP;
Now, I have (4) other Procedures that JOIN on each FK in this `table_with_fks` with its **parent** table that it references. I do this because as I said, I only got the FK at first, and I don't have all the original columns for the record. So I will do something like
LOOP
EXIT WHEN some_condition;
WITH full_record_cte AS (
SELECT *
FROM table_with_fks fks
JOIN parent_table1 pt1
ON fks.fk1 = pt1.id
LIMIT 10000),
INSERT INTO (select * from full_record_cte);
commit;
END LOOP;
***NOW***, what I want to do is, instead of having to RE-JOIN 4 times later on these FKs that are found in my `table_with_fks`, use the first CTE `fk_list_cte` to JOIN on the parent tables right away and grab the full record from each of the (4) tables and put it in some TEMP postgres table. I think I will need (4) unique TEMP tables, as I don't know how it would work if I combine all their data into one BIG table, because each table has different data/different columns.
Is there a way to use the original CTE `fk_list_cte` and call it multiple times in succession and CREATE 4 TEMP tables right after, that all use the original CTE? Example:
LOOP
EXIT WHEN some_condition;
WITH fk_list_cte AS (
SELECT mt.fk1, mt.fk2, mt.fk3, mt.fk4
FROM master_table mt
WHERE mt.created_date = 1 year old
LIMIT 10000
),
WITH fetch_fk1_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table1 pt1
ON cte.fk1 = pt1.id
),
WITH fetch_fk2_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table2 pt2
ON cte.fk2 = pt2.id
),
WITH fetch_fk3_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table3 pt3
ON cte.fk3 = pt3.id
),
WITH fetch_fk4_original_record_from_parent AS (
SELECT *
FROM fk_list_cte cte
JOIN parent_table4 pt4
ON cte.fk4 = pt4.id
),
CREATE TEMPORARY TABLE fk1_tmp_tbl AS (
SELECT *
FROM fetch_fk1_original_record_from_parent
)
CREATE TEMPORARY TABLE fk2_tmp_tbl AS (
SELECT *
FROM fetch_fk2_original_record_from_parent
)
CREATE TEMPORARY TABLE fk3_tmp_tbl AS (
SELECT *
FROM fetch_fk3_original_record_from_parent
)
CREATE TEMPORARY TABLE fk4_tmp_tbl AS (
SELECT *
FROM fetch_fk4_original_record_from_parent
);
END LOOP;
I know the 4 `CREATE TEMPORARY TABLE` statements definitely won't work (can I create 4 temp tables simultaneously/at once?). Does anyone see the logic of what I am trying to do here and can help me?
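For what it's worth, a CTE is scoped to a single statement, so one `fk_list_cte` cannot feed four separate `CREATE TEMPORARY TABLE` statements. A common workaround is to materialize the FK list once into its own temp table and then build each per-parent temp table from it, one statement at a time. A reduced sketch (sqlite3 standing in for Postgres, two parent tables instead of four, invented sample rows):

```python
import sqlite3

# A CTE only lives for one statement, so the FK-collecting query is run once
# into a temp table; each CREATE TEMP TABLE ... AS then joins against it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table  (fk1 INTEGER, fk2 INTEGER);
CREATE TABLE parent_table1 (id INTEGER, payload TEXT);
CREATE TABLE parent_table2 (id INTEGER, payload TEXT);
INSERT INTO master_table  VALUES (1, 10);
INSERT INTO parent_table1 VALUES (1, 'a');
INSERT INTO parent_table2 VALUES (10, 'b');

-- Step 1: materialize the FK list exactly once.
CREATE TEMP TABLE fk_list AS
    SELECT fk1, fk2 FROM master_table;

-- Step 2: each per-parent temp table joins the materialized list (sequential
-- statements, not simultaneous).
CREATE TEMP TABLE fk1_tmp_tbl AS
    SELECT pt1.* FROM fk_list JOIN parent_table1 pt1 ON fk_list.fk1 = pt1.id;
CREATE TEMP TABLE fk2_tmp_tbl AS
    SELECT pt2.* FROM fk_list JOIN parent_table2 pt2 ON fk_list.fk2 = pt2.id;
""")
fk1_rows = conn.execute("SELECT * FROM fk1_tmp_tbl").fetchall()
fk2_rows = conn.execute("SELECT * FROM fk2_tmp_tbl").fetchall()
print(fk1_rows, fk2_rows)  # [(1, 'a')] [(10, 'b')]
```

In PostgreSQL the same shape works with `CREATE TEMPORARY TABLE ... AS`, and the per-batch LIMIT/date filter from the question's loop would go into the `fk_list` query.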
PainIsAMaster
(131 rep)
Oct 9, 2021, 04:01 AM
• Last activity: Apr 7, 2025, 12:15 AM
0
votes
1
answers
5330
views
Unrecognized configuration parameter when using set_config(text, text, boolean) in CTE
I'm trying to implement a simple ABAC system using row level security, with main policy defined as following: CREATE policy resource_access ON resource USING ( ( org_id::varchar = current_setting('scope.org.id', true) ) AND ( acl_read && regexp_split_to_array(current_setting('scope.acl'), ',')::varc...
I'm trying to implement a simple ABAC system using row level security, with the main policy defined as follows:
CREATE policy resource_access ON resource
USING (
(
org_id::varchar = current_setting('scope.org.id', true)
)
AND
(
acl_read && regexp_split_to_array(current_setting('scope.acl'), ',')::varchar[]
)
)
Issuing queries akin to:
WITH
acl AS (SELECT set_config('scope.acl', 'ACL', true) "__acl"),
result AS ( ... )
SELECT * FROM acl, result
The main reason to use **WITH** is to avoid multiple statements when the queries are later **PREPARE**d and **EXECUTE**d by the Postgres driver I'm using.
The **result** in the example above can contain any arbitrary queries required by the application. To ensure that **set_config** is executed in the query, it is also added to the final **SELECT**.
However, I still do consistently encounter the following error:
QueryFailedError: unrecognized configuration parameter "scope.acl"
Which appears to be caused by executing the subquery from WITH in isolation from the result query.
So the main questions are:
- Is there any elegant way to ensure running set_config before the main query (the one in result) is executed?
- Is there any better way to construct queries for the application side, to avoid using WITH, but keeping them as a single SQL statement?
Thank you!
Ivan C.
(1 rep)
Nov 7, 2020, 09:49 AM
• Last activity: Mar 21, 2025, 09:06 PM
1
votes
1
answers
82
views
How to adapt a recursive CTE over multiple tables?
I'm using PostgreSQL 17 --- I am modelling a package index for the Haskell ecosystem, and a feature that is useful is to determine transitive dependencies. Haskell packages can be normalised as: ``` Package (name + package-specific metadata) \-> Releases (version + release-specific metadata like syn...
I'm using PostgreSQL 17
---
I am modelling a package index for the Haskell ecosystem, and a feature that is useful is to determine transitive dependencies. Haskell packages can be normalised as:
Package
(name + package-specific metadata)
\-> Releases
(version + release-specific metadata like synopsis, attached data files)
\-> Components
(library, executable, test suite, benchmark suite)
\-> Dependencies
(Each component declares a dependency of a package name and version expression).
(Each of these sections are a table, and they are linked together by one-to-many relationships. One package links to many releases, each release links to many components, each component links to many dependencies)
For the purpose of my own enlightenment, I have first reduced the complexity of the model to create a CTE that does what I expect.
In particular, my codebase uses UUIDs rather than bigints as the tables' PKs.
(Full dbfiddle is available at https://dbfiddle.uk/hVOmMdYQ)
-- Data model where packages and versions are combined,
-- and dependencies refer to packages
create table packages (
package_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text unique not null,
version int[] not null
);
create unique index on packages(name, version);
create table dependencies (
dependency_id bigint generated always as identity PRIMARY KEY,
dependent_id bigint references packages(package_id),
depended_id bigint references packages(package_id)
);
create unique index on dependencies(dependent_id, depended_id);
And here is the data:
insert into packages (name, version) values ('base', '{1,0,0,0}');
insert into packages (name, version) values ('vector', '{0,0,7,0}');
insert into packages (name, version) values ('random', '{0,1,5,8}');
insert into packages (name, version) values ('unix', '{1,2,1,0}');
insert into packages (name, version) values ('time', '{3,14,1,2}');
insert into dependencies (dependent_id, depended_id) values (2, 1);
insert into dependencies (dependent_id, depended_id) values (3, 1);
insert into dependencies (dependent_id, depended_id) values (3, 2);
insert into dependencies (dependent_id, depended_id) values (4, 1);
insert into dependencies (dependent_id, depended_id) values (5, 1);
insert into dependencies (dependent_id, depended_id) values (5, 3);
insert into dependencies (dependent_id, depended_id) values (5, 4);
Here is a preliminary result:
select dependent.package_id, dependent.name as dependent, depended.name as depended
from dependencies as d1
inner join packages as dependent on d1.dependent_id = dependent.package_id
inner join packages as depended on d1.depended_id = depended.package_id;
| package_id | dependent | depended |
|-----------:|-----------|----------|
| 2 | vector | base |
| 3 | random | base |
| 3 | random | vector |
| 4 | unix | base |
| 5 | time | base |
| 5 | time | random |
| 5 | time | unix |
Up to now, everything looks good. I then made this recursive CTE to create a view of transitive dependencies, with breadcrumbs:
with recursive transitive_dependencies ( dependent_id, dependent, depended_id, breadcrumbs) as
( select dependent.package_id as dependent_id
, dependent.name as dependent
, depended.package_id as depended_id
, concat_ws(' > ', dependent.name, depended.name) as breadcrumbs
from dependencies as d1
inner join packages as dependent on d1.dependent_id = dependent.package_id
inner join packages as depended on d1.depended_id = depended.package_id
where dependent_id = 5
union all
select dependent.package_id as dependent_id
, dependent.name as dependent
, depended.package_id as depended_id
, concat_ws(' > ', t2.breadcrumbs, depended.name) as breadcrumbs
from dependencies as d1
inner join packages as dependent on d1.dependent_id = dependent.package_id
inner join packages as depended on d1.depended_id = depended.package_id
inner join transitive_dependencies as t2 on t2.depended_id = dependent.package_id -- ← This is where we refer to the CTE
)
cycle dependent_id set is_cycle using path
select t3.dependent_id
, t3.dependent
, t3.depended_id
, t3.breadcrumbs
from transitive_dependencies as t3;
| dependent_id | dependent | depended_id | breadcrumbs |
|-------------:|-----------|------------:|-------------------------------|
| 5 | time | 1 | time > base |
| 5 | time | 3 | time > random |
| 5 | time | 4 | time > unix |
| 3 | random | 1 | time > random > base |
| 3 | random | 2 | time > random > vector |
| 4 | unix | 1 | time > unix > base |
| 2 | vector | 1 | time > random > vector > base |
Behold, it works!
---
Now, I am looking into splitting things a bit further. Namely, package and release will be separated. This is due to the fact that there is some metadata specific to the Haskell ecosystem that targets the notion of "package" and some that is only relevant to "releases", and they are not interchangeable.

-- Data model where packages and releases are separated
create table packages2 (
package_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text unique not null
);
create table releases2 (
release_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
package_id bigint references packages2,
version text not null
);
create unique index on releases2(package_id, version);
create table dependencies2 (
dependency_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
release_id bigint references releases2 not null,
package_id bigint references packages2 not null,
requirement int[] not null
);
And here is the data
insert into packages2 (name) values ('base'); -- 1
insert into packages2 (name) values ('vector'); -- 2
insert into packages2 (name) values ('random'); -- 3
insert into packages2 (name) values ('unix'); -- 4
insert into packages2 (name) values ('time'); -- 5
insert into releases2 (package_id, version) values (1, '{1,0,0,0}');
insert into releases2 (package_id, version) values (2, '{0,0,7,0}');
insert into releases2 (package_id, version) values (3, '{0,1,5,8}');
insert into releases2 (package_id, version) values (4, '{1,2,1,0}');
insert into releases2 (package_id, version) values (5, '{3,14,1,2}');
insert into dependencies2 (release_id, package_id, requirement) values ( 2, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 3, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 3, 2, '>= 0.0.7.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 4, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 5, 1, '== 1.0.0.0' );
insert into dependencies2 (release_id, package_id, requirement) values ( 5, 3, … );
insert into dependencies2 (release_id, package_id, requirement) values ( 5, 4, … );
And here is the CTE adapted to the new schema:
with recursive transitive_dependencies2 ( dependent_id, dependent, dependency_id, breadcrumbs) as
( select p2.package_id as dependent_id
, p2.name as dependent
, p3.package_id as dependency_id
, concat_ws(' > ', p2.name, p3.name) as breadcrumbs
from dependencies2 as d0
-- Dependent releases
inner join releases2 as r1 on d0.release_id = r1.release_id
-- Dependent packages
inner join packages2 as p2 on r1.package_id = p2.package_id
-- Dependencies packages
inner join packages2 as p3 on d0.package_id = p3.package_id
where r1.release_id = 5
union
select p2.package_id as dependent_id
, p2.name as dependent
, p3.package_id as dependency_id
, concat_ws(' > ', p2.name, p3.name) as breadcrumbs
from dependencies2 as d0
-- Dependent releases
inner join releases2 as r1 on d0.release_id = r1.release_id
-- Dependent packages
inner join packages2 as p2 on r1.package_id = p2.package_id
-- Dependencies packages
inner join packages2 as p3 on d0.package_id = p3.package_id
inner join transitive_dependencies2 as t2 on t2.dependency_id = p2.package_id -- ← This is where we refer to the CTE
)
cycle dependent_id set is_cycle using path
select t3.dependent_id
, t3.dependent
, t3.dependency_id
, t3.breadcrumbs
from transitive_dependencies2 as t3;
Quite unfortunately, this does not give the expected result:
| dependent_id | dependent | dependency_id | breadcrumbs |
|-------------:|-----------|--------------:|-----------------|
| 5 | time | 1 | time > base |
| 5 | time | 3 | time > random |
| 5 | time | 4 | time > unix |
| 3 | random | 1 | random > base |
| 3 | random | 2 | random > vector |
| 4 | unix | 1 | unix > base |
| 2 | vector | 1 | vector > base |
My question is as follows: how can I build my intuition to further split a CTE that works over more granular tables? I'm still very new to all of this, and this is my first "real-world" use case of CTEs.
Happy to clarify or disambiguate things.
Through this, I'm also interested in best practices when it comes to data modelling. I was warned in the past against storing arrays of foreign keys, for instance, and to strive and reach normal forms and splitting entities that have different life cycles.
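One observable difference between the two queries: in the working first version the recursive member extends the parent row's breadcrumbs, while the second version rebuilds `concat_ws(' > ', p2.name, p3.name)` from scratch, so every path restarts at its immediate dependent. A sketch of the split schema with the breadcrumbs chained through the CTE row (`t2.breadcrumbs`) instead, run through sqlite3 as a stand-in for Postgres (with `||` in place of `concat_ws`, and `release_id` matching `package_id` in the invented sample data):

```python
import sqlite3

# Sample data mirrors the question: time(5) -> {base(1), random(3), unix(4)},
# random -> {base, vector(2)}, vector -> base, unix -> base.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packages2     (package_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE releases2     (release_id INTEGER PRIMARY KEY, package_id INTEGER);
CREATE TABLE dependencies2 (release_id INTEGER, package_id INTEGER);
INSERT INTO packages2 VALUES (1,'base'),(2,'vector'),(3,'random'),(4,'unix'),(5,'time');
INSERT INTO releases2 VALUES (1,1),(2,2),(3,3),(4,4),(5,5);
INSERT INTO dependencies2 VALUES (2,1),(3,1),(3,2),(4,1),(5,1),(5,3),(5,4);
""")

rows = conn.execute("""
WITH RECURSIVE transitive_dependencies2 (dependent_id, dependency_id, breadcrumbs) AS (
    SELECT p2.package_id, p3.package_id, p2.name || ' > ' || p3.name
    FROM dependencies2 d0
    JOIN releases2 r1 ON d0.release_id = r1.release_id
    JOIN packages2 p2 ON r1.package_id = p2.package_id
    JOIN packages2 p3 ON d0.package_id = p3.package_id
    WHERE r1.release_id = 5
    UNION
    -- Key difference: breadcrumbs grow from t2.breadcrumbs, not from p2.name.
    SELECT p2.package_id, p3.package_id, t2.breadcrumbs || ' > ' || p3.name
    FROM dependencies2 d0
    JOIN releases2 r1 ON d0.release_id = r1.release_id
    JOIN packages2 p2 ON r1.package_id = p2.package_id
    JOIN packages2 p3 ON d0.package_id = p3.package_id
    JOIN transitive_dependencies2 t2 ON t2.dependency_id = p2.package_id
)
SELECT breadcrumbs FROM transitive_dependencies2
""").fetchall()
crumbs = sorted(b for (b,) in rows)
for b in crumbs:
    print(b)
```

This reproduces the seven full paths from the single-table model (e.g. `time > random > vector > base`); the cycle-protection clause from the Postgres version is omitted since the sample graph is acyclic.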
Théophile Choutri de Tarlé
(84 rep)
Jan 27, 2025, 01:15 PM
• Last activity: Mar 21, 2025, 09:30 AM
6
votes
1
answers
666
views
CTE Error when installing sp_WhoIsActive
I have several identical (near as I can tell) SQL Servers where I've recently added sp_WhoIsActive (showing some folks how much I like this tool) but one of them will not let me create the stored procedure. I get an error I'm quite familiar with about a CTE needing the previous statement to end with...
I have several identical (near as I can tell) SQL Servers where I've recently added sp_WhoIsActive (showing some folks how much I like this tool) but one of them will not let me create the stored procedure. I get an error I'm quite familiar with about a CTE needing the previous statement to end with a semicolon. None of the other servers get this error.
I've boiled down the script to this to recreate the issue:
SET QUOTED_IDENTIFIER ON;
SET ANSI_PADDING ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET ANSI_WARNINGS ON;
SET NUMERIC_ROUNDABORT OFF;
SET ARITHABORT ON;
GO
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME = 'sp_WhoIsActive')
EXEC ('CREATE PROC dbo.sp_WhoIsActive AS SELECT ''stub version, to be replaced''')
GO
ALTER PROC dbo.sp_WhoIsActive
(
@filter sysname = ''
)
AS
BEGIN;
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET QUOTED_IDENTIFIER ON;
SET ANSI_PADDING ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET ANSI_WARNINGS ON;
SET NUMERIC_ROUNDABORT OFF;
SET ARITHABORT ON;
IF @filter IS NULL
BEGIN;
RAISERROR('Input parameters cannot be NULL', 16, 1);
RETURN;
END;
--SELECT 'FIZZ' AS BUZZ;
WITH
a0 AS
(SELECT 'FOO' AS BAR)
SELECT * FROM a0;
END;
GO
If you un-comment the `SELECT 'FIZZ' AS BUZZ;` line, the procedure gets created.
Anyone have an idea what is different about this server?
Things I've checked:
* database compatibility levels for all dbs are > 100 (master, model, etc are 140)
* SQL Servers are all 14.0.344.5
* No availability Groups
* SSMS 18.12.1
* I'm SA on all servers
* version 12 of sp_whoIsActive https://github.com/amachanic/sp_whoisactive/releases/tag/v12.00
Full text of actual error below:
Msg 319, Level 15, State 1, Procedure sp_WhoIsActive, Line 206 [Batch Start Line 11]
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 209 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 212 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 215 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 218 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 221 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 156, Level 15, State 1, Procedure sp_WhoIsActive, Line 225 [Batch Start Line 11]
Incorrect syntax near the keyword 'ORDER'.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 230 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 319, Level 15, State 1, Procedure sp_WhoIsActive, Line 443 [Batch Start Line 11]
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 446 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 449 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 452 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 455 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 458 [Batch Start Line 11]
Incorrect syntax near ','.
Msg 156, Level 15, State 1, Procedure sp_WhoIsActive, Line 462 [Batch Start Line 11]
Incorrect syntax near the keyword 'ORDER'.
Msg 102, Level 15, State 1, Procedure sp_WhoIsActive, Line 468 [Batch Start Line 11]
Incorrect syntax near ','.
Completion time: 2023-01-06T15:04:08.0850888-05:00
RandomCorpSQLDev
(63 rep)
Jan 6, 2023, 07:26 PM
• Last activity: Mar 15, 2025, 05:21 PM
0
votes
1
answers
470
views
How do I calculate a streak of data and show the data that corresponds to it on MySQL?
I'm currently working on a project with NBA stats and I want to check a player's longest streak of his highest amount of points on MySQL. I'm open to a 5-10 game streak to simplify it. For example: Kevin Durant scores 30, 32, 35, 36, 41 in 5 consecutive games and that's his highest one, how would I...
I'm currently working on a project with NBA stats and I want to check a player's longest streak of his highest amount of points on MySQL. I'm open to a 5-10 game streak to simplify it.
For example: Kevin Durant scores 30, 32, 35, 36, 41 in 5 consecutive games and that's his highest streak. How would I go about extracting the game stats and points, the length of the streak, AND the total amount of points during this streak, all together?
I've tried using multiple CTEs with window functions, but I confuse myself as I'm doing it. Is there an easy or simple way to do this at all?
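One common shape for this is the "gaps and islands" trick: keep only games meeting a threshold, then note that `game_no - ROW_NUMBER()` is constant within each consecutive run, so it labels the streak. A hedged sketch via Python's sqlite3 (the `games` table, player name, and 30-point threshold are all made up for illustration; the same SQL runs on MySQL 8+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (game_no INTEGER, player TEXT, points INTEGER);
INSERT INTO games VALUES
    (1,'Durant',28),(2,'Durant',30),(3,'Durant',32),(4,'Durant',35),
    (5,'Durant',36),(6,'Durant',41),(7,'Durant',22),(8,'Durant',31);
""")

# For qualifying games, (game_no - ROW_NUMBER()) stays constant across a
# consecutive run, so GROUP BY grp aggregates each streak as one island.
row = conn.execute("""
WITH qualified AS (
    SELECT game_no, points,
           game_no - ROW_NUMBER() OVER (ORDER BY game_no) AS grp
    FROM games
    WHERE player = 'Durant' AND points >= 30
)
SELECT COUNT(*)     AS streak_len,
       SUM(points)  AS streak_points,
       MIN(game_no) AS first_game,
       MAX(game_no) AS last_game
FROM qualified
GROUP BY grp
ORDER BY streak_len DESC, streak_points DESC
LIMIT 1
""").fetchone()
print(row)  # (5, 174, 2, 6) -- games 2..6, 30+32+35+36+41 points
```

Joining the winning `grp` back to `games` would recover the per-game stats for that streak.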
thisisforfun
(1 rep)
Jul 18, 2023, 05:49 PM
• Last activity: Feb 21, 2025, 03:04 PM
3
votes
2
answers
3644
views
What's the (Big O) computational complexity of a PostgreSQL recursive common table expression?
For example, taking this StackOverflow #44620695 question, [recursive path aggregation and CTE query for top-down tree postgres](https://stackoverflow.com/questions/44620695/recursive-path-aggregation-and-cte-query-for-top-down-tree-postgres/50685815#50685815) as an example, which uses a recursive C...
For example, taking this StackOverflow #44620695 question, [recursive path aggregation and CTE query for top-down tree postgres](https://stackoverflow.com/questions/44620695/recursive-path-aggregation-and-cte-query-for-top-down-tree-postgres/50685815#50685815) as an example, which uses a recursive CTE to traverse a tree structure to determine the paths from a starting node.
The screenshot above shows the input data, the recursive CTE result, and a visualization of the source data.
Recursive CTEs are iterative over the preceding result -- *right?* (as suggested in the accepted answer [here](https://stackoverflow.com/questions/35979198/understanding-steps-of-recursive-cte)) -- so would the time complexity be something like O(log n)?
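One way to check the intuition empirically: each iteration re-runs the recursive member on the rows the previous iteration produced, so the total work is roughly proportional to the rows emitted overall (the reachable nodes/edges), i.e. linear in the traversed subgraph rather than logarithmic. A small sketch using Python's sqlite3, counting rows a descendant walk emits over an invented chain of n nodes:

```python
import sqlite3

def reachable(n: int) -> int:
    """Rows emitted by a recursive walk down a chain 1 -> 2 -> ... -> n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (parent INTEGER, child INTEGER)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)",
                     [(i, i + 1) for i in range(1, n)])
    (count,) = conn.execute("""
        WITH RECURSIVE walk(node) AS (
            SELECT child FROM edges WHERE parent = 1
            UNION ALL
            SELECT e.child FROM edges e JOIN walk w ON e.parent = w.node
        )
        SELECT COUNT(*) FROM walk
    """).fetchone()
    return count

# Emitted rows grow linearly with the chain length: n - 1 each time.
print([reachable(n) for n in (10, 100, 1000)])  # [9, 99, 999]
```

With indexed lookups per step, that suggests something like O(n) for n reachable rows (or O(V + E) on a general graph), with the per-step join cost depending on the available indexes.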
Victoria Stuart
(191 rep)
Jun 8, 2018, 06:58 PM
• Last activity: Feb 16, 2025, 09:05 PM
Showing page 1 of 20 total questions