Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
2
votes
1
answers
1861
views
MariaDB Inner join is slow
Here is my SQL (with EXPLAIN):

    EXPLAIN SELECT * FROM GBI gbi INNER JOIN VF_KRED kred ON gbi.vendor = kred.vendor;

The EXPLAIN output was attached as a screenshot. Now, the `select` SQL takes ~10 seconds to produce the outcome. Below are the indexes on the tables:
- VF_KRED table: (screenshot)


Akshay Lokur
(121 rep)
Jun 25, 2019, 04:06 PM
• Last activity: Aug 5, 2025, 04:05 PM
0
votes
1
answers
1561
views
Query with left join not using index
The below query is taking about 1 second to run:

    SELECT `taxi_driver_mapping`.*,
           CONCAT("dpk00", taxi_driver_mapping.mapping_payment_plan_id) AS package_text,
           `people`.*,
           `taxi`.*,
           `company`.`company_name`,
           `country`.`country_name`,
           `state`.`state_name`,
           `city`.`city_name`,
           `peoplecreated`.`name` AS created_name,
           `peopleupdated`.`name` AS updated_name,
           device_owner
    FROM taxi_driver_mapping
    LEFT JOIN company ON taxi_driver_mapping.mapping_companyid = company.cid
    LEFT JOIN taxi ON taxi_driver_mapping.mapping_taxiid = taxi.taxi_id
    LEFT JOIN country ON taxi_driver_mapping.mapping_countryid = country.country_id
    LEFT JOIN state ON taxi_driver_mapping.mapping_stateid = state.state_id
    LEFT JOIN city ON taxi_driver_mapping.mapping_cityid = city.city_id
    LEFT JOIN people ON taxi_driver_mapping.mapping_driverid = people.id
    LEFT JOIN people AS peoplecreated ON taxi_driver_mapping.mapping_createdby = peoplecreated.id
    LEFT JOIN people AS peopleupdated ON taxi_driver_mapping.mapping_updatedby = peopleupdated.id
    LEFT JOIN driver_information ON taxi_driver_mapping.mapping_driverid = driver_information.driver_id
    WHERE (people.name LIKE '%abm1173%'
        OR people.lastname LIKE '%abm1173%'
        OR people.email LIKE '%abm1173%'
        OR company.company_name LIKE '%abm1173%'
        OR people.phone LIKE '%abm1173%'
        OR people.id LIKE '%abm1173%'
        OR people.username LIKE '%abm1173%'
        OR taxi.taxi_no LIKE '%abm1173%')
    ORDER BY mapping_id DESC
    LIMIT 10 OFFSET 0
Below is the explain plan for it:
+----+-------------+---------------------+--------+------------------+---------+---------+----------------------------------------------+-------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+--------+------------------+---------+---------+----------------------------------------------+-------+----------------------------------------------------+
| 1 | SIMPLE | taxi_driver_mapping | ALL | NULL | NULL | NULL | NULL | 78718 | Using temporary; Using filesort |
| 1 | SIMPLE | company | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_companyid | 1 | NULL |
| 1 | SIMPLE | taxi | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_taxiid | 1 | NULL |
| 1 | SIMPLE | country | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_countryid | 1 | NULL |
| 1 | SIMPLE | state | ALL | PRIMARY,state_id | NULL | NULL | NULL | 3 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | city | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | people | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_driverid | 1 | Using where |
| 1 | SIMPLE | peoplecreated | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_createdby | 1 | NULL |
| 1 | SIMPLE | peopleupdated | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_updatedby | 1 | NULL |
| 1 | SIMPLE | driver_information | eq_ref | PRIMARY | PRIMARY | 4 | Eztaxi.taxi_driver_mapping.mapping_driverid | 1 | NULL |
+----+-------------+---------------------+--------+------------------+---------+---------+----------------------------------------------+-------+----------------------------------------------------+
Below is the table definition for taxi_driver_mapping table:
    CREATE TABLE `taxi_driver_mapping` (
      `mapping_id` int(100) NOT NULL AUTO_INCREMENT,
      `mapping_driverid` int(100) NOT NULL,
      `mapping_taxiid` int(100) NOT NULL,
      `mapping_taxi_model_id` int(11) NOT NULL,
      `mapping_companyid` int(100) NOT NULL,
      `mapping_countryid` int(100) NOT NULL,
      `mapping_stateid` int(100) NOT NULL,
      `mapping_cityid` int(100) NOT NULL,
      `mapping_startdate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
      `mapping_enddate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
      `mapping_createdby` int(100) NOT NULL,
      `mapping_createdate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
      `mapping_status` varchar(3) NOT NULL DEFAULT 'A',
      `mapping_updatedby` int(11) DEFAULT NULL,
      `mapping_updatedate` timestamp NULL DEFAULT NULL,
      `mapping_payment_plan_id` int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (`mapping_id`),
      KEY `ndx_mapping_driverid` (`mapping_driverid`),
      KEY `ndx_mapping_taxiid` (`mapping_taxiid`),
      KEY `ndx_driver_status_taxi_cid` (`mapping_driverid`,`mapping_status`,`mapping_taxiid`,`mapping_companyid`),
      KEY `idx_test` (`mapping_status`,`mapping_driverid`,`mapping_taxiid`)
    ) ENGINE=InnoDB AUTO_INCREMENT=123027 DEFAULT CHARSET=latin1
Table definition for company table:
    CREATE TABLE `company` (
      `cid` int(100) NOT NULL AUTO_INCREMENT,
      `company_name` varchar(250) NOT NULL,
      `company_address` varchar(250) NOT NULL,
      `current_location` text NOT NULL,
      `latitude` varchar(250) NOT NULL,
      `longitude` varchar(250) NOT NULL,
      `bankname` varchar(100) NOT NULL,
      `bankaccount_no` varchar(250) NOT NULL,
      `company_country` int(11) NOT NULL,
      `company_state` int(11) NOT NULL,
      `company_city` int(11) NOT NULL,
      `header_bgcolor` varchar(25) NOT NULL,
      `menu_color` varchar(25) NOT NULL,
      `mouseover_color` varchar(25) NOT NULL,
      `time_zone` varchar(250) NOT NULL,
      `userid` int(11) NOT NULL,
      `company_status` varchar(3) NOT NULL DEFAULT 'A',
      `company_type` enum('0','1','2','4') NOT NULL DEFAULT '0' COMMENT '0-Company , 1-Kiosk , 2-Corporate 4-Restaurant',
      `drivers_count` int(11) NOT NULL DEFAULT '500',
      `created_by` int(11) DEFAULT NULL,
      `created_at` datetime DEFAULT NULL,
      `updated_by` int(11) DEFAULT NULL,
      `updated_at` datetime DEFAULT NULL,
      PRIMARY KEY (`cid`)
    ) ENGINE=InnoDB AUTO_INCREMENT=4122 DEFAULT CHARSET=latin1
I thought that the column mapping_companyid used in the first LEFT JOIN not being indexed was the reason for the full table scan, but even after adding an index on it, the query still takes about 1 second. Can someone please let me know what the issue could be here? Let me know if any more info is needed. Thanks
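As an aside that can be sketched with runnable code (SQLite through Python's sqlite3 here, not MySQL, and the mini-schema is made up, but the principle carries over): a LIKE pattern with a leading wildcard such as `'%abm1173%'` can never use a B-tree index, so every one of those OR'd predicates forces row-by-row filtering no matter which columns are indexed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, username TEXT)")
con.execute("CREATE INDEX idx_username ON people (username)")
# Make LIKE case-sensitive so the plain BINARY-collated index is
# eligible for SQLite's LIKE prefix optimization.
con.execute("PRAGMA case_sensitive_like = ON")

def plan(sql):
    # EXPLAIN QUERY PLAN rows end in a human-readable "detail" column.
    return [row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql)]

# Leading wildcard: no index can help, so the table is scanned.
scan = plan("SELECT * FROM people WHERE username LIKE '%abm1173%'")
# Prefix-only pattern: rewritten into an index range search.
search = plan("SELECT * FROM people WHERE username LIKE 'abm1173%'")
print(scan)    # a SCAN of people, no index
print(search)  # a SEARCH using idx_username
```

MySQL behaves the same way for the leading-wildcard case, which is one reason a search-box query like this tends to scan regardless of indexing; full-text indexes are the usual escape hatch.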
user5594148
(43 rep)
Jan 29, 2020, 06:36 PM
• Last activity: Aug 2, 2025, 03:02 PM
6
votes
1
answers
1035
views
PostgreSQL predicate not pushed down (through join conditions)
Consider the following data model in a PostgreSQL v13 system;
(diagram: parent-child data model)

Here, parent table `dim` contains a small set of reference data, and child table `fact` contains a much higher volume of records. A typical use case for these data sets would be to query all `fact::value` data belonging to a `dim::name`. Note that `dim::name` holds a UNIQUE constraint.
While I think this is a very common scenario, I was somewhat taken aback that the style of queries I've been using for years on other RDBMSs (Oracle, MSSQL) didn't perform _at all_ on PostgreSQL the way I imagined they would. That is, when querying a dataset (`fact`) using a highly selective, but implicit, predicate (`fact::dim_id` = X) through a join condition, I expect the index on `fact::dim_id` to be used (in a nested loop). Instead, a hash join is used, requiring a full table scan of `fact`.
**Question:** is there some way I can nudge the query planner into considering any predicate I issue on a joined relation to not need a full table scan? (without impacting other DB loads)
To illustrate the problem with an example, these tables are populated with some random data;
CREATE TABLE dim(
id SERIAL NOT NULL
, name TEXT NOT NULL
, CONSTRAINT pk_dim PRIMARY KEY (id)
, CONSTRAINT uq_dim UNIQUE (name)
);
CREATE TABLE fact(
id SERIAL NOT NULL
, dim_id INTEGER NOT NULL
, value TEXT
, CONSTRAINT pk_fact PRIMARY KEY (id)
, CONSTRAINT fk_facts_dim FOREIGN KEY (dim_id) REFERENCES dim (id)
);
CREATE INDEX idx_fact_dim ON fact(dim_id);
INSERT INTO dim(name)
SELECT SUBSTRING(md5(random()::TEXT) FOR 5)
FROM generate_series(1,50)
UNION
SELECT 'key';
INSERT INTO fact(dim_id, value)
SELECT (SELECT id FROM dim ORDER BY random() LIMIT 1)
, md5(random()::TEXT)
FROM generate_series(1,1000000);
ANALYZE dim;
ANALYZE fact;
EXPLAIN ANALYZE
SELECT f.*
FROM fact AS f
JOIN dim AS d
ON (d.id = f.dim_id)
WHERE d.name = 'key'; -- Note: UNIQUE
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1001.65..18493.29 rows=20588 width=41) (actual time=319.331..322.582 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=1.65..15434.49 rows=8578 width=41) (actual time=306.193..306.195 rows=0 loops=3)
Hash Cond: (f.dim_id = d.id)
-> Parallel Seq Scan on fact f (cost=0.00..14188.98 rows=437498 width=41) (actual time=0.144..131.050 rows=350000 loops=3)
-> Hash (cost=1.64..1.64 rows=1 width=4) (actual time=0.138..0.139 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on dim d (cost=0.00..1.64 rows=1 width=4) (actual time=0.099..0.109 rows=1 loops=3)
Filter: (name = 'key'::text)
Rows Removed by Filter: 50
Planning Time: 1.059 ms
Execution Time: 322.662 ms
Now, we execute the same query, but instead of filtering using an inner join, we filter using a scalar subquery;
EXPLAIN ANALYZE
SELECT *
FROM fact
WHERE dim_id = (SELECT id FROM dim WHERE name = 'key');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_fact_dim on fact (cost=2.07..15759.53 rows=524998 width=41) (actual time=0.096..0.097 rows=0 loops=1)
Index Cond: (dim_id = $0)
InitPlan 1 (returns $0)
-> Seq Scan on dim (cost=0.00..1.64 rows=1 width=4) (actual time=0.046..0.054 rows=1 loops=1)
Filter: (name = 'key'::text)
Rows Removed by Filter: 50
Planning Time: 0.313 ms
Execution Time: 0.156 ms
As shown, the performance difference is huge. Somehow, the query planner did not consider the predicate on the unique `dim::name` attribute to be equivalent to a predicate on `fact::dim_id` in the first query.
Michiel T
(161 rep)
Jan 15, 2021, 10:55 PM
• Last activity: Jul 25, 2025, 03:10 AM
1
votes
1
answers
48
views
Postgres query planner join selectivity greater than 1?
I am using PostgreSQL 14.17.
I am trying to debug a query planner failure in a bigger query, but I think I've narrowed down the
problem to a self-join on a join table:
SELECT t2.item_id
FROM item_sessions t1
JOIN item_sessions t2
ON t1.session_key = t2.session_key
WHERE t1.item_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
After running ANALYZE on the table, EXPLAIN gives this plan (which matches the subplan in the larger query):
Nested Loop (cost=1.12..119.60 rows=7398 width=16)
-> Index Only Scan using item_sessions_item_id_session_key_uniq on item_sessions t1 (cost=0.56..8.58 rows=1 width=33)
Index Cond: (item_id = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'::uuid)
-> Index Only Scan using item_sessions_session_idx on item_sessions t2 (cost=0.56..110.11 rows=91 width=49)
Index Cond: (session_key = (t1.session_key)::text)
**Why is the loop estimating 7398 rows when the two child nodes estimate 1 and 91 respectively?** I would have expected the loop total to be no more than 1 * 91.
FWIW, the child estimates seem correct. item_id has n_distinct at -0.77649677, so the expected row count is 1.3, and session_key has n_distinct at 149555 out of an estimated 1.36e+07 tuples, which gives 90.9 expected tuples per session_key.
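The child-node arithmetic quoted above checks out; a quick replay of the numbers from the question:

```python
# Figures quoted in the question, replayed.
est_tuples = 1.36e7              # planner's estimated table size
n_distinct_session = 149555      # n_distinct for session_key
rows_per_session = est_tuples / n_distinct_session
print(round(rows_per_session, 1))   # 90.9, matching the t2 child estimate of 91

# Naive nested-loop expectation: outer estimate * inner rows per key.
expected_loop = 1 * rows_per_session
print(round(expected_loop))         # 91 -- far below the plan's 7398 total
```

So the puzzle really is in how the planner combines the two child estimates, not in either child estimate on its own.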
The indexes referenced in the plan are:
- item_sessions_session_idx btree (session_key, item_id)
- item_sessions_item_id_session_key_uniq UNIQUE CONSTRAINT, btree (item_id, session_key)
ETA: I created a minimal reproduction [here](https://github.com/felipeochoa/pg-plan-selectivity-gt1) . The failure is visible [in the job logs](https://github.com/felipeochoa/pg-plan-selectivity-gt1/actions/runs/16359411460/job/46224463766) on 17.5, 16.9, and 15.13
Felipe
(317 rep)
Jul 17, 2025, 04:41 AM
• Last activity: Jul 18, 2025, 09:14 AM
0
votes
1
answers
181
views
PostgreSQL CTE Recursion - How to Improve Performance for Parent->Child Tree (8 levels deep, 890 roots)
I am working on a query that performs PostgreSQL CTE Recursion in PostgreSQL 13. This query is intended to recurse from the root nodes to the deepest leaf node with certain properties (labeled 'begat' in the resource_dependency table, with a matching "sample type" property from another table).
https://explain.dalibo.com/plan/g2d2f5fg9c563b0d#plan
https://explain.depesz.com/s/eaRu
It seems like both the CTE table construction and, even more, the scanning of the CTE to apply the constraints are the most time-consuming parts of the query. Most work appears to be done in memory with the given settings, yet we still seem to scan 780 GB of cache (cache hits), which looks like duplicate work.
The actual number of nodes in any given tree may be 1000-5000. In this operation, we are starting from 890 root nodes. There are both 1:many and many:1 relationships in this tree (splitting, recombining).
Doing some experiments in a blank environment I notice that:
- To get all parent->child resource_ids in a single column, we need to do a left join so all leaf nodes will have a `null` child row with their calculated depth.
- We track the path to ensure we don't get into an infinite cycle. While the path is important for avoiding cycles, it is not relevant for the final calculation, and some paths duplicate information for the same root -> descendant. This makes the CTE table much larger.
Are there any techniques that can be used here to make this recursion less painful? We need to match each starting root to only its deepest leaf node reachable through 'begat' relationships and having the matching 'sample type' property.
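For reference, the pattern being described — a path column as the cycle guard, then keeping only the deepest row per root — can be sketched in miniature like this (SQLite through Python; the table name matches the question, but the data and the root-detection logic are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE resource_dependency (parent TEXT, child TEXT, relation TEXT);
INSERT INTO resource_dependency VALUES
  ('r1','a','begat'), ('a','b','begat'), ('b','c','begat'),
  ('r2','x','begat'), ('x','y','begat');
""")

rows = con.execute("""
WITH RECURSIVE walk(root, node, depth, path) AS (
    -- anchor: every node that never appears as a child is a root
    SELECT parent, parent, 0, parent
    FROM (SELECT DISTINCT parent FROM resource_dependency
          WHERE parent NOT IN (SELECT child FROM resource_dependency))
  UNION ALL
    -- step: follow 'begat' edges, refusing nodes already on the path
    SELECT w.root, d.child, w.depth + 1, w.path || '>' || d.child
    FROM walk AS w
    JOIN resource_dependency AS d
      ON d.parent = w.node AND d.relation = 'begat'
    WHERE instr(w.path, d.child) = 0   -- crude cycle guard; delimit ids in real data
)
-- SQLite lets a bare column ride along with a lone MAX(): 'node' comes
-- from the row that achieves the maximum depth for each root.
SELECT root, node, MAX(depth) AS depth
FROM walk
GROUP BY root
ORDER BY root
""").fetchall()
print(rows)
```

In PostgreSQL the final step would instead be `DISTINCT ON (root) ... ORDER BY root, depth DESC` (or a window function), and one common mitigation for the path blow-up the question describes is to aggregate to max depth per (root, node) before the final ranking, so duplicate paths collapse early.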
Justin Lowen
(68 rep)
Oct 17, 2024, 05:14 PM
• Last activity: Jul 16, 2025, 06:02 PM
1
votes
1
answers
177
views
Shredding Query Plan XML For Potential Skewed Estimates - Data Type Conversions
I'm using the below T-SQL to check for root-level estimate skews in execution plans, by pulling the estimated rows from the execution plans and comparing them to the last rows from the sys.dm_exec_query_stats DMV. Obviously I can't do this calculation as written due to the data type mismatch: converting from varchar to bigint won't work in this scenario. Is there any way around this?
SELECT DB_NAME(qt.dbid) AS [Database], cp.objtype AS [Cached_Plan_Type],
cp.usecounts AS [Use_Counts],
qp.query_plan.value('(//@CardinalityEstimationModelVersion)','INT') AS [CE_Version],
qp.query_plan.value('(//@EstimateRows)', 'varchar(128)') AS [Estimated_Rows],
qs.last_rows AS [Last Rows],
--(qp.query_plan.value('(//@EstimateRows)', 'varchar(128)') - qs.last_rows) AS [Estimate_Skew],
qs.total_logical_reads / qs.execution_count AS [Avg_Logical_Reads],
CAST((qs.total_elapsed_time ) / 1000000.0/ qs.execution_count AS DECIMAL(28, 2)) AS [Average Execution time(s)],
CAST((qs.total_worker_time) / 1000000.0/ qs.execution_count AS DECIMAL(28, 2)) AS [CPU Time Average (s)],
qt.text AS [SQL_Statement],
qs.query_hash AS [QueryHash],
qp.query_plan AS [QueryPlan]
FROM sys.dm_exec_cached_plans cp WITH (NOLOCK)
CROSS APPLY sys.dm_exec_query_plan (cp.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text (cp.plan_handle) qt
INNER JOIN sys.dm_exec_query_stats qs ON qs.plan_handle = cp.plan_handle
WHERE qt.text NOT LIKE '%sys.%'
ORDER BY [CPU Time Average (s)] DESC OPTION(RECOMPILE);
GO
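One hedged suggestion: `@EstimateRows` is a decimal string, occasionally in scientific notation, so an integer cast fails where a float cast succeeds. In T-SQL that would mean extracting the attribute as `float` in the `value()` call, or wrapping the varchar in `TRY_CONVERT(float, ...)` (SQL Server 2012+), before subtracting `qs.last_rows`. The distinction in miniature (Python, with made-up values):

```python
# Values of the shape @EstimateRows takes in showplan XML (made up here).
samples = ["11.4867", "2.61972", "1e+006"]

rejected = []
for s in samples:
    try:
        int(s)                 # bigint-style cast: fails on all of these
    except ValueError:
        rejected.append(s)

estimates = [float(s) for s in samples]   # float-style cast: parses all of them
skew = estimates[0] - 10                  # e.g. estimated 11.4867 vs 10 actual rows
print(rejected, estimates, round(skew, 4))
```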
Fza
(652 rep)
Nov 18, 2016, 04:35 PM
• Last activity: Jul 15, 2025, 05:10 PM
0
votes
1
answers
172
views
Query plan XML: one stmt's DegreeOfParallelism is 1 for any maxdop other than 1 (maxdop 1 leads to DegreeOfParallelism = 0 with reason MaxDopSetToOne)
I have a procedure with two SELECTs. One statement respects the server's/database's/query hint's maxdop setting and uses parallelism, the other is being difficult and never going parallel.
The 8 core server and this db are configured for maxdop 0 (not great, and I can change it, but I'd like to leave it and figure out what's going on). Cost threshold is 5.
Auto update stats is on and stats on the table are showing as being updated earlier today. I may try updating the stats on the table manually after hours tonight.
This server and db are an AG secondary replica. The same procedure on the primary runs fast and both statements go parallel. The primary has more cores, and its maxdop is set explicitly to 8.
Statement 1 query hints
- No query hint -> DegreeOfParallelism = 1
- Maxdop 0 -> DegreeOfParallelism = 1
- Maxdop 1 -> DegreeOfParallelism = 0 (reason MaxDopSetToOne)
- Maxdop 8 -> DegreeOfParallelism = 1
- Maxdop 7 -> DegreeOfParallelism = 1
Statement 2 query hints
- No query hint -> DegreeOfParallelism = 8
- Maxdop 0 -> DegreeOfParallelism = 8
- Maxdop 1 -> DegreeOfParallelism = 0 (reason MaxDopSetToOne)
- Maxdop 8 -> DegreeOfParallelism = 8
- Maxdop 7 -> DegreeOfParallelism = 7
Thanks for any help!
Bobogator
(95 rep)
Jan 24, 2022, 05:31 PM
• Last activity: Jul 9, 2025, 12:05 PM
1
votes
2
answers
163
views
How to make Postgres use the correct index here?
Here's a test setup:

    CREATE TABLE t (
        filename int,
        cropped bool not null default false,
        resized bool not null default false,
        create_date date not null default '1970-01-01'
    );

    INSERT INTO t
    SELECT generate_series(1, 1000000);

    UPDATE t SET cropped = true, resized = true
    WHERE filename IN (SELECT filename FROM t ORDER BY random() LIMIT 900000);

    UPDATE t SET resized = false
    WHERE filename IN (SELECT filename FROM t WHERE cropped = true ORDER BY random() LIMIT 1000);

    VACUUM FULL t;
    ANALYZE t;

[Fiddle](https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f356fd56a920ea8a93c192f5a8c16b1c)

Data now looks like this:

    SELECT cropped, resized, count(*)
    FROM t
    GROUP BY 1,2;

(result shown as a screenshot)

I create some partial indexes for other queries and one index specifically for the query below:

    CREATE INDEX idx_cropped ON t(cropped) WHERE NOT cropped;
    CREATE INDEX idx_resized ON t(resized) WHERE NOT resized;
    CREATE INDEX specific ON t(cropped,resized) WHERE cropped AND NOT resized;
Now I run my query:
    EXPLAIN ANALYZE
    SELECT count(*) FROM t WHERE cropped AND NOT resized AND create_date < …;

On my machine this picks the wrong index:

    Index Scan using idx_resized on t  (cost=0.29..3777.71 rows=89415 width=0) (actual time=478.177..478.480 rows=1000 loops=1)
      Filter: (cropped AND (create_date < …))

taking almost half a second, whereas the plan I expect is:

    Bitmap Heap Scan on t  (cost=35.50..11652.73 rows=89415 width=0) (actual time=0.054..0.250 rows=1000 loops=1)
      Recheck Cond: (cropped AND (NOT resized))
      Filter: (create_date < …)
      ->  Bitmap Index Scan on specific  (cost=0.00..13.15 rows=89415 width=0) (actual time=0.040..0.040 rows=1000 loops=1)

which uses the index `specific` and completes in less than a millisecond on both dbfiddle.uk and my machine.
-----
Edit: Additional mystery - when I set the values not with an UPDATE but with a DEFAULT, then the correct index is chosen. What is going on? [Fiddle](https://dbfiddle.uk/?rdbms=postgres_14&fiddle=dc7d8aea14e90f08ab6537a855f34d8c)
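As an aside, the matching rule itself — a query whose WHERE clause implies the partial index's predicate may use that index — is easy to reproduce in miniature (SQLite through Python rather than Postgres, booleans as integers; this illustrates only index eligibility, not the Postgres cost-model behavior the question is about):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE t (
    filename INTEGER,
    cropped  INTEGER NOT NULL DEFAULT 0,
    resized  INTEGER NOT NULL DEFAULT 0)""")
# ~90% cropped, none resized -- a rough stand-in for the question's data.
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [(i, 1 if i % 10 else 0, 0) for i in range(1, 1001)])
# Partial index mirroring "specific" from the question.
con.execute("CREATE INDEX specific ON t (cropped, resized) "
            "WHERE cropped AND NOT resized")

plan = [row[-1] for row in con.execute(
    "EXPLAIN QUERY PLAN SELECT count(*) FROM t WHERE cropped AND NOT resized")]
print(plan)   # detail line should mention the covering partial index "specific"
```

The Postgres question, then, is not whether `specific` is usable but why the cost estimate (rows=89415 vs the actual 1000) makes the other index look cheaper.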
AndreKR
(607 rep)
Apr 18, 2022, 01:20 AM
• Last activity: Jul 6, 2025, 04:06 PM
0
votes
1
answers
190
views
Is there any option to get actual execution plan as xml value to insert do different table?
Do you know if there is any option to get the actual execution plan (shown as a screenshot) and insert this plan into another table? I would like to have a table with a history of actual plans.

grochu
(45 rep)
Sep 7, 2021, 09:11 AM
• Last activity: Jun 25, 2025, 02:01 AM
0
votes
1
answers
89
views
Gather Streams plan operator bad performance
How do I explain and troubleshoot the "final" Gather Streams query plan operator, which takes seconds, while the previous operators complete within one second?
These are the left-most plan operators. I can't share the full plan for security reasons (sorry about that).
The plan metrics I follow don't look suspicious to me:
- Threads are pretty evenly distributed
- No prevalent wait type (as far as the plan info goes)
- Duration captured via Extended Event fits the Duration inside the plan stats
Is it that the gathering itself is so time-consuming on that number of rows ("just" 100k+)?
**Additional info:**
Here is the obfuscated query plan
SQL Sentry Plan Explorer gives a different view of the problematic operator with Costs by CPU, so this could be a wrong interpretation of numbers (which numbers(?)) by SSMS.
**Additional info #2:**
The obfuscated query plan I provide comes from the *query_post_execution_showplan* XE.
The query typically (in most executions) runs for about 1 second. Only some executions go beyond 5 seconds (my XE is filtered for Duration > 5 s).



jericzech
(977 rep)
Jun 17, 2025, 11:40 AM
• Last activity: Jun 20, 2025, 07:06 PM
1
votes
0
answers
25
views
Dynamic sampling works differently than documented
Dynamic sampling in Oracle versions 12c to 19 is documented as functioning the same if it is enabled via the init parameter or by hinting.
This does not seem to be true.
I have tested this in customer databases, personal databases, on Linux and windows, and it always results in the same conclusion.
When dynamic sampling at level 2 is only enabled via the init parameter, dynamic sampling occurs for objects without statistics, as advertised. When there is a `/*+ dynamic_sampling(2) */` hint in the query, you get sampling for functions, multi-column joins, and maybe more.
Here is an example to demonstrate my point:
- Connect to the database and query the sampling queries:
select sql_id, executions, sql_text from v$sql
where parsing_schema_name = user and sql_text like 'SAMPLESUB';
- Confirm the init parameter is set to 2
show parameter optimizer_dynamic_sampling;
- Put something into the plan table
explain plan for select * from user_tables;
- Look at the sampling queries again
- Query the plan table with no hints
select * from table(dbms_xplan.display);
- Look at the sampling queries again
- Query the plan table with the hint
select /*+ dynamic_sampling(2) */ * from table(dbms_xplan.display);
- Look at the sampling queries again, you will see more sampling queries after using the hint
Explain plans and optimizer traces provide much more concrete data to confirm this.
andrew123
(11 rep)
Jun 17, 2025, 07:43 PM
• Last activity: Jun 17, 2025, 09:19 PM
14
votes
1
answers
2948
views
How does SQL Server's optimizer estimate the number of rows in a joined table?
I am running this query in the AdventureWorks2012 database:
SELECT
s.SalesOrderID,
d.CarrierTrackingNumber,
d.ProductID,
d.OrderQty
FROM Sales.SalesOrderHeader s
JOIN Sales.SalesOrderDetail d
ON s.SalesOrderID = d.SalesOrderID
WHERE s.CustomerID = 11077
If I look at the estimated execution plan, I see the following:

The initial index seek (top right) is using the IX_SalesOrderHeader_CustomerID index and searching on the literal 11077. It has an estimate of 2.6192 rows.

If I use

    DBCC SHOW_STATISTICS ('Sales.SalesOrderHeader', 'IX_SalesOrderHeader_CustomerID') WITH HISTOGRAM

it shows that the value 11077 is between the two sampled keys 11019 and 11091. The average number of distinct rows between 11019 and 11091 is 2.619718, or rounded to 2.61972, which is the value of estimated rows shown for the index seek.

The part I don't understand is the estimated number of rows for the clustered index seek against the SalesOrderDetail table. If I run

    DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', 'PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID')

the density of the SalesOrderID (which I am joining on) is 3.178134E-05. That means that 1/3.178134E-05 (31465) equals the number of unique SalesOrderID values in the SalesOrderDetail table.

If there are 31465 unique SalesOrderIDs in SalesOrderDetail, then with an even distribution the average number of rows per SalesOrderID is 121317 (the total number of rows) divided by 31465. The average is 3.85561.

So if the estimated number of rows to be looped through is 2.61972, and the average number returned is 3.85561, then I would think the estimated number of rows would be 2.61972 * 3.85561 = 10.10062. But the estimated number of rows is 11.4867.

I think my understanding of the second estimate is incorrect, and the differing numbers seem to indicate that. What am I missing?
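The question's arithmetic can be replayed directly (all figures taken from the DBCC SHOW_STATISTICS output quoted above):

```python
# Density-based join estimate, under the uniform-distribution
# assumption described in the question.
density = 3.178134e-05     # SalesOrderID density from DBCC SHOW_STATISTICS
total_rows = 121317        # rows in Sales.SalesOrderDetail
distinct_ids = round(1 / density)
avg_rows_per_id = total_rows / distinct_ids

outer_estimate = 2.61972   # index seek estimate on SalesOrderHeader
naive_estimate = outer_estimate * avg_rows_per_id
print(distinct_ids, round(avg_rows_per_id, 4), round(naive_estimate, 4))
# 31465 distinct ids, ~3.8556 rows each, ~10.1 expected -- not the 11.4867 shown
```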
8kb
(2639 rep)
Apr 2, 2015, 04:46 PM
• Last activity: Jun 10, 2025, 08:26 AM
0
votes
1
answers
5079
views
Costly HASH MATCH Aggregate
I have a view which is slow when I query a simple SELECT statement with a `where` clause. The WHERE-clause column is indexed (a non-clustered index). Here is the view and the plan (execution plan attached as a screenshot).

(Note: when I SELECT TOP 100 or 1000, the plan doesn't show the HASH MATCH (Aggregate).)
CREATE VIEW [dbo].[CurrentIncidentStatus]
AS
SELECT [incidentDetails].[IncidentStatus], [incidentDetails].[IncidentStatus_FieldValue], [incidentDetails].[IncidentStatus_Description], Report.Id ReportId, Form.Id FormId
FROM
[dbo].[IncidentDetailsPage_Incident] incidentDetails WITH (NOLOCK)
INNER JOIN [dbo].[IncidentDetailsPages] detailsPage WITH (NOLOCK)
ON incidentDetails.PageId = detailsPage.Id
INNER JOIN Form WITH (NOLOCK)
ON detailsPage.FormId = Form.Id
INNER JOIN Report WITH (NOLOCK)
ON Form.ReportId = Report.Id
LEFT OUTER JOIN dbo.IncidentDetailsPage_Supplements supplement WITH (NOLOCK)
ON detailsPage.Id = supplement.PageId
INNER JOIN
(
SELECT ReportId, Max(FormNumber) RecentFormNumber FROM
(
SELECT Report.Id ReportId, FormId, COALESCE(SupplementNumber, '0000') formNumber
FROM dbo.IncidentDetailsPages detailPage WITH (NOLOCK)
INNER JOIN Form WITH (NOLOCK)
ON detailPage.FormId = Form.Id
INNER JOIN Report WITH (NOLOCK)
ON Form.ReportId = Report.Id
INNER JOIN dbo.IncidentDetailsPage_Incident incident WITH (NOLOCK)
ON detailPage.Id = incident.PageId
LEFT OUTER JOIN dbo.IncidentDetailsPage_Supplements supplement WITH (NOLOCK)
ON detailPage.Id = supplement.PageId) FormNumbers
GROUP By ReportId) RecentForm
ON Report.Id = RecentForm.ReportId AND
RecentForm.RecentFormNumber = COALESCE(supplement.SupplementNumber, '0000')
GO
I analyzed the plan in Sentry Plan Explorer and learned that the HASH MATCH (Aggregate) is the costly operation in the plan. I am trying to avoid/remove that hash aggregate somehow. If any of you have experienced a similar situation, please give some suggestions.

Additional info: the plan says the hash key is built on table `Report` and column `Id`. In fact, the `Id` column is the primary key in the `Report` table, so a clustered index is present, and a non-clustered index is also created for the `Id` column. Still, why is hashing required for `Report.Id`?

Here is the screenshot of the HASH MATCH Aggregate operation.

user9516827
(1345 rep)
Nov 1, 2018, 03:05 PM
• Last activity: Jun 5, 2025, 06:07 AM
0
votes
1
answers
261
views
Query performance discrepancy between Azure and on premise SQL Server
I have the same database on both SQL Azure and on premise. Most of my queries run in similar time on the two instances but I have a handful of queries where the performance difference is on the order of 60x. I suspected that maybe I just didn't have a sufficiently large instance so I scaled to a 4 vCore instance with ~20 gig of memory. As my entire database is only 4.5 GB this seemed like it would be able to hold the entire database in memory.
Turning on IO stats and running the query on the on-premise box yields results like the following.

**Update**

I reduced the query down to a simpler query which shows some of the same performance issues:
Table 'tblCurrentPlateActivity'. Scan count 1, logical reads 201, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblViolationStatus'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblCostStructureTier'. Scan count 101, logical reads 401, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblViolation'. Scan count 474, logical reads 2106236, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblCostStructure'. Scan count 0, logical reads 190, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblProvinceState'. Scan count 0, logical reads 200, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblGroup'. Scan count 0, logical reads 200, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#GroupIDs___________________________________________________________________________________________________________00000000011F'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblLot'. Scan count 0, logical reads 215, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#LotIDs_____________________________________________________________________________________________________________000000000120'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#PlateProvinceStateIDs______________________________________________________________________________________________000000000122'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#ProvinceStatesIDs__________________________________________________________________________________________________000000000121'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblViolationType'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#LotDate____________________________________________________________________________________________________________00000000011E'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see, all reads are logical and the largest culprit is tblViolation,
which I would expect. On SQL Azure I see similar results for most tables:
Table 'tblCurrentPlateActivity'. Scan count 1, logical reads 201, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblViolationStatus'. Scan count 1, logical reads 300, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblCostStructureTier'. Scan count 101, logical reads 401, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 200, logical reads 797, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblViolation'. Scan count 1, logical reads 2082323, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblCostStructure'. Scan count 0, logical reads 190, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#LotDate____________________________________________________________________________________________________________000000000072'. Scan count 1, logical reads 1336728, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblProvinceState'. Scan count 0, logical reads 1336728, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblGroup'. Scan count 0, logical reads 1336728, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#GroupIDs___________________________________________________________________________________________________________000000000073'. Scan count 1, logical reads 668364, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblLot'. Scan count 0, logical reads 1378509, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#PlateProvinceStateIDs______________________________________________________________________________________________000000000076'. Scan count 1, logical reads 668364, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#ProvinceStatesIDs__________________________________________________________________________________________________000000000075'. Scan count 1, logical reads 673785, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#LotIDs_____________________________________________________________________________________________________________000000000074'. Scan count 1, logical reads 871241, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblViolationType'. Scan count 0, logical reads 1347570, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
The query plans generated are pretty different, likely because the on premise server is 2012. But looking at the query execution in SQL Azure I see some odd discrepancies, for example:

#tblLot
is generated as part of the query and I know that it only has those 474 records in it. What could be causing such a weird discrepancy, and how would I solve it? I imagine the query plan is very inefficient because of this misestimate of the row count. The on premise database shows the rows correctly.
I'm happy to include query plans here but I'm not really sure what the best way is to do that. Export the .sqlplan?

SELECT top 100 Case when ISNULL(cpa.tow, 0) = 0 then '-'
else 'TW'
end as PlateActivityFormatted,
Case when v.ViolationStatusID = 6 then 1
else 0 end AS IsOverdueToday,
v.ViolationStatusID AS OriginalViolationStatusID,
v.ViolationStatusID AS RevisedViolationStatusID,
vs.StatusShortName AS RevisedViolationStatusDescription,
tblGroup.Name AS LotGroupName,
cpa.tow As PlateTowing,
[tblProvinceState].Name as ProvinceState,
[tblLot].TaxNumber,
[tblLot].Name LotName,
v.*
FROM [tblViolation] v INNER JOIN
[ViolationFinanceInformation] on [ViolationFinanceInformation].ViolationID = v.ID INNER JOIN
[tblViolationType] on [tblViolationType].ID = v.ViolationTypeID JOIN
[tblLot] ON v.LotID = [tblLot].ID JOIN
[tblGroup] ON [tblLot].GroupID = [tblGroup].ID JOIN
[tblProvinceState] ON [tblLot].ProvinceStateID = [tblProvinceState].ID LEFT JOIN
tblViolationStatus vs ON vs.StatusID = v.ViolationStatusID LEFT JOIN
tblCurrentPlateActivity cpa ON cpa.vendorId = tblGroup.vendorid and cpa.plateNumber = v.plateNumber
INNER JOIN #LotDate on #LotDate.ID = tblLot.ID
where Issued >= #LotDate.Start
Both databases perform equally well without the where clause, returning in under 1 second, but as soon as the where clause is added the on premise results remain about the same while the Azure results increase to 22 seconds.
I put the good and bad query plans here:
Good: https://www.brentozar.com/pastetheplan/?id=SyijStC88
Bad: https://www.brentozar.com/pastetheplan/?id=BymeUK0UI
https://1drv.ms/u/s!AmN_cJhXT0rP4WuYMib6bhCMoPon?e=YvFOQb
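One workaround I'm considering for the temp-table row misestimate described above (a sketch only; it assumes the misestimate really comes from the temp table's statistics, which is not confirmed):

```
-- Refresh statistics on #LotDate after it is populated...
UPDATE STATISTICS #LotDate;

-- ...or make the reading statement compile with the actual row counts:
SELECT v.*
FROM [tblViolation] v
INNER JOIN #LotDate ON #LotDate.ID = v.LotID
WHERE v.Issued >= #LotDate.Start
OPTION (RECOMPILE);
```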
stimms
(353 rep)
Mar 29, 2020, 02:31 PM
• Last activity: Jun 2, 2025, 10:04 PM
0
votes
1
answers
879
views
Performance issues after upgrading to Mysql 8.0.19 version
Few queries are performing very poorly after the upgrade. Here are the examples. ****Example 1**:** SELECT x.CI_TITLE 'ciTitle', x.DEPARTMENT_MEANING 'departmentMeaning', x.RESIDUAL_FINANCIAL_IMPACT_LOSS 'residualFinancialImpactLoss', x.INHERENT_CONFIDENTIALITY 'inherentConfidentiality', x.PRIMARY_F...
A few queries are performing very poorly after the upgrade. Here are some examples.
****Example 1**:**
SELECT
x.CI_TITLE 'ciTitle',
x.DEPARTMENT_MEANING 'departmentMeaning',
x.RESIDUAL_FINANCIAL_IMPACT_LOSS 'residualFinancialImpactLoss',
x.INHERENT_CONFIDENTIALITY 'inherentConfidentiality',
x.PRIMARY_FLAG 'primaryFlag',
x.MEDIA_IMPACT_COMMENT 'mediaImpactComment',
x.OPEN_CHALLENGE 'openChallenge'
FROM
omb_risk_latest_eval_v x
WHERE
x.COMPANY_ID = 187
ORDER BY x.LAST_UPDATE_DATE DESC
LIMIT 30 OFFSET 0
The above query takes 1.2 seconds on average in 5.7.24; the same query takes 26 seconds in 8.0.19.
Here is what I have tried so far:
1) All the tables have indexes present.
2) Rebuilt all the tables in version 8 using ALTER TABLE ... ENGINE=InnoDB;
3) Checked the optimizer_switch variables and compared both versions. Found these extra in 8.0:
skip_scan=on
hash_join=on
Tried disabling them, but no impact.
4) Googled many forums for this kind of issue. They suggested the below, but these didn't work:
condition_fanout_filter=off
internal_tmp_mem_storage_engine=MEMORY
Suggestions from the sites below:
http://dimitrik.free.fr/blog/archives/2018/04/mysql-performance-testing-80-with-less-blood.html
https://www.percona.com/blog/2018/05/04/how-binary-logs-affect-mysql-8-0-performance/
Also, we suspect there are some server-side settings that should be disabled explicitly in 8.x, as these queries perform very well in 5.7.
I wanted to share the execution plans for both versions but am not sure how, as we can only attach images.
Any suggestions/help please...
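For reference, the per-session experiments described above can be scripted like this (a sketch; the view name is taken from Example 1):

```
-- Textual plans can be pasted as plain text instead of images (MySQL 8.0.16+):
EXPLAIN FORMAT=TREE
SELECT x.CI_TITLE FROM omb_risk_latest_eval_v x
WHERE x.COMPANY_ID = 187
ORDER BY x.LAST_UPDATE_DATE DESC LIMIT 30;

-- 8.0.18+ can also execute the query and time each plan node:
EXPLAIN ANALYZE
SELECT x.CI_TITLE FROM omb_risk_latest_eval_v x
WHERE x.COMPANY_ID = 187
ORDER BY x.LAST_UPDATE_DATE DESC LIMIT 30;

-- Optimizer switches can be toggled per session, e.g.:
SET SESSION optimizer_switch = 'hash_join=off,skip_scan=off';
```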
Satya Ammu
(1 rep)
Sep 24, 2021, 01:24 PM
• Last activity: May 30, 2025, 06:03 PM
0
votes
1
answers
250
views
Why do these two queries generate different execution plans in MySQL?
Only difference in the below query is the order by column. The resultset of both query is same. However, the 2nd query execution plan shows 'Using temporary; Using filesort' for ordering where as for 1st query it does not require filesort. Apart from id I have index on c.table_b_id and c.some_indexe...
The only difference between the two queries below is the ORDER BY column. The result set of both queries is the same.
However, the second query's execution plan shows 'Using temporary; Using filesort' for the ordering, whereas the first query does not require a filesort.
Apart from id, I have indexes on c.table_b_id and c.some_indexed_id.
SELECT * FROM TableA a
inner join TableB b on a.id = b.id
inner join TableC c on c.table_b_id = b.id
where c.some_indexed_id = 1
order by c.table_b_id asc;
SELECT * FROM TableA a
inner join TableB b on a.id = b.id
inner join TableC c on c.table_b_id = b.id
where c.some_indexed_id = 1
order by b.id asc;
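Comparing the two plans side by side makes the difference visible; the filesort shows up in the Extra column of plain EXPLAIN output (a sketch):

```
-- Plan for the first query (no filesort expected):
EXPLAIN
SELECT * FROM TableA a
inner join TableB b on a.id = b.id
inner join TableC c on c.table_b_id = b.id
where c.some_indexed_id = 1
order by c.table_b_id asc;

-- Plan for the second query ('Using temporary; Using filesort'):
EXPLAIN
SELECT * FROM TableA a
inner join TableB b on a.id = b.id
inner join TableC c on c.table_b_id = b.id
where c.some_indexed_id = 1
order by b.id asc;
```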
sagar
(1 rep)
Apr 21, 2021, 06:01 AM
• Last activity: May 21, 2025, 12:06 PM
1
votes
1
answers
309
views
Testing the estimated query plan of a portion of a trigger that uses inserted/deleted
I'm performance tuning a SQL Server trigger that has several statements using the "inserted" and "deleted" tables. How can I break out those individual queries from the trigger for testing in order to see their estimated execution plans? The "inserted" and "deleted" tables have no context outside of...
I'm performance tuning a SQL Server trigger that has several statements using the "inserted" and "deleted" tables.
How can I break out those individual queries from the trigger for testing in order to see their estimated execution plans?
The "inserted" and "deleted" tables have no context outside of a trigger. I want to make sure that it mimics the behavior of the trigger as closely as possible.
The trigger I'm testing is INSTEAD OF DELETE. I was hoping not to actually delete the record so that I could easily test the before and after without deleting the row.
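A sketch of one approach (the table, column, and variable names here are made up): snapshot the rows the DELETE would affect into temp tables that stand in for the pseudo-tables, run the trigger's statements against those, and roll back so nothing changes:

```
DECLARE @TestID int = 42;  -- hypothetical key of the row to "delete"

BEGIN TRANSACTION;

-- Stand-ins for the trigger's pseudo-tables:
SELECT * INTO #deleted  FROM dbo.MyTable WHERE ID = @TestID;  -- rows the DELETE targets
SELECT * INTO #inserted FROM dbo.MyTable WHERE 1 = 0;         -- empty for a DELETE

-- Paste each statement from the trigger here, replacing "deleted" with
-- "#deleted" and "inserted" with "#inserted", and inspect its estimated plan.

ROLLBACK TRANSACTION;  -- the base table is untouched
```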
skeletank
(301 rep)
Nov 16, 2022, 03:52 PM
• Last activity: May 16, 2025, 09:04 AM
2
votes
1
answers
289
views
random_page_cost and memoize plan relation
Postgres 15.2. I have the following simple query: SELECT mark.last_modified FROM mark INNER JOIN element ON mark.element_id = element.id INNER JOIN model ON element.model_id = model.id WHERE model.brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39' ORDER BY mark.last_modified DESC LIMIT 1; and my DB i...
Postgres 15.2. I have the following simple query:
SELECT mark.last_modified FROM mark
INNER JOIN element ON mark.element_id = element.id
INNER JOIN model ON element.model_id = model.id
WHERE model.brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'
ORDER BY mark.last_modified DESC LIMIT 1;
and my DB is configured with random_page_cost = 3.66 (it should be closer to 1, I know).
I analyzed the query and noticed it uses Memoize, with the following plan:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=20338.19..30531.32 rows=1 width=8) (actual time=10588.047..10588.137 rows=0 loops=1)
Buffers: shared hit=1209004 read=8403
I/O Timings: shared/local read=8475.963
-> Nested Loop (cost=20338.19..1335251.72 rows=129 width=8) (actual time=10588.046..10588.135 rows=0 loops=1)
Buffers: shared hit=1209004 read=8403
I/O Timings: shared/local read=8475.963
-> Nested Loop (cost=20337.90..1326623.17 rows=278958 width=24) (actual time=101.060..10457.145 rows=277857 loops=1)
Buffers: shared hit=1205893 read=8403
I/O Timings: shared/local read=8475.963
-> Gather Merge (cost=20337.32..52826.58 rows=278958 width=24) (actual time=101.016..183.845 rows=277857 loops=1)
Workers Planned: 2
Workers Launched: 0
Buffers: shared hit=8396
-> Sort (cost=19337.30..19627.88 rows=116232 width=24) (actual time=100.708..147.960 rows=277857 loops=1)
Sort Key: mark.last_modified DESC
Sort Method: quicksort Memory: 29655kB
Buffers: shared hit=8396
-> Parallel Seq Scan on mark (cost=0.00..9558.33 rows=116232 width=24) (actual time=0.008..43.455 rows=277857 loops=1)
Buffers: shared hit=8396
-> Memoize (cost=0.57..5.24 rows=1 width=32) (actual time=0.037..0.037 rows=1 loops=277857)
Cache Key: mark.element_id
Cache Mode: logical
Hits: 36677 Misses: 241180 Evictions: 0 Overflows: 0 Memory Usage: 33916kB
Buffers: shared hit=1197497 read=8403
I/O Timings: shared/local read=8475.963
-> Index Scan using activity_pkey on element (cost=0.56..5.23 rows=1 width=32) (actual time=0.041..0.041 rows=1 loops=241180)
Index Cond: (id = mark.element_id)
Buffers: shared hit=1197497 read=8403
I/O Timings: shared/local read=8475.963
-> Memoize (cost=0.29..0.31 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=277857)
Cache Key: element.model_id
Cache Mode: logical
Hits: 276820 Misses: 1037 Evictions: 0 Overflows: 0 Memory Usage: 82kB
Buffers: shared hit=3111
-> Index Scan using model_pkey on model (cost=0.28..0.30 rows=1 width=16) (actual time=0.005..0.005 rows=0 loops=1037)
Index Cond: (id = element.model_id)
Filter: (brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'::uuid)
Rows Removed by Filter: 1
Buffers: shared hit=3111
Planning:
Buffers: shared hit=1045
Planning Time: 1.985 ms
Execution Time: 10598.922 ms
(43 rows)
However, when I change random_page_cost to a lower value (=1.1), the plan changes and no longer uses Memoize:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=16808.48..16808.49 rows=1 width=8) (actual time=425.764..425.765 rows=0 loops=1)
Buffers: shared hit=348291 read=212
I/O Timings: shared/local read=173.481
-> Sort (cost=16808.48..16808.81 rows=129 width=8) (actual time=425.763..425.764 rows=0 loops=1)
Sort Key: mark.last_modified DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=348291 read=212
I/O Timings: shared/local read=173.481 QUERY PLAN
-> Nested Loop (cost=1.27..16807.84 rows=129 width=8) (actual time=425.738..425.739 rows=0 loops=1)
Buffers: shared hit=348288 read=212
I/O Timings: shared/local read=173.481
-> Nested Loop (cost=0.84..12253.80 rows=10110 width=16) (actual time=0.041..55.468 rows=111456 loops=1)
Buffers: shared hit=14132
-> Index Scan using model_brand_id_95f0c5ac on model (cost=0.28..4.33 rows=3 width=16) (actual time=0.028..0.033 rows=3 loops=1)
Index Cond: (brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'::uuid)
Buffers: shared hit=4
-> Index Scan using element_model_id_c798104e on element (cost=0.56..4043.31 rows=3984 width=32) (actual time=0.009..14.143 rows=37152 loops=3)
Index Cond: (model_id = model.id)
Buffers: shared hit=14128
-> Index Scan using mark_element_id_4d370815 on mark (cost=0.42..0.44 rows=1 width=24) (actual time=0.003..0.003 rows=0 loops=111456)
Index Cond: (element_id = element.id)
Buffers: shared hit=334156 read=212
I/O Timings: shared/local read=173.481
Planning:
Buffers: shared hit=514
Planning Time: 0.649 ms
Execution Time: 425.799 ms
(27 rows)
Besides the fact that the query now performs better, I would like to understand the relation between random_page_cost and the lack of Memoize in the plan.
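For what it's worth, both plans can be reproduced in a single session without touching the server configuration (a sketch):

```
-- Per-session experiment; other sessions are unaffected:
SET random_page_cost = 3.66;
EXPLAIN (ANALYZE, BUFFERS)
SELECT mark.last_modified FROM mark
INNER JOIN element ON mark.element_id = element.id
INNER JOIN model ON element.model_id = model.id
WHERE model.brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'
ORDER BY mark.last_modified DESC LIMIT 1;

SET random_page_cost = 1.1;
EXPLAIN (ANALYZE, BUFFERS)
SELECT mark.last_modified FROM mark
INNER JOIN element ON mark.element_id = element.id
INNER JOIN model ON element.model_id = model.id
WHERE model.brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'
ORDER BY mark.last_modified DESC LIMIT 1;

RESET random_page_cost;
```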
Cowabunga
(145 rep)
Jun 16, 2023, 04:31 PM
• Last activity: May 13, 2025, 05:03 PM
0
votes
1
answers
759
views
What exactly happens when optimizer_prune_level = 1?
This documentation [link][1] states that `optimizer_prune_level` >Controls the heuristics applied during query optimization to prune less-promising partial plans from the optimizer search space. Can anyone help with the knowledge of what kind of heuristic principles are applied? Consider this simple...
This documentation link states that
optimizer_prune_level
>Controls the heuristics applied during query optimization to prune less-promising partial plans from the optimizer search space.
Can anyone help with the knowledge of what kind of heuristic principles are applied?
Consider this simple schema:
CREATE TABLE `college` (
  `colg_id` int(11) NOT NULL AUTO_INCREMENT,
  `colg_name` varchar(20) DEFAULT NULL,
  `colg_address` varchar(20) DEFAULT NULL,
  `avg_fees` int(11) DEFAULT NULL,
  PRIMARY KEY (`colg_id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4

CREATE TABLE `department` (
  `dept_id` int(11) NOT NULL AUTO_INCREMENT,
  `dept_name` varchar(20) DEFAULT NULL,
  `dept_address` varchar(20) DEFAULT NULL,
  `dept_hod` varchar(20) DEFAULT NULL,
  `colg_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`dept_id`),
  KEY `colg_id_fk` (`colg_id`),
  KEY `dept_name` (`dept_name`),
  CONSTRAINT `department_ibfk_1` FOREIGN KEY (`colg_id`) REFERENCES `college` (`colg_id`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4

CREATE TABLE `student` (
  `stud_id` int(11) NOT NULL AUTO_INCREMENT,
  `stud_name` varchar(20) DEFAULT NULL,
  `stud_address` varchar(20) DEFAULT NULL,
  `dept_id` int(11) DEFAULT NULL,
  `year` int(11) DEFAULT NULL,
  PRIMARY KEY (`stud_id`),
  KEY `dept_id_fk` (`dept_id`),
  CONSTRAINT `student_ibfk_1` FOREIGN KEY (`dept_id`) REFERENCES `department` (`dept_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1201 DEFAULT CHARSET=utf8mb4
Relation `college` contains 10 rows, `department` 100 rows, and `student` 1100 rows.
For the following query
select * from college c, department d, student s where c.colg_id = d.colg_id and d.dept_id = s.dept_id;
The optimizer trace is attached: trace.json.
Going off of the trace (as best I can read it), the optimizer has even tried calculating the cost of joining `college` with `student`. A cross-product join order is also considered among others. So is cross-product avoidance not part of the heuristics, at least for a small number of joins? Do the heuristics change with the number of joins? If yes, how, and are there any examples?
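For reproducibility, the trace above can be captured like this (a sketch):

```
SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576;  -- avoid a truncated trace

SELECT * FROM college c, department d, student s
WHERE c.colg_id = d.colg_id AND d.dept_id = s.dept_id;

-- The trace for the last traced statement in this session:
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```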
Arun S R
(207 rep)
Aug 21, 2019, 10:03 AM
• Last activity: May 11, 2025, 08:07 PM
17
votes
1
answers
1711
views
Union does not always eliminate duplicates
I have the following query and expect that as a result I will have a list of IDs without duplicates. But sometimes it produces duplicates (1-2 on 4 million rows). Why can it happen? I run it with the default (read committed) isolation level. I can't use tablock/serializible because these are OLTP ta...
I have the following query and expect that, as a result, I will have a list of IDs without duplicates. But sometimes it produces duplicates (1-2 per 4 million rows).
Why can this happen? I run it with the default (read committed) isolation level. I can't use TABLOCK/SERIALIZABLE because these are OLTP tables with hundreds of changes.
```
CREATE TABLE #characterId (
CharacterId BIGINT
)
DECLARE @dateTimeFrom DATETIME = '2025-05-04'
, @dateTimeTo DATETIME = '2025-05-07'
INSERT INTO #characterId (CharacterId)
SELECT Id
FROM table1 u
WHERE u.DateUpdated >= @dateTimeFrom
AND u.DateUpdated = @dateTimeFrom
AND usi.DateUpdated = @dateTimeFrom
AND ust.DateCreated (Build 20348: ) (Hypervisor)
```
Plan with UNION ALL instead of the first UNION:

Novitskiy Denis
(331 rep)
May 6, 2025, 07:33 AM
• Last activity: May 8, 2025, 06:09 PM
Showing page 1 of 20 total questions