
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes
1 answer
1861 views
MariaDB Inner join is slow
Here is my SQL (with explain):

EXPLAIN SELECT *
FROM GBI gbi
INNER JOIN VF_KRED kred ON gbi.vendor = kred.vendor;

Giving below the outcome:

[screenshot: EXPLAIN output]

Now, the select SQL takes ~10 seconds to produce the outcome. Below are the indexes on the tables:

- VF_KRED table: [screenshot: index list]
- GBI table: [screenshot: index list]

Any clues?
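A hedged sketch (editorial addition, not from the thread): a two-table inner join like this is usually only fast when the join column on the driven table is indexed, so the first thing to verify is whether vendor is indexed on both sides. Table and column names below are taken from the question; the index names are hypothetical.

```sql
-- Check the existing indexes on the join column:
SHOW INDEX FROM VF_KRED;
SHOW INDEX FROM GBI;

-- If vendor is not indexed on the driven table, add an index
-- (hypothetical index names):
ALTER TABLE VF_KRED ADD INDEX idx_vf_kred_vendor (vendor);
ALTER TABLE GBI     ADD INDEX idx_gbi_vendor (vendor);

-- Refresh table statistics, then re-run the EXPLAIN above:
ANALYZE TABLE VF_KRED, GBI;
```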
Akshay Lokur (121 rep)
Jun 25, 2019, 04:06 PM • Last activity: Aug 5, 2025, 04:05 PM
0 votes
1 answer
1561 views
Query with left join not using index
The below query is taking about 1 second to run:

SELECT taxi_driver_mapping.*,
       concat("dpk00", taxi_driver_mapping.mapping_payment_plan_id) AS package_text,
       people.*,
       taxi.*,
       company.company_name,
       country.country_name,
       state.state_name,
       city.city_name,
       peoplecreated.name AS created_name,
       peopleupdated.name AS updated_name,
       device_owner
FROM taxi_driver_mapping
LEFT JOIN company ON taxi_driver_mapping.mapping_companyid = company.cid
LEFT JOIN taxi ON taxi_driver_mapping.mapping_taxiid = taxi.taxi_id
LEFT JOIN country ON taxi_driver_mapping.mapping_countryid = country.country_id
LEFT JOIN state ON taxi_driver_mapping.mapping_stateid = state.state_id
LEFT JOIN city ON taxi_driver_mapping.mapping_cityid = city.city_id
LEFT JOIN people ON taxi_driver_mapping.mapping_driverid = people.id
LEFT JOIN people AS peoplecreated ON taxi_driver_mapping.mapping_createdby = peoplecreated.id
LEFT JOIN people AS peopleupdated ON taxi_driver_mapping.mapping_updatedby = peopleupdated.id
LEFT JOIN driver_information ON taxi_driver_mapping.mapping_driverid = driver_information.driver_id
WHERE (people.name LIKE '%abm1173%'
       OR people.lastname LIKE '%abm1173%'
       OR people.email LIKE '%abm1173%'
       OR company.company_name LIKE '%abm1173%'
       OR people.phone LIKE '%abm1173%'
       OR people.id LIKE '%abm1173%'
       OR people.username LIKE '%abm1173%'
       OR taxi.taxi_no LIKE '%abm1173%')
ORDER BY mapping_id DESC
LIMIT 10 OFFSET 0

Below is the explain plan for it:

+----+-------------+---------------------+--------+------------------+---------+---------+----------------------------------------------+-------+----------------------------------------------------+
| id | select_type | table               | type   | possible_keys    | key     | key_len | ref                                          | rows  | Extra                                              |
+----+-------------+---------------------+--------+------------------+---------+---------+----------------------------------------------+-------+----------------------------------------------------+
|  1 | SIMPLE      | taxi_driver_mapping | ALL    | NULL             | NULL    | NULL    | NULL                                         | 78718 | Using temporary; Using filesort                    |
|  1 | SIMPLE      | company             | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_companyid | 1     | NULL                                               |
|  1 | SIMPLE      | taxi                | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_taxiid    | 1     | NULL                                               |
|  1 | SIMPLE      | country             | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_countryid | 1     | NULL                                               |
|  1 | SIMPLE      | state               | ALL    | PRIMARY,state_id | NULL    | NULL    | NULL                                         | 3     | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | city                | ALL    | PRIMARY          | NULL    | NULL    | NULL                                         | 2     | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | people              | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_driverid  | 1     | Using where                                        |
|  1 | SIMPLE      | peoplecreated       | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_createdby | 1     | NULL                                               |
|  1 | SIMPLE      | peopleupdated       | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_updatedby | 1     | NULL                                               |
|  1 | SIMPLE      | driver_information  | eq_ref | PRIMARY          | PRIMARY | 4       | Eztaxi.taxi_driver_mapping.mapping_driverid  | 1     | NULL                                               |
+----+-------------+---------------------+--------+------------------+---------+---------+----------------------------------------------+-------+----------------------------------------------------+

Below is the table definition for the taxi_driver_mapping table:

CREATE TABLE taxi_driver_mapping (
  mapping_id int(100) NOT NULL AUTO_INCREMENT,
  mapping_driverid int(100) NOT NULL,
  mapping_taxiid int(100) NOT NULL,
  mapping_taxi_model_id int(11) NOT NULL,
  mapping_companyid int(100) NOT NULL,
  mapping_countryid int(100) NOT NULL,
  mapping_stateid int(100) NOT NULL,
  mapping_cityid int(100) NOT NULL,
  mapping_startdate timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  mapping_enddate timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  mapping_createdby int(100) NOT NULL,
  mapping_createdate timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  mapping_status varchar(3) NOT NULL DEFAULT 'A',
  mapping_updatedby int(11) DEFAULT NULL,
  mapping_updatedate timestamp NULL DEFAULT NULL,
  mapping_payment_plan_id int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (mapping_id),
  KEY ndx_mapping_driverid (mapping_driverid),
  KEY ndx_mapping_taxiid (mapping_taxiid),
  KEY ndx_driver_status_taxi_cid (mapping_driverid,mapping_status,mapping_taxiid,mapping_companyid),
  KEY idx_test (mapping_status,mapping_driverid,mapping_taxiid)
) ENGINE=InnoDB AUTO_INCREMENT=123027 DEFAULT CHARSET=latin1

Table definition for the company table:

CREATE TABLE company (
  cid int(100) NOT NULL AUTO_INCREMENT,
  company_name varchar(250) NOT NULL,
  company_address varchar(250) NOT NULL,
  current_location text NOT NULL,
  latitude varchar(250) NOT NULL,
  longitude varchar(250) NOT NULL,
  bankname varchar(100) NOT NULL,
  bankaccount_no varchar(250) NOT NULL,
  company_country int(11) NOT NULL,
  company_state int(11) NOT NULL,
  company_city int(11) NOT NULL,
  header_bgcolor varchar(25) NOT NULL,
  menu_color varchar(25) NOT NULL,
  mouseover_color varchar(25) NOT NULL,
  time_zone varchar(250) NOT NULL,
  userid int(11) NOT NULL,
  company_status varchar(3) NOT NULL DEFAULT 'A',
  company_type enum('0','1','2','4') NOT NULL DEFAULT '0' COMMENT '0-Company , 1-Kiosk , 2-Corporate 4-Restaurant',
  drivers_count int(11) NOT NULL DEFAULT '500',
  created_by int(11) DEFAULT NULL,
  created_at datetime DEFAULT NULL,
  updated_by int(11) DEFAULT NULL,
  updated_at datetime DEFAULT NULL,
  PRIMARY KEY (cid)
) ENGINE=InnoDB AUTO_INCREMENT=4122 DEFAULT CHARSET=latin1

I thought that the column mapping_companyid used in the first left join not being indexed was the reason for the full table scan. But even after adding an index on it, the query still takes 1 second. Can someone please let me know what could be the issue here? Let me know if any more info is needed. Thanks
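An editorial aside, hedged: every predicate in the WHERE clause above uses a leading-wildcard LIKE ('%abm1173%'), which no B-tree index can serve, so the optimizer has little choice but to scan taxi_driver_mapping and filter after the joins. One common direction, assuming InnoDB on MySQL 5.6+ and the people columns listed in the question, is a FULLTEXT index; note that full-text search matches whole tokens rather than arbitrary substrings, so verify it fits the data.

```sql
-- Hypothetical full-text index over the searched people columns:
ALTER TABLE people
  ADD FULLTEXT INDEX ft_people_search (name, lastname, email, phone, username);

-- Token-based search instead of leading-wildcard LIKE:
SELECT id, name
FROM people
WHERE MATCH(name, lastname, email, phone, username) AGAINST ('abm1173');
```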
user5594148 (43 rep)
Jan 29, 2020, 06:36 PM • Last activity: Aug 2, 2025, 03:02 PM
6 votes
1 answer
1035 views
PostgreSQL predicate not pushed down (through join conditions)
Consider the following data model in a PostgreSQL v13 system:

[diagram: parent-child data model]

Here, parent table dim contains a small set of reference data, and child table fact contains a much higher volume of records. A typical use case for these data sets would be to query all fact::value data belonging to a dim::name. Note that dim::name holds a UNIQUE constraint.

While I think this is a very common scenario, I was somewhat taken aback that the style of queries I've been using for years on other RDBMSs (Oracle, MSSQL) didn't perform _at all_ on PostgreSQL the way I imagined they would. That is, when querying a dataset (fact) using a highly selective, but implicit, predicate (fact::dim_id eq X) through a join condition, I expect the index on fact::dim_id to be used (in a nested loop). Instead, a hash join is used, requiring a full table scan of fact.

**Question:** is there some way I can nudge the query planner into considering any predicate I issue on a joined relation to not need a full table scan? (Without impacting other DB loads.)

To illustrate the problem with an example, these tables are populated with some random data;
CREATE TABLE dim(
  id       SERIAL NOT NULL
, name     TEXT   NOT NULL
, CONSTRAINT pk_dim PRIMARY KEY (id)
, CONSTRAINT uq_dim UNIQUE (name)
);

CREATE TABLE fact(
  id        SERIAL  NOT NULL
, dim_id    INTEGER NOT NULL
, value     TEXT
, CONSTRAINT pk_fact PRIMARY KEY (id)
, CONSTRAINT fk_facts_dim FOREIGN KEY (dim_id) REFERENCES dim (id)
);

CREATE INDEX idx_fact_dim ON fact(dim_id);

INSERT INTO dim(name)
SELECT SUBSTRING(md5(random()::TEXT) FOR 5)
FROM   generate_series(1,50)
UNION
SELECT 'key';

INSERT INTO fact(dim_id, value)
SELECT (SELECT id FROM dim ORDER BY random() LIMIT 1)
,      md5(random()::TEXT)
FROM   generate_series(1,1000000);

ANALYZE dim;
ANALYZE fact;
EXPLAIN ANALYZE
SELECT f.*
FROM   fact AS f
JOIN   dim  AS d
       ON (d.id = f.dim_id)
WHERE  d.name = 'key';       -- Note: UNIQUE

                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1001.65..18493.29 rows=20588 width=41) (actual time=319.331..322.582 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=1.65..15434.49 rows=8578 width=41) (actual time=306.193..306.195 rows=0 loops=3)
         Hash Cond: (f.dim_id = d.id)
         ->  Parallel Seq Scan on fact f  (cost=0.00..14188.98 rows=437498 width=41) (actual time=0.144..131.050 rows=350000 loops=3)
         ->  Hash  (cost=1.64..1.64 rows=1 width=4) (actual time=0.138..0.139 rows=1 loops=3)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on dim d  (cost=0.00..1.64 rows=1 width=4) (actual time=0.099..0.109 rows=1 loops=3)
                     Filter: (name = 'key'::text)
                     Rows Removed by Filter: 50
 Planning Time: 1.059 ms
 Execution Time: 322.662 ms
Now, we execute the same query, but instead of filtering using an inner join, we filter using a scalar subquery;
EXPLAIN ANALYZE
SELECT *
FROM   fact
WHERE  dim_id = (SELECT id FROM dim WHERE name = 'key');

                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_fact_dim on fact  (cost=2.07..15759.53 rows=524998 width=41) (actual time=0.096..0.097 rows=0 loops=1)
   Index Cond: (dim_id = $0)
   InitPlan 1 (returns $0)
     ->  Seq Scan on dim  (cost=0.00..1.64 rows=1 width=4) (actual time=0.046..0.054 rows=1 loops=1)
           Filter: (name = 'key'::text)
           Rows Removed by Filter: 50
 Planning Time: 0.313 ms
 Execution Time: 0.156 ms
As shown, the performance difference is huge. Somehow, the query planner did not consider the predicate on the unique dim::name attribute to be equal to a predicate on fact::dim_id in the first query.
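An editorial sketch, hedged: besides the scalar subquery shown above, the same effect can usually be had with a LATERAL join, which makes the per-dim-row lookup explicit so the planner parameterizes the scan on idx_fact_dim (schema as defined in the question):

```sql
EXPLAIN ANALYZE
SELECT f.*
FROM   dim d
CROSS  JOIN LATERAL (
         SELECT *
         FROM   fact
         WHERE  dim_id = d.id   -- parameterized by the single matching dim row
       ) AS f
WHERE  d.name = 'key';
```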
Michiel T (161 rep)
Jan 15, 2021, 10:55 PM • Last activity: Jul 25, 2025, 03:10 AM
1 vote
1 answer
48 views
Postgres query planner join selectivity greater than 1?
I am using PostgreSQL 14.17. I am trying to debug a query planner failure in a bigger query, but I think I've narrowed down the problem to a self-join on a join table:
SELECT t2.item_id
  FROM item_sessions t1
  JOIN item_sessions t2
       ON t1.session_key = t2.session_key
 WHERE t1.item_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
After running ANALYZE on the table, EXPLAIN gives this plan (which matches the subplan in the larger query):
Nested Loop  (cost=1.12..119.60 rows=7398 width=16)
   ->  Index Only Scan using item_sessions_item_id_session_key_uniq on item_sessions t1  (cost=0.56..8.58 rows=1 width=33)
         Index Cond: (item_id = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'::uuid)
   ->  Index Only Scan using item_sessions_session_idx on item_sessions t2  (cost=0.56..110.11 rows=91 width=49)
         Index Cond: (session_key = (t1.session_key)::text)
**Why is the loop estimating 7398 rows when the two child nodes estimate 1 and 91 respectively?** I would have expected the loop total to be less than 1 * 91.

FWIW, the child estimates seem correct: item_id has n_distinct at -0.77649677, so the expected row count is 1.3, and session_key has n_distinct at 149555 out of an estimated 1.36e+07 tuples, which gives 90.9 expected tuples per session_key.

The indexes referenced in the plan are:

- item_sessions_session_idx btree (session_key, item_id)
- item_sessions_item_id_session_key_uniq UNIQUE CONSTRAINT, btree (item_id, session_key)

ETA: I created a minimal reproduction [here](https://github.com/felipeochoa/pg-plan-selectivity-gt1). The failure is visible [in the job logs](https://github.com/felipeochoa/pg-plan-selectivity-gt1/actions/runs/16359411460/job/46224463766) on 17.5, 16.9, and 15.13.
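An editorial sketch, hedged: the per-column inputs behind these estimates can be read straight out of pg_stats, which makes it easier to redo the selectivity arithmetic by hand (table and column names from the question):

```sql
-- n_distinct < 0 is a fraction of rows; n_distinct > 0 is an absolute count.
SELECT attname, n_distinct, null_frac
FROM   pg_stats
WHERE  tablename = 'item_sessions'
AND    attname IN ('item_id', 'session_key');

-- Total tuple estimate used by the planner:
SELECT reltuples FROM pg_class WHERE relname = 'item_sessions';
```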
Felipe (317 rep)
Jul 17, 2025, 04:41 AM • Last activity: Jul 18, 2025, 09:14 AM
0 votes
1 answer
181 views
PostgreSQL CTE Recursion - How to Improve Performance for Parent->Child Tree (8 levels deep, 890 roots)
I am working on a query that performs CTE recursion in PostgreSQL 13. This query is intended to recurse from the root nodes to the deepest leaf node with certain properties (labeled 'begat' in the resource_dependency table, with a matching "sample type" property from another table).

https://explain.dalibo.com/plan/g2d2f5fg9c563b0d#plan
https://explain.depesz.com/s/eaRu

It seems like both the CTE table construction and, more so, the scanning of the CTE to apply the constraints are the most time-consuming parts of the query. Most work appears to be done in memory with the given settings. We still seem to scan 780GB of cache (cache hit), which seems like duplicate work. The actual number of nodes in any given tree may be 1000-5000. In this operation, we are starting from 890 root nodes. There are both 1:many and many:1 relationships in this tree (splitting, recombining).

Doing some experiments in a blank environment I notice that:

- To get all parent->child resource_ids in a single column, we need to do a left join so all leaf nodes will have a null child row with their calculated depth.
- We track the path to ensure we don't get into an infinite cycle. While the path is important for avoiding cycles, it is not relevant for the final calculation. Some paths will duplicate information for the same root -> descendant. This makes the CTE table much larger.

Are there any techniques that can be used here to make this recursion less painful? We need to calculate the deepest leaf node from the root, matching the starting root to only the deepest leaf with the 'begat' relationship and matching 'sample type' property.
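A hedged editorial sketch of the general pattern (the relation, column, and roots names are hypothetical stand-ins, loosely modeled on the description above): track the path as an array purely as a cycle guard, then keep only the deepest leaf per root with DISTINCT ON, so the path never has to leave the CTE.

```sql
WITH RECURSIVE walk AS (
    -- Anchor: one row per root -> child 'begat' edge.
    SELECT r.parent_id AS root_id, r.child_id, 1 AS depth,
           ARRAY[r.parent_id, r.child_id] AS path
    FROM   resource_dependency r
    WHERE  r.relation = 'begat'
    AND    r.parent_id IN (SELECT id FROM roots)   -- the 890 root nodes
  UNION ALL
    -- Step: follow 'begat' edges, refusing to revisit a node on this path.
    SELECT w.root_id, r.child_id, w.depth + 1,
           w.path || r.child_id
    FROM   walk w
    JOIN   resource_dependency r ON r.parent_id = w.child_id
    WHERE  r.relation = 'begat'
    AND    r.child_id <> ALL (w.path)              -- cycle guard
)
SELECT DISTINCT ON (root_id) root_id, child_id AS deepest_leaf, depth
FROM   walk
ORDER  BY root_id, depth DESC;
```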
Justin Lowen (68 rep)
Oct 17, 2024, 05:14 PM • Last activity: Jul 16, 2025, 06:02 PM
1 vote
1 answer
177 views
Shredding Query Plan XML For Potential Skewed Estimates - Data Type Conversions
I'm using the below T-SQL to check for root-level estimate skews in execution plans, by pulling the estimated rows from the execution plans and comparing them to the last rows from the sys.dm_exec_query_stats DMV. Obviously I can't do this calculation due to the data type mismatch; converting from varchar to bigint won't work in this scenario. Is there any way around this?

SELECT DB_NAME(qt.dbid) AS [Database],
       cp.objtype AS [Cached_Plan_Type],
       cp.usecounts AS [Use_Counts],
       qp.query_plan.value('(//@CardinalityEstimationModelVersion)','INT') AS [CE_Version],
       qp.query_plan.value('(//@EstimateRows)', 'varchar(128)') AS [Estimated_Rows],
       qs.last_rows AS [Last Rows],
       --(qp.query_plan.value('(//@EstimateRows)', 'varchar(128)') - qs.last_rows) AS [Estimate_Skew],
       qs.total_logical_reads / qs.execution_count AS [Avg_Logical_Reads],
       CAST((qs.total_elapsed_time) / 1000000.0 / qs.execution_count AS DECIMAL(28, 2)) AS [Average Execution time(s)],
       CAST((qs.total_worker_time) / 1000000.0 / qs.execution_count AS DECIMAL(28, 2)) AS [CPU Time Average (s)],
       qt.text AS [SQL_Statement],
       qs.query_hash AS [QueryHash],
       qp.query_plan AS [QueryPlan]
FROM sys.dm_exec_cached_plans cp WITH (NOLOCK)
CROSS APPLY sys.dm_exec_query_plan (cp.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text (cp.plan_handle) qt
INNER JOIN sys.dm_exec_query_stats qs ON qs.plan_handle = cp.plan_handle
WHERE qt.text NOT LIKE '%sys.%'
ORDER BY [CPU Time Average (s)] DESC
OPTION(RECOMPILE);
GO
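A hedged editorial note: EstimateRows in showplan XML can arrive in scientific notation (e.g. '1.2e+006'), which is why a direct varchar-to-bigint conversion fails; converting through float with TRY_CONVERT (SQL Server 2012+) sidesteps both the notation and stray values.

```sql
-- float copes with scientific notation; TRY_CONVERT returns NULL on junk.
SELECT TRY_CONVERT(float, '1.2e+006') - 1000000 AS skew_example;

-- Applied to the query above, the commented-out skew column becomes:
-- TRY_CONVERT(float, qp.query_plan.value('(//@EstimateRows)', 'varchar(128)'))
--     - qs.last_rows AS [Estimate_Skew]
```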
Fza (652 rep)
Nov 18, 2016, 04:35 PM • Last activity: Jul 15, 2025, 05:10 PM
0 votes
1 answer
172 views
Query plan XML: one stmt's DegreeOfParallelism is 1 for any maxdop other than 1 (maxdop 1 leads to DegreeOfParallelism = 0 with reason MaxDopSetToOne)
I have a procedure with two SELECTs. One statement respects the server's/database's/query hint's maxdop setting and uses parallelism; the other is being difficult and never going parallel. The 8 core server and this db are configured for maxdop 0 (not great, and I can change it, but I'd like to leave it and figure out what's going on). Cost threshold is 5. Auto update stats is on and stats on the table are showing as being updated earlier today. I may try updating the stats on the table manually after hours tonight.

This server and db are an AG secondary replica. The same procedure on the primary runs fast and both statements go parallel. The primary has more cores, and its maxdop is set explicitly to 8.

Statement 1 query hints:

- No query hint -> DegreeOfParallelism = 1
- Maxdop 0 -> DegreeOfParallelism = 1
- Maxdop 1 -> DegreeOfParallelism = 0 (reason MaxDopSetToOne)
- Maxdop 8 -> DegreeOfParallelism = 1
- Maxdop 7 -> DegreeOfParallelism = 1

Statement 2 query hints:

- No query hint -> DegreeOfParallelism = 8
- Maxdop 0 -> DegreeOfParallelism = 8
- Maxdop 1 -> DegreeOfParallelism = 0 (reason MaxDopSetToOne)
- Maxdop 8 -> DegreeOfParallelism = 8
- Maxdop 7 -> DegreeOfParallelism = 7

Thanks for any help!
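An editorial sketch, hedged: the plan XML records why a statement stayed serial in its NonParallelPlanReason attribute (the MaxDopSetToOne value quoted above comes from there), so pulling that attribute for both statements' cached plans may name the reason directly. The procedure-name filter is a hypothetical placeholder.

```sql
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT qt.text AS statement_text,
       qp.query_plan.value('(//QueryPlan/@NonParallelPlanReason)[1]', 'varchar(200)') AS non_parallel_reason,
       qp.query_plan.value('(//QueryPlan/@DegreeOfParallelism)[1]', 'int') AS degree_of_parallelism
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) qt
WHERE qt.text LIKE '%MyProcedure%';   -- hypothetical name
```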
Bobogator (95 rep)
Jan 24, 2022, 05:31 PM • Last activity: Jul 9, 2025, 12:05 PM
1 vote
2 answers
163 views
How to make Postgres use the correct index here?
Here's a test setup:
CREATE TABLE t (
    filename int,
    cropped bool not null default false,
    resized bool not null default false,
    create_date date not null default '1970-01-01'
);

INSERT INTO t
SELECT generate_series(1, 1000000);

UPDATE t SET cropped = true, resized = true
WHERE filename IN (SELECT filename FROM t ORDER BY random() LIMIT 900000);
UPDATE t SET resized = false
WHERE filename IN (SELECT filename FROM t WHERE cropped = true ORDER BY random() LIMIT 1000);

VACUUM FULL t;
ANALYZE t;
[Fiddle](https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f356fd56a920ea8a93c192f5a8c16b1c) Data now looks like this:
SELECT cropped, resized, count(*)
FROM t
GROUP BY 1,2;
[table: row counts per cropped/resized combination]

I create some partial indexes for other queries and one index specifically for the query below:
CREATE INDEX idx_cropped ON t(cropped) WHERE NOT cropped;
CREATE INDEX idx_resized ON t(resized) WHERE NOT resized;

CREATE INDEX specific ON t(cropped,resized) WHERE cropped AND NOT resized;
Now I run my query:
EXPLAIN ANALYZE
    SELECT count(*) FROM t WHERE cropped AND NOT resized AND create_date ...

 Index Scan using idx_resized on t  (cost=0.29..3777.71 rows=89415 width=0) (actual time=478.177..478.480 rows=1000 loops=1)
   Filter: (cropped AND (create_date ...

 Bitmap Heap Scan on t  (cost=35.50..11652.73 rows=89415 width=0) (actual time=0.054..0.250 rows=1000 loops=1)
   Recheck Cond: (cropped AND (NOT resized))
   Filter: (create_date ...
   ->  Bitmap Index Scan on specific  (cost=0.00..13.15 rows=89415 width=0) (actual time=0.040..0.040 rows=1000 loops=1)

which uses the index specific and completes in less than a ms on both dbfiddle.uk and my machine.

-----

Edit: Additional mystery - when I set the values not with an UPDATE but with a DEFAULT, then the correct index is chosen. What is going on? [Fiddle](https://dbfiddle.uk/?rdbms=postgres_14&fiddle=dc7d8aea14e90f08ab6537a855f34d8c)
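An editorial aside, hedged: the misestimate here (89415 estimated vs 1000 actual) comes from the planner multiplying the two boolean selectivities as if they were independent. On PostgreSQL 12+ a multi-column extended statistics object can teach it the correlation; a minimal sketch against the table t from the setup above:

```sql
-- Record the most-common combinations of the two flags, then re-plan:
CREATE STATISTICS t_crop_resized (mcv) ON cropped, resized FROM t;
ANALYZE t;

EXPLAIN ANALYZE
SELECT count(*) FROM t WHERE cropped AND NOT resized;
```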
AndreKR (607 rep)
Apr 18, 2022, 01:20 AM • Last activity: Jul 6, 2025, 04:06 PM
0 votes
1 answer
190 views
Is there any option to get actual execution plan as xml value to insert do different table?
Do you know if there is any option to get the actual execution plan

[screenshot: actual execution plan]

and insert this plan into another table? I would like to have a table with a history of actual plans.
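A hedged editorial sketch (SQL Server 2016+ with Query Store enabled; the history table is hypothetical): Query Store already persists captured plans as XML text, so a history table can be filled from it on a schedule. Strictly speaking these are compiled plan shapes without per-execution row counts; capturing true actual plans would need an Extended Events session such as query_post_execution_showplan.

```sql
-- Hypothetical history table:
CREATE TABLE dbo.PlanHistory (
    captured_at datetime2 NOT NULL DEFAULT SYSDATETIME(),
    query_id    bigint    NOT NULL,
    plan_id     bigint    NOT NULL,
    query_plan  xml       NULL
);

INSERT INTO dbo.PlanHistory (query_id, plan_id, query_plan)
SELECT p.query_id, p.plan_id, TRY_CONVERT(xml, p.query_plan)
FROM sys.query_store_plan AS p;
```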
grochu (45 rep)
Sep 7, 2021, 09:11 AM • Last activity: Jun 25, 2025, 02:01 AM
0 votes
1 answer
89 views
Gather Streams plan operator bad performance
How to explain and troubleshoot the "final" Gather Streams query plan operator, which takes seconds, while the previous operators complete within one second? These are the left-most plan operators. I can't share the full plan for security reasons (sorry about that).

[screenshot: left-most plan operators]

The plan metrics I follow don't look suspicious to me:

- Threads are pretty evenly distributed
- No prevalent wait type (as far as the plan info goes)
- Duration captured via Extended Event fits the Duration inside the plan stats

[screenshot: plan statistics]

Is it that the gathering itself is so time-consuming on that number of rows ("just" 100k+)?

**Additional info:** Here is the obfuscated query plan. SQL Sentry Plan Explorer gives a different view of the problematic operator, with costs by CPU, so this could be a wrong interpretation of numbers (which numbers?) by SSMS.

[screenshot: Plan Explorer view]

**Additional info #2:** The obfuscated query plan I provide comes from the *query_post_execution_showplan* XE. The query typically (in most executions) runs for about 1 second. Only some executions go beyond 5 seconds (my XE is filtered for Duration > 5s).
jericzech (977 rep)
Jun 17, 2025, 11:40 AM • Last activity: Jun 20, 2025, 07:06 PM
1 vote
0 answers
25 views
Dynamic sampling works differently than documented
Dynamic sampling in Oracle versions 12c to 19 is documented as functioning the same if it is enabled via the init parameter or by hinting. This does not seem to be true. I have tested this in customer databases, personal databases, on Linux and windows, and it always results in the same conclusion. When dynamic sampling at level 2 is only enabled via the init parameter, dynamic sampling occurs for objects without statistics, as advertised. When there is a /*+ dynamic_sampling(2) */ hint in the query, you get sampling for functions, multi-column joins and maybe more. Here is an example to demonstrate my point: - Connect to the database and query the sampling queries:
select sql_id, executions, sql_text from v$sql 
  where parsing_schema_name = user and sql_text like 'SAMPLESUB';
- Confirm the init parameter is set to 2:
  show parameter optimizer_dynamic_sampling;
- Put something into the plan table:
  explain plan for select * from user_tables;
- Look at the sampling queries again
- Query the plan table with no hints:
  select * from table(dbms_xplan.display);
- Look at the sampling queries again
- Query the plan table with the hint:
  select /*+ dynamic_sampling(2) */ * from table(dbms_xplan.display);
- Look at the sampling queries again; you will see more sampling queries after using the hint

Explain plans and optimizer traces provide much more concrete data to confirm this.
andrew123 (11 rep)
Jun 17, 2025, 07:43 PM • Last activity: Jun 17, 2025, 09:19 PM
14 votes
1 answer
2948 views
How does SQL Server's optimizer estimate the number of rows in a joined table?
I am running this query in the AdventureWorks2012 database:

SELECT s.SalesOrderID, d.CarrierTrackingNumber, d.ProductID, d.OrderQty
FROM Sales.SalesOrderHeader s
JOIN Sales.SalesOrderDetail d ON s.SalesOrderID = d.SalesOrderID
WHERE s.CustomerID = 11077

If I look at the estimated execution plan, I see the following:

[screenshot: estimated execution plan]

The initial index seek (top right) is using the IX_SalesOrderHeader_CustomerID index and searching on the literal 11077. It has an estimate of 2.6192 rows.

[screenshot: index seek details]

If I use DBCC SHOW_STATISTICS ('Sales.SalesOrderHeader', 'IX_SalesOrderHeader_CustomerID') WITH HISTOGRAM, it shows that the value 11077 is between the two sampled keys 11019 and 11091.

[screenshot: histogram]

The average number of distinct rows between 11019 and 11091 is 2.619718, or rounded to 2.61972, which is the value of estimated rows shown for the index seek.

The part I don't understand is the estimated number of rows for the clustered index seek against the SalesOrderDetail table.

[screenshot: clustered index seek details]

If I run DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', 'PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID'):

[screenshot: density vector]

So the density of SalesOrderID (which I am joining on) is 3.178134E-05. That means that 1/3.178134E-05 (31465) equals the number of unique SalesOrderID values in the SalesOrderDetail table. If there are 31465 unique SalesOrderIDs in SalesOrderDetail, then with an even distribution the average number of rows per SalesOrderID is 121317 (the total number of rows) divided by 31465, which is 3.85561.

So if the estimated number of rows to be looped through is 2.61972, and the average number to be returned is 3.85561, then I would think the estimated number of rows would be 2.61972 * 3.85561 = 10.10062. But the estimated number of rows is 11.4867. I think my understanding of the second estimate is incorrect, and the differing numbers seem to indicate that. What am I missing?
8kb (2639 rep)
Apr 2, 2015, 04:46 PM • Last activity: Jun 10, 2025, 08:26 AM
0 votes
1 answer
5079 views
Costly HASH MATCH Aggregate
I have a view which is slow when I query it with a simple select statement with a where clause. The where clause column is indexed (non-clustered index). Here is the view and the plan ([link: execution plan]):

CREATE VIEW [dbo].[CurrentIncidentStatus]
AS
SELECT [incidentDetails].[IncidentStatus],
       [incidentDetails].[IncidentStatus_FieldValue],
       [incidentDetails].[IncidentStatus_Description],
       Report.Id ReportId,
       Form.Id FormId
FROM [dbo].[IncidentDetailsPage_Incident] incidentDetails WITH (NOLOCK)
INNER JOIN [dbo].[IncidentDetailsPages] detailsPage WITH (NOLOCK)
        ON incidentDetails.PageId = detailsPage.Id
INNER JOIN Form WITH (NOLOCK)
        ON detailsPage.FormId = Form.Id
INNER JOIN Report WITH (NOLOCK)
        ON Form.ReportId = Report.Id
LEFT OUTER JOIN dbo.IncidentDetailsPage_Supplements supplement WITH (NOLOCK)
        ON detailsPage.Id = supplement.PageId
INNER JOIN (
    SELECT ReportId, Max(FormNumber) RecentFormNumber
    FROM (
        SELECT Report.Id ReportId, FormId, COALESCE(SupplementNumber, '0000') formNumber
        FROM dbo.IncidentDetailsPages detailPage WITH (NOLOCK)
        INNER JOIN Form WITH (NOLOCK) ON detailPage.FormId = Form.Id
        INNER JOIN Report WITH (NOLOCK) ON Form.ReportId = Report.Id
        INNER JOIN dbo.IncidentDetailsPage_Incident incident WITH (NOLOCK) ON detailPage.Id = incident.PageId
        LEFT OUTER JOIN dbo.IncidentDetailsPage_Supplements supplement WITH (NOLOCK) ON detailPage.Id = supplement.PageId
    ) FormNumbers
    GROUP BY ReportId
) RecentForm ON Report.Id = RecentForm.ReportId
            AND RecentForm.RecentFormNumber = COALESCE(supplement.SupplementNumber, '0000')
GO

I analyzed the plan in Sentry Plan Explorer and learned that the HASH MATCH (Aggregate) is the costly operation in the plan. I am trying to avoid/remove that hash operation somehow. If any of you have experienced a similar situation, please give some suggestions.

Additional info: The plan says the hash key is built on table Report and column Id. In fact, the Id column is the primary key in the Report table, so a clustered index is present, and a non-clustered index is also created for the Id column. Still, why is hashing required for Report.Id?

Here is a screenshot of the HASH MATCH (Aggregate) operation: [screenshot]

Info: When I select TOP 100 or 1000, it doesn't show HASH MATCH (Aggregate). Thanks
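A hedged editorial sketch, not from the thread: the Hash Match (Aggregate) comes from the MAX(...) GROUP BY derived table; a common rewrite under the same schema ranks the forms per report with ROW_NUMBER() so no hash aggregate is needed. Verify it returns the same rows before adopting it.

```sql
-- Candidate replacement for the RecentForm derived table:
SELECT ReportId, FormId, FormNumber
FROM (
    SELECT Report.Id AS ReportId,
           FormId,
           COALESCE(SupplementNumber, '0000') AS FormNumber,
           ROW_NUMBER() OVER (PARTITION BY Report.Id
                              ORDER BY COALESCE(SupplementNumber, '0000') DESC) AS rn
    FROM dbo.IncidentDetailsPages detailPage
    INNER JOIN Form   ON detailPage.FormId = Form.Id
    INNER JOIN Report ON Form.ReportId = Report.Id
    INNER JOIN dbo.IncidentDetailsPage_Incident incident
            ON detailPage.Id = incident.PageId
    LEFT JOIN dbo.IncidentDetailsPage_Supplements supplement
            ON detailPage.Id = supplement.PageId
) ranked
WHERE rn = 1;
```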
user9516827 (1345 rep)
Nov 1, 2018, 03:05 PM • Last activity: Jun 5, 2025, 06:07 AM
0 votes
1 answer
261 views
Query performance discrepancy between Azure and on premise SQL Server
I have the same database on both SQL Azure and on premise. Most of my queries run in similar time on the two instances but I have a handful of queries where the performance difference is on the order of 60x. I suspected that maybe I just didn't have a sufficiently large instance so I scaled to a 4 vCore instance with ~20 gig of memory. As my entire database is only 4.5 GB this seemed like it would be able to hold the entire database in memory. Turning on IO stats and running the query on the on premise box yields results like
Table 'tblCurrentPlateActivity'. Scan count 1, logical reads 201, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblViolationStatus'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblCostStructureTier'. Scan count 101, logical reads 401, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblViolation'. Scan count 474, logical reads 2106236, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblCostStructure'. Scan count 0, logical reads 190, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblProvinceState'. Scan count 0, logical reads 200, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblGroup'. Scan count 0, logical reads 200, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#GroupIDs___________________________________________________________________________________________________________00000000011F'. Scan count 1, logical reads 100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblLot'. Scan count 0, logical reads 215, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#LotIDs_____________________________________________________________________________________________________________000000000120'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#PlateProvinceStateIDs______________________________________________________________________________________________000000000122'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#ProvinceStatesIDs__________________________________________________________________________________________________000000000121'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblViolationType'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#LotDate____________________________________________________________________________________________________________00000000011E'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see all reads are logical and the larges culprit is tblViolation which I would expect. On SQL Azure I see similar results for most tables
Table 'tblCurrentPlateActivity'. Scan count 1, logical reads 201, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblViolationStatus'. Scan count 1, logical reads 300, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblCostStructureTier'. Scan count 101, logical reads 401, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 200, logical reads 797, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblViolation'. Scan count 1, logical reads 2082323, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblCostStructure'. Scan count 0, logical reads 190, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#LotDate____________________________________________________________________________________________________________000000000072'. Scan count 1, logical reads 1336728, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblProvinceState'. Scan count 0, logical reads 1336728, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblGroup'. Scan count 0, logical reads 1336728, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#GroupIDs___________________________________________________________________________________________________________000000000073'. Scan count 1, logical reads 668364, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblLot'. Scan count 0, logical reads 1378509, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#PlateProvinceStateIDs______________________________________________________________________________________________000000000076'. Scan count 1, logical reads 668364, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#ProvinceStatesIDs__________________________________________________________________________________________________000000000075'. Scan count 1, logical reads 673785, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#LotIDs_____________________________________________________________________________________________________________000000000074'. Scan count 1, logical reads 871241, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'tblViolationType'. Scan count 0, logical reads 1347570, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
The query plans generated are pretty different, likely because the on-premise server is 2012. But looking at the query execution in SQL Azure I see some odd discrepancies, for example the number of rows:

[screenshot: number of rows]

Now, #tblLot is generated as part of the query and I know that it only has those 474 records in it. What could be causing such a weird discrepancy and how would I solve it? I imagine the query plan is super inefficient because of this miscalculation of the number of rows. The on-premise database shows the rows correctly. I'm happy to include query plans here but I'm not really sure what the best way is to do that. Export the .sqlplan?

[screenshot: execution plan]

**Update** I reduced the query down to a simple query which shows some of the same performance issues:
SELECT top 100 Case when ISNULL(cpa.tow, 0) = 0 then '-'			
			else 'TW'  	
			end as PlateActivityFormatted,			
			Case when v.ViolationStatusID = 6 then 1
			else 0 end  AS IsOverdueToday,
			v.ViolationStatusID AS OriginalViolationStatusID,				
			v.ViolationStatusID AS RevisedViolationStatusID,
			vs.StatusShortName AS RevisedViolationStatusDescription,  
			tblGroup.Name AS LotGroupName, 
			cpa.tow As PlateTowing, 
			[tblProvinceState].Name as ProvinceState, 
			[tblLot].TaxNumber, 
			[tblLot].Name LotName, 
			v.* 
	   FROM [tblViolation] v INNER JOIN 
			[ViolationFinanceInformation] on [ViolationFinanceInformation].ViolationID = v.ID INNER JOIN 
			[tblViolationType] on [tblViolationType].ID = v.ViolationTypeID JOIN 
			[tblLot] ON v.LotID = [tblLot].ID JOIN 
			[tblGroup] ON [tblLot].GroupID = [tblGroup].ID JOIN 
			[tblProvinceState] ON [tblLot].ProvinceStateID = [tblProvinceState].ID LEFT JOIN 
			tblViolationStatus vs ON vs.StatusID = v.ViolationStatusID LEFT JOIN 
			tblCurrentPlateActivity cpa ON cpa.vendorId = tblGroup.vendorid and cpa.plateNumber = v.plateNumber 
			INNER JOIN #LotDate on #LotDate.ID = tblLot.ID 
		where Issued >= #LotDate.Start
Both databases perform equally well without the where clause, returning in sub-1s, but as soon as the where clause is added, the on-premise results remain about the same while the Azure results increase to 22 seconds. I put the good and bad query plans here:

Good: https://www.brentozar.com/pastetheplan/?id=SyijStC88
Bad: https://www.brentozar.com/pastetheplan/?id=BymeUK0UI
https://1drv.ms/u/s!AmN_cJhXT0rP4WuYMib6bhCMoPon?e=YvFOQb
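An editorial sketch, hedged: a wildly wrong row estimate on a temp table often means the statement was compiled before the table was populated, so two cheap experiments are refreshing the temp table's statistics and forcing a statement-level recompile (the COUNT wrapper below is just to keep the sketch short; join columns follow the reduced query above):

```sql
UPDATE STATISTICS #LotDate;

SELECT COUNT(*)
FROM tblViolation v
INNER JOIN #LotDate ld ON ld.ID = v.LotID
WHERE v.Issued >= ld.Start
OPTION (RECOMPILE);   -- recompile with the temp table's real cardinality (474 rows)
```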
stimms (353 rep)
Mar 29, 2020, 02:31 PM • Last activity: Jun 2, 2025, 10:04 PM
0 votes
1 answer
879 views
Performance issues after upgrading to Mysql 8.0.19 version
A few queries are performing very poorly after the upgrade. Here is an example.

**Example 1:**

SELECT x.CI_TITLE 'ciTitle',
       x.DEPARTMENT_MEANING 'departmentMeaning',
       x.RESIDUAL_FINANCIAL_IMPACT_LOSS 'residualFinancialImpactLoss',
       x.INHERENT_CONFIDENTIALITY 'inherentConfidentiality',
       x.PRIMARY_FLAG 'primaryFlag',
       x.MEDIA_IMPACT_COMMENT 'mediaImpactComment',
       x.OPEN_CHALLENGE 'openChallenge'
FROM omb_risk_latest_eval_v x
WHERE x.COMPANY_ID = 187
ORDER BY x.LAST_UPDATE_DATE DESC
LIMIT 30 OFFSET 0

The above query takes an average of 1.2 sec in 5.7.24; the same query takes 26 sec in 8.0.19. Here is what I have tried:

1) All the tables have indexes present.
2) Rebuilt all the tables in the 8 version using ALTER TABLE ... ENGINE=InnoDB.
3) Checked the optimizer switch variables and compared both versions. Found the below 3 were extra:
   skip_scan=on
   hash_join=on
   Tried disabling them, but no impact.
4) Googled many forums for this kind of issue. They suggested the below, but these didn't work either:
   condition_fanout_filter=off
   internal_tmp_mem_storage_engine=MEMORY
   Suggestions came from these sites:
   http://dimitrik.free.fr/blog/archives/2018/04/mysql-performance-testing-80-with-less-blood.html
   https://www.percona.com/blog/2018/05/04/how-binary-logs-affect-mysql-8-0-performance/

Also, we are confident that there could be some server-side settings which should be disabled explicitly in the 8.x version of MySQL, as these queries perform very well in 5.7. We wanted to share the execution plans for both versions, but we are not sure how, as we only have them as images.

Any suggestions/help please...
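A hedged editorial sketch: before toggling more switches, compare what the 8.0 optimizer actually decided. EXPLAIN ANALYZE (available from 8.0.18) reports per-step timings, and the optimizer trace shows the plan choice in detail:

```sql
-- 8.0.18+: executes the statement and reports where the time goes.
EXPLAIN ANALYZE
SELECT x.CI_TITLE
FROM   omb_risk_latest_eval_v x
WHERE  x.COMPANY_ID = 187
ORDER  BY x.LAST_UPDATE_DATE DESC
LIMIT  30 OFFSET 0;

-- Optimizer trace for the same statement:
SET optimizer_trace = 'enabled=on';
SELECT x.CI_TITLE FROM omb_risk_latest_eval_v x
WHERE x.COMPANY_ID = 187
ORDER BY x.LAST_UPDATE_DATE DESC LIMIT 30 OFFSET 0;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace = 'enabled=off';
```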
Satya Ammu (1 rep)
Sep 24, 2021, 01:24 PM • Last activity: May 30, 2025, 06:03 PM
0 votes
1 answer
250 views
Why is below 2 query generating different execution plan in MYSQL?
The only difference between the two queries below is the ORDER BY column. The result set of both queries is the same. However, the second query's execution plan shows 'Using temporary; Using filesort' for the ordering, whereas the first query does not require a filesort. Apart from id, I have indexes on c.table_b_id and c.some_indexed_id.

SELECT * FROM TableA a
inner join TableB b on a.id = b.id
inner join TableC c on c.table_b_id = b.id
where c.some_indexed_id = 1
order by c.table_b_id asc;

SELECT * FROM TableA a
inner join TableB b on a.id = b.id
inner join TableC c on c.table_b_id = b.id
where c.indexed_id = 1
order by b.id asc;
sagar (1 rep)
Apr 21, 2021, 06:01 AM • Last activity: May 21, 2025, 12:06 PM
1 vote
1 answer
309 views
Testing the estimated query plan of a portion of a trigger that uses inserted/deleted
I'm performance tuning a SQL Server trigger that has several statements using the "inserted" and "deleted" tables. How can I break out those individual queries from the trigger for testing in order to see their estimated execution plans? The "inserted" and "deleted" tables have no context outside of a trigger. I want to make sure that it mimics the behavior of the trigger as closely as possible. The trigger I'm testing is INSTEAD OF DELETE. I was hoping not to actually delete the record so that I could easily test the before and after without deleting the row.
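A hedged editorial sketch (dbo.MyTable and the predicate are hypothetical stand-ins): materialize look-alikes of the pseudo-tables, then paste each trigger statement with deleted/inserted replaced by the stand-ins and inspect its estimated plan. One caveat: temp tables carry statistics while the trigger pseudo-tables do not, so the plans may not match the trigger's exactly.

```sql
-- Capture the rows the DELETE would target, without deleting anything:
SELECT *
INTO #deleted
FROM dbo.MyTable
WHERE Id = 42;            -- hypothetical predicate

-- For an INSTEAD OF DELETE trigger, "inserted" is empty; an empty
-- stand-in with the right shape:
SELECT TOP (0) * INTO #inserted FROM dbo.MyTable;

-- Now run each statement from the trigger body with "deleted" replaced
-- by #deleted (and "inserted" by #inserted) and view its estimated plan.
```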
skeletank (301 rep)
Nov 16, 2022, 03:52 PM • Last activity: May 16, 2025, 09:04 AM
2 votes
1 answer
289 views
random_page_cost and memoize plan relation
Postgres 15.2. I have the following simple query:

SELECT mark.last_modified
FROM mark
INNER JOIN element ON mark.element_id = element.id
INNER JOIN model ON element.model_id = model.id
WHERE model.brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'
ORDER BY mark.last_modified DESC
LIMIT 1;

My DB is configured with random_page_cost = 3.66 (should be closer to 1, I know). I analyze the query and notice it uses memoize, with the following plan:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=20338.19..30531.32 rows=1 width=8) (actual time=10588.047..10588.137 rows=0 loops=1)
Buffers: shared hit=1209004 read=8403
I/O Timings: shared/local read=8475.963
->  Nested Loop  (cost=20338.19..1335251.72 rows=129 width=8) (actual time=10588.046..10588.135 rows=0 loops=1)
        Buffers: shared hit=1209004 read=8403
        I/O Timings: shared/local read=8475.963
        ->  Nested Loop  (cost=20337.90..1326623.17 rows=278958 width=24) (actual time=101.060..10457.145 rows=277857 loops=1)
            Buffers: shared hit=1205893 read=8403
            I/O Timings: shared/local read=8475.963
            ->  Gather Merge  (cost=20337.32..52826.58 rows=278958 width=24) (actual time=101.016..183.845 rows=277857 loops=1)
                    Workers Planned: 2
                    Workers Launched: 0
                    Buffers: shared hit=8396
                    ->  Sort  (cost=19337.30..19627.88 rows=116232 width=24) (actual time=100.708..147.960 rows=277857 loops=1)
                        Sort Key: mark.last_modified DESC
                        Sort Method: quicksort  Memory: 29655kB
                        Buffers: shared hit=8396
                        ->  Parallel Seq Scan on mark  (cost=0.00..9558.33 rows=116232 width=24) (actual time=0.008..43.455 rows=277857 loops=1)
                                Buffers: shared hit=8396
            ->  Memoize  (cost=0.57..5.24 rows=1 width=32) (actual time=0.037..0.037 rows=1 loops=277857)
                    Cache Key: mark.element_id
                    Cache Mode: logical
                    Hits: 36677  Misses: 241180  Evictions: 0  Overflows: 0  Memory Usage: 33916kB
                    Buffers: shared hit=1197497 read=8403
                    I/O Timings: shared/local read=8475.963
                    ->  Index Scan using activity_pkey on element  (cost=0.56..5.23 rows=1 width=32) (actual time=0.041..0.041 rows=1 loops=241180)
                        Index Cond: (id = mark.element_id)
                        Buffers: shared hit=1197497 read=8403
                        I/O Timings: shared/local read=8475.963
        ->  Memoize  (cost=0.29..0.31 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=277857)
            Cache Key: element.model_id
            Cache Mode: logical
            Hits: 276820  Misses: 1037  Evictions: 0  Overflows: 0  Memory Usage: 82kB
            Buffers: shared hit=3111
            ->  Index Scan using model_pkey on model  (cost=0.28..0.30 rows=1 width=16) (actual time=0.005..0.005 rows=0 loops=1037)
                    Index Cond: (id = element.model_id)
                    Filter: (brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'::uuid)
                    Rows Removed by Filter: 1
                    Buffers: shared hit=3111
Planning:
Buffers: shared hit=1045
Planning Time: 1.985 ms
Execution Time: 10598.922 ms
(43 rows)
However, when I change random_page_cost to a lower value (=1.1), it changes the plan and no longer uses memoize:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=16808.48..16808.49 rows=1 width=8) (actual time=425.764..425.765 rows=0 loops=1)
Buffers: shared hit=348291 read=212
I/O Timings: shared/local read=173.481
->  Sort  (cost=16808.48..16808.81 rows=129 width=8) (actual time=425.763..425.764 rows=0 loops=1)
        Sort Key: mark.last_modified DESC
        Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=348291 read=212
I/O Timings: shared/local read=173.481
        ->  Nested Loop  (cost=1.27..16807.84 rows=129 width=8) (actual time=425.738..425.739 rows=0 loops=1)
            Buffers: shared hit=348288 read=212
            I/O Timings: shared/local read=173.481
            ->  Nested Loop  (cost=0.84..12253.80 rows=10110 width=16) (actual time=0.041..55.468 rows=111456 loops=1)
                    Buffers: shared hit=14132
                    ->  Index Scan using model_brand_id_95f0c5ac on model  (cost=0.28..4.33 rows=3 width=16) (actual time=0.028..0.033 rows=3 loops=1)
                        Index Cond: (brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'::uuid)
                        Buffers: shared hit=4
                    ->  Index Scan using element_model_id_c798104e on element  (cost=0.56..4043.31 rows=3984 width=32) (actual time=0.009..14.143 rows=37152 loops=3)
                        Index Cond: (model_id = model.id)
                        Buffers: shared hit=14128
            ->  Index Scan using mark_element_id_4d370815 on mark  (cost=0.42..0.44 rows=1 width=24) (actual time=0.003..0.003 rows=0 loops=111456)
                    Index Cond: (element_id = element.id)
                    Buffers: shared hit=334156 read=212
                    I/O Timings: shared/local read=173.481
Planning:
Buffers: shared hit=514
Planning Time: 0.649 ms
Execution Time: 425.799 ms
(27 rows)
Besides the fact that the query performance is now better, I would like to understand the relation between random_page_cost and the lack of memoize in the plan.
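A hedged editorial sketch: the relation can be probed directly in one session, since the cost parameter and the Memoize node can be toggled independently (enable_memoize exists from PostgreSQL 14):

```sql
-- Cheaper random I/O: see whether the plan flips on cost alone.
SET random_page_cost = 1.1;
EXPLAIN (ANALYZE, BUFFERS)
SELECT mark.last_modified
FROM mark
JOIN element ON mark.element_id = element.id
JOIN model   ON element.model_id = model.id
WHERE model.brand_id = '9cb22c38-af8a-4347-988e-1b2287122d39'
ORDER BY mark.last_modified DESC
LIMIT 1;
RESET random_page_cost;

-- Rule memoize out explicitly at the original cost, then re-run the
-- same EXPLAIN to see which plan wins without it:
SET enable_memoize = off;
-- (re-run the EXPLAIN above)
RESET enable_memoize;
```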
Cowabunga (145 rep)
Jun 16, 2023, 04:31 PM • Last activity: May 13, 2025, 05:03 PM
0 votes
1 answer
759 views
What exactly happens when optimizer_prune_level = 1?
This documentation link states that optimizer_prune_level

>Controls the heuristics applied during query optimization to prune less-promising partial plans from the optimizer search space.

Can anyone help with the knowledge of what kind of heuristic principles are applied? Consider this simple schema:
CREATE TABLE college (
  colg_id int(11) NOT NULL AUTO_INCREMENT,
  colg_name varchar(20) DEFAULT NULL,
  colg_address varchar(20) DEFAULT NULL,
  avg_fees int(11) DEFAULT NULL,
  PRIMARY KEY (colg_id)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4

CREATE TABLE department (
  dept_id int(11) NOT NULL AUTO_INCREMENT,
  dept_name varchar(20) DEFAULT NULL,
  dept_address varchar(20) DEFAULT NULL,
  dept_hod varchar(20) DEFAULT NULL,
  colg_id int(11) DEFAULT NULL,
  PRIMARY KEY (dept_id),
  KEY colg_id_fk (colg_id),
  KEY dept_name (dept_name),
  CONSTRAINT department_ibfk_1 FOREIGN KEY (colg_id) REFERENCES college (colg_id)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8mb4

CREATE TABLE student (
  stud_id int(11) NOT NULL AUTO_INCREMENT,
  stud_name varchar(20) DEFAULT NULL,
  stud_address varchar(20) DEFAULT NULL,
  dept_id int(11) DEFAULT NULL,
  year int(11) DEFAULT NULL,
  PRIMARY KEY (stud_id),
  KEY dept_id_fk (dept_id),
  CONSTRAINT student_ibfk_1 FOREIGN KEY (dept_id) REFERENCES department (dept_id)
) ENGINE=InnoDB AUTO_INCREMENT=1201 DEFAULT CHARSET=utf8mb4
Relation college contains 10 rows, department 100 rows and student 1100 rows. For the following query
select * from college c, department d, student s where c.colg_id = d.colg_id and d.dept_id = s.dept_id;
the optimizer trace is attached: trace.json. Going off the trace (with my perception abilities), the optimizer has even tried calculating the cost of joining college directly with student. A cross-product join order is also considered among others. So is cross-product avoidance not part of the heuristics, at least for a small number of joins? Do the heuristics change with the number of joins? If yes, how, and are there any examples?
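A hedged editorial sketch: the pruning can be observed empirically by disabling it and diffing the traces; with optimizer_prune_level = 0 and full search depth, the optimizer costs every join order, so whatever appears only in the second trace is what the heuristic skipped.

```sql
SET optimizer_trace = 'enabled=on';

-- Default: heuristic pruning on.
SET optimizer_prune_level = 1;
EXPLAIN SELECT * FROM college c, department d, student s
WHERE c.colg_id = d.colg_id AND d.dept_id = s.dept_id;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

-- Exhaustive search: no pruning, maximum depth.
SET optimizer_prune_level = 0;
SET optimizer_search_depth = 62;
EXPLAIN SELECT * FROM college c, department d, student s
WHERE c.colg_id = d.colg_id AND d.dept_id = s.dept_id;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```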
Arun S R (207 rep)
Aug 21, 2019, 10:03 AM • Last activity: May 11, 2025, 08:07 PM
17 votes
1 answer
1711 views
Union does not always eliminate duplicates
I have the following query and expect that as a result I will have a list of IDs without duplicates. But sometimes it produces duplicates (1-2 per 4 million rows). Why can this happen? I run it with the default (read committed) isolation level. I can't use tablock/serializable because these are OLTP tables with hundreds of changes.

```
CREATE TABLE #characterId
(
    CharacterId BIGINT
)

DECLARE @dateTimeFrom DATETIME = '2025-05-04'
      , @dateTimeTo DATETIME = '2025-05-07'

INSERT INTO #characterId (CharacterId)
SELECT Id
FROM table1 u
WHERE u.DateUpdated >= @dateTimeFrom AND u.DateUpdated ...
    ... = @dateTimeFrom AND usi.DateUpdated ...
    ... = @dateTimeFrom AND ust.DateCreated ...
```

... (Build 20348: ) (Hypervisor)

Plan with UNION ALL instead of first UNION:

[screenshot: execution plan]
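A hedged editorial experiment, not a diagnosis: run the same batch under snapshot isolation so every branch of the statement reads one consistent version of the data; if the duplicates disappear, concurrent modification during the statement is implicated. This requires row-versioning to be enabled on the database (name hypothetical).

```sql
-- One-time database setting:
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then run the original batch under a consistent snapshot:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
    -- INSERT INTO #characterId (CharacterId) SELECT ... (statement from above)
COMMIT;
```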
Novitskiy Denis (331 rep)
May 6, 2025, 07:33 AM • Last activity: May 8, 2025, 06:09 PM
Showing 20 questions (page 1)