Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
-1
votes
0
answers
20
views
How can I enable partition elimination to reduce the number of partitions read in my procedure?
Question : How can I enable partition elimination so it runs faster and scans fewer partitions?
I'm working with a large partitioned table and trying to improve performance by ensuring only the relevant partition is scanned. However, the execution plan shows all 590 partitions are being accessed.
Environment:
- Azure SQL Database (PaaS)
- Table is partitioned by Created using RANGE RIGHT
- Clustered primary key on (Id, Created)
The procedure:
```sql
CREATE PROCEDURE dbo.GetNextPage
(
    @Next  BIGINT,
    @Limit INT
)
AS
BEGIN
    SELECT TOP (@Limit)
           [RowId], [RecordKey], [GroupId], [SourceSystem],
           [SchemaVersion], [RecordType], [Payload], [CreatedDate]
    FROM dbo.Event
    WHERE RowId > @Next
    ORDER BY RowId ASC;
END
```
The query returns results, but the execution plan shows it seeks all partitions.
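For context, partition elimination generally requires a predicate on the partitioning column in the WHERE clause. A minimal sketch of the kind of change that enables it, assuming `Created` is the partitioning column as described above (the `@MinCreated` parameter is hypothetical, not part of the original procedure):

```sql
-- Hypothetical variant: the added predicate on Created (the partitioning
-- column) is what lets the optimizer restrict the seek to a few partitions.
CREATE OR ALTER PROCEDURE dbo.GetNextPage
(
    @Next       BIGINT,
    @Limit      INT,
    @MinCreated DATETIME2   -- assumed extra parameter
)
AS
BEGIN
    SELECT TOP (@Limit)
           [RowId], [RecordKey], [GroupId], [SourceSystem],
           [SchemaVersion], [RecordType], [Payload], [CreatedDate]
    FROM dbo.Event
    WHERE RowId > @Next
      AND Created >= @MinCreated   -- predicate on the partitioning column
    ORDER BY RowId ASC;
END
```

Without such a predicate the engine has no way to know which partitions can be skipped, since `RowId` alone says nothing about `Created`.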

dexon
(65 rep)
Aug 3, 2025, 03:16 PM
0
votes
1
answers
115
views
Did performance when querying partitioned tables using min/max functions or TOP improve after SQL Server 2022?
With partitioned tables in SQL Server, there is a [notorious major performance issue](https://web.archive.org/web/20130412223317/https://connect.microsoft.com/SQLServer/feedback/details/240968/partition-table-using-min-max-functions-and-top-n-index-selection-and-performance) when using min/max functions or `TOP`. Microsoft documents workarounds for it [here](https://learn.microsoft.com/en-us/troubleshoot/sql/database-engine/performance/decreased-performance-run-aggregating-clause). I am confident that this was not fixed in SQL Server 2022. Microsoft surely would have updated the workaround list if giving them more money was a workaround.
However, was this changed after SQL Server 2022? I am sure that I saw a working link to [this Connect item](https://connect.microsoft.com/SQLServer/feedback/details/240968/partition-table-using-min-max-functions-and-top-n-index-selection-and-performance) in 2024. Today, I cannot find it even on the modern [Azure suggestions thing](https://feedback.azure.com/d365community/forum/04fe6ee0-3b25-ec11-b6e6-000d3a4f0da0) that all of the Connect items were migrated to. This suggests to me that something has happened with this decade-old bug in the last few years.
I cannot answer this myself, since I do not have access to SQL Server 2025 or any bleeding-edge Azure offerings, although I hear that preview builds for SQL Server 2025 have been released.
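For readers unfamiliar with the issue, the documented workaround has roughly this shape: take the aggregate within each partition so each probe can use the index, then combine the per-partition results (table, function, and column names below are placeholders):

```sql
-- Sketch of the documented workaround: enumerate partition numbers from
-- sys.partitions, compute MAX within each partition, then combine the
-- per-partition results with one cheap outer MAX.
SELECT MAX(mx.PartitionMax) AS OverallMax
FROM sys.partitions AS p
CROSS APPLY (
    SELECT MAX(t.SomeColumn) AS PartitionMax
    FROM dbo.MyPartitionedTable AS t
    WHERE $PARTITION.MyPartitionFunction(t.PartitionCol) = p.partition_number
) AS mx
WHERE p.object_id = OBJECT_ID('dbo.MyPartitionedTable')
  AND p.index_id <= 1;
```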
J. Mini
(1225 rep)
Feb 2, 2025, 12:47 PM
• Last activity: Aug 2, 2025, 10:54 PM
3
votes
2
answers
499
views
Partitioning heaps - Why?
I know, at least theoretically, that heap tables can be partitioned. As I learn more about table partitioning, I realise that the use case is very limited. Similarly, the use case for heaps in SQL Server is also very limited. Whenever I intersect my list of heap use cases and partitioning use cases, I find myself with an empty list.
Are there any sets of circumstances where a partitioned heap would appear strongly desirable?
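For concreteness, a partitioned heap is nothing more than a table created on a partition scheme without a clustered index; a minimal sketch with made-up names:

```sql
-- Illustrative only: a staging table as a partitioned heap. With no
-- clustered index the table is a heap, yet rows are still routed to
-- partitions by LoadDate (e.g. for partition switching of bulk loads).
CREATE PARTITION FUNCTION pfByYear (date)
    AS RANGE RIGHT FOR VALUES ('2024-01-01', '2025-01-01');

CREATE PARTITION SCHEME psByYear
    AS PARTITION pfByYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.StagingLoad
(
    LoadDate date NOT NULL,
    Payload  varchar(max) NULL
) ON psByYear (LoadDate);
```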
J. Mini
(1225 rep)
Jul 26, 2025, 11:37 PM
• Last activity: Jul 28, 2025, 04:48 AM
3
votes
1
answers
474
views
On PostgreSQL 9.5, does partitioning a table prevent autovacuum from automatically running ANALYZE?
I'm trying to grasp the following sentence from https://www.postgresql.org/docs/9.5/static/sql-analyze.html
> The autovacuum daemon, however, will only consider inserts or updates on the parent table itself when deciding whether to trigger an automatic analyze for that table. If that table is rarely inserted into or updated, the inheritance statistics will not be up to date unless you run ANALYZE manually.
Since partitioning is usually implemented through inheritance, I initially thought this sentence implied that, because the master table of a partitioned table is empty, autovacuum would never trigger an automatic ANALYZE on the whole partitioned table (including all children), leaving the planner with no statistics at all for the partitions.
On the other hand, "that table" seems to refer only to the parent (master) table. If this second reading is correct, autovacuum does trigger ANALYZE automatically on the child tables and simply ignores the empty parent table. There would be no need to run ANALYZE manually in this case.
Which of the two interpretations is the right one?
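Whichever reading holds, the usual hedge is a periodic manual ANALYZE on the parent; a sketch assuming a parent table named `measurement`:

```sql
-- Gathers the inheritance statistics (parent plus all children) that, per
-- the quoted docs, autovacuum may never trigger while the parent itself
-- sees no inserts or updates.
ANALYZE measurement;

-- Whether autovacuum has been analyzing the children on its own can be
-- checked from the statistics views:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE 'measurement%';
```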
ARX
(1509 rep)
Aug 24, 2016, 10:55 PM
• Last activity: Jul 26, 2025, 06:03 PM
0
votes
1
answers
920
views
MySQL - Move data between partitions aka re-partition
I have a mysql table whose partitions look as below
- p2015h1 - contains data where date < 2015-07-01 (has data only from 2015-06-01 onwards, hence just a month's worth of data)
- p2015h2 - contains data where date < 2016-01-01
- p2016h1 - contains data where date < 2016-07-01
- p2016h2 - contains data where date < 2017-01-01
I'd like the new partitions to be quarterly based as below -
- p0 - contains data where date < 2015-10-01
- p1 - contains data where date < 2016-01-01
- p2 - contains data where date < 2016-04-01
- p3 - contains data where date < 2016-07-01
I started by reorganizing the first partition & executed the below command. All went well.
```sql
alter table table1
reorganize partition p2015half1
into (partition p0 values less than ('2015-10-01'));
```
Now, as the existing partition `p2015h2` contains some data from before 2015-10-01, how could I move this part into the partition `p0`? I would need to do the same thing with the other partitions too as I continue building the new ones.
I did try removing partitioning from the table entirely, but the table is billions of rows in size, so that operation would take days, and rebuilding the partitions afterwards would take days again. Hence I decided to take the approach of splitting partitions.
I'm stuck at this point. I'd fully appreciate any guidance here, please.
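Since `REORGANIZE PARTITION` accepts several adjacent partitions at once and redistributes their rows across the new boundaries, one approach (a sketch, untested against your exact schema; adjust partition names) is to rebuild `p0` together with `p2015h2`:

```sql
-- Rows from p0 and p2015h2 are redistributed so that everything before
-- 2015-10-01 lands in the new p0 and the rest of 2015 lands in p1.
ALTER TABLE table1
REORGANIZE PARTITION p0, p2015h2 INTO (
    PARTITION p0 VALUES LESS THAN ('2015-10-01'),
    PARTITION p1 VALUES LESS THAN ('2016-01-01')
);
```

After that, `p2016h1` can be split on its own into `p2` and `p3`, since its boundaries already align with the quarterly layout; each step rewrites only the partitions named in it.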
usert4jju7
(101 rep)
Jun 25, 2016, 04:59 AM
• Last activity: Jul 26, 2025, 08:06 AM
0
votes
1
answers
759
views
Creating and dropping partition in PostgreSQL "on the fly"?
In our web application we have a PostgreSQL database. Users write ecology forecasts into this database and delete them from it. Because the amount of data is significant (more than 100 GB), we use declarative partitioning for the largest tables. The tables are partitioned by forecast identifier. Partitions are created and dropped "on the fly" when users create or delete forecasts. However, I doubt whether creating partitions in this way is a good idea.
**EDITED.** We do not create partitions from a `BEFORE INSERT` trigger. We create or drop the partition on the backend after the user starts creating or deleting a forecast on the frontend of our web application.
**EDITED 2.** The backend of our web application is a web server that works with a PostgreSQL 12 database.
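For reference, the kind of DDL the backend presumably issues (a sketch with assumed names, list-partitioned by forecast identifier):

```sql
-- When a forecast with id 1234 is created:
CREATE TABLE forecast_data_1234
    PARTITION OF forecast_data
    FOR VALUES IN (1234);

-- When that forecast is deleted:
DROP TABLE forecast_data_1234;
```

Both statements need an exclusive lock on the parent table in PostgreSQL 12, which is worth keeping in mind if forecasts are created and deleted concurrently with heavy query traffic.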
ScienceKitten
(1 rep)
Sep 16, 2020, 10:08 PM
• Last activity: Jul 25, 2025, 03:03 PM
2
votes
1
answers
147
views
Performance, RAM improvement with partitions
I have a table like:
```sql
create table table_a
(
    id_a  mediumint unsigned not null,
    id_b  tinyint unsigned not null,
    id_c  char(10) not null,
    value tinyint unsigned not null,
    TYPE  enum ('a', 'b', 'c') not null,
    DATE  date not null,
    constraint `unique`
        unique (id_b, id_c, id_a, TYPE, DATE),
    constraint fk_id_b_id_c
        foreign key (id_b, id_c) references table_b (id_b, id_c)
            on delete cascade,
    constraint id_a
        foreign key (id_a) references table_c (ID)
            on delete cascade
);

create index table_a_value_index
    on table_a (value);

create index table_a_new
    on table_a (id_b, id_c, DATE, id_a);

create index mks_date
    on table_a (DATE);
```
Now I would like to create partitions, but I am not sure how they will impact the RAM usage or the performance.
I have tried the partitions like:
```sql
ALTER TABLE table_a
PARTITION BY RANGE (to_days(`date`)) (
    PARTITION 2021_H1 VALUES LESS THAN (to_days('2021-07-01')),
    PARTITION 2021_H2 VALUES LESS THAN (to_days('2022-01-01')),
    PARTITION 2022_H1 VALUES LESS THAN (to_days('2022-07-01')),
    PARTITION current VALUES LESS THAN (MAXVALUE)
);
```
But that did not bring the expected RAM or performance improvement.
My research has shown that smaller partitions could be better, but that you should not have more than 50 partitions. If I partitioned by month, I could store the last 4 years within the recommended maximum of 50 partitions, which would be enough.
But how much would that impact my RAM usage and/or the performance?
As far as I understand, the partitions are treated as separate tables; does that mean that each partition has its own indexes? The table has a size of 20 GB+, but the indexes are 40 GB+. It would be beneficial to reduce the loaded index size.
The most used indexes are *unique* and *table_a_new*. The filter for *date* is a specific date or a range of 6 months.
It is fine that I will lose my foreign keys.
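Before committing to a layout, it is worth verifying that the intended queries actually prune; in MySQL 8 the `partitions` column of `EXPLAIN` shows which partitions a query will touch (table and partition names as in the `ALTER` above):

```sql
-- If pruning works, the partitions column of the EXPLAIN output should list
-- only 2022_H1 for this six-month range on the partitioning column.
EXPLAIN SELECT *
FROM table_a
WHERE DATE >= '2022-01-01'
  AND DATE <  '2022-07-01';
```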
Zystrix
(45 rep)
Dec 5, 2022, 11:16 AM
• Last activity: Jul 17, 2025, 03:09 AM
0
votes
1
answers
158
views
Adding FK to a newly attached partition slow
We have a partitioned table from which we attach and detach partitions for archiving purposes. When attempting to add a foreign key on the partitioned table, the process is taking a long time (up to 20 minutes) which is causing many problems.
We have discussed using the `NOT VALID` option, but unfortunately it is not available for partitioned tables, so I wanted to see whether there are any options for adding a foreign key to a partition.
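One workaround sometimes used (a sketch with assumed names, not a drop-in fix): declare the foreign key on each leaf partition instead of on the partitioned parent. On a plain table `NOT VALID` is accepted, so the expensive validation can be deferred:

```sql
-- Adding the constraint NOT VALID is quick; it only checks new rows.
ALTER TABLE orders_2024_01
    ADD CONSTRAINT orders_2024_01_customer_fk
    FOREIGN KEY (customer_id) REFERENCES customers (id) NOT VALID;

-- VALIDATE scans the partition but takes a much weaker lock than an
-- ADD CONSTRAINT with immediate validation would.
ALTER TABLE orders_2024_01 VALIDATE CONSTRAINT orders_2024_01_customer_fk;
```

The trade-off is that the constraint no longer lives on the parent, so every newly attached partition needs the same treatment.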
Krishnp92
(19 rep)
Jul 16, 2024, 12:50 PM
• Last activity: Jul 10, 2025, 10:06 PM
0
votes
1
answers
60
views
Truncate partitions with foreign key to a different partitioned table
I am using PostgreSQL with two tables, called records and flags, that are partitioned exactly the same -- by range. I want to truncate old partitions in both.
We use pg_partman, and they are created as so (toy example):
```sql
CREATE TABLE logging.records (
    id BIGSERIAL,
    record VARCHAR,
    PRIMARY KEY (id)
) PARTITION BY RANGE (id);

CREATE TABLE logging.records_template (LIKE logging.records);
ALTER TABLE logging.records_template ADD PRIMARY KEY (id);

SELECT logging.create_parent(
      p_parent_table := 'logging.records'
    , p_control := 'id'
    , p_interval := '10'
    , p_premake := '20'
    , p_default_table := false
    , p_template_table := 'logging.records_template');

CREATE TABLE logging.flags(
    id BIGSERIAL PRIMARY KEY,
    record_id BIGSERIAL,
    archived BOOLEAN,
    FOREIGN KEY (record_id) REFERENCES logging.records(id) ON DELETE CASCADE
) PARTITION BY RANGE (id);

SELECT logging.create_parent(
      p_parent_table := 'logging.flags'
    , p_control := 'id'
    , p_interval := '10'
    , p_premake := '20'
    , p_default_table := false);
```
If I:
```sql
TRUNCATE logging.records_p0;
```
I get the error:
> ERROR: cannot truncate a table referenced by a foreign key constraint

If I use `CASCADE`, then it truncates all partitioned flags, which I do not want.
If I:
```sql
TRUNCATE logging.flags_p0;
```
it works. But of course that leaves logging.records full of stuff.
If I:
```sql
ALTER TABLE logging.records DETACH PARTITION logging.records_p0 CONCURRENTLY;
```
I get the error:
> ERROR: removing partition "records_p0" violates foreign key constraint "flags_record_id_fkey2"
If I:
```sql
ALTER TABLE logging.flags DETACH PARTITION logging.flags_p0 CONCURRENTLY;
ALTER TABLE logging.records DETACH PARTITION logging.records_p0 CONCURRENTLY;
TRUNCATE logging.flags_p0;
TRUNCATE logging.records_p0;
```
it works. However, at this point I may as well just drop the tables instead of truncating them, since they won't be used again. And detaching brings a deadlock issue, because while all this is going on there is heavy activity on other partitions from inserts and updates.
In my ideal world I could just truncate with cascade and it would wipe out the partition in records and the corresponding one in flags, without having to detach.
Is there another way?
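One possible direction (a sketch, not tested against pg_partman; it assumes each `flags` partition only ever references ids covered by the `records` partition with the same suffix): put the foreign key on each matching pair of leaf partitions instead of on the partitioned parents. Cleaning up a pair then involves no other partitions:

```sql
-- Per-pair foreign key between corresponding leaf partitions:
ALTER TABLE logging.flags_p0
    ADD FOREIGN KEY (record_id)
    REFERENCES logging.records_p0 (id) ON DELETE CASCADE;

-- Truncating both tables in one statement satisfies the FK check:
TRUNCATE logging.flags_p0, logging.records_p0;
```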
Thank you
zhackwyatt
(3 rep)
Jul 3, 2025, 07:25 PM
• Last activity: Jul 7, 2025, 11:28 PM
0
votes
1
answers
481
views
Performance impact when querying without partition key on a partitioned table
Assuming I partition a table of large size (say 6TB) by month. We are serving very high traffic of around 300 RPS.
Sample schema:
```sql
Create table mydata(
    id int not null,
    date_created datetime not null,   -- type assumed; omitted in the original
    uuid_val char(36) not null,       -- type assumed; omitted in the original
    Primary key (id, date_created),
    key uuid_idx (uuid_val))
Partition by range(to_days(date_created));
```
What will be the performance impact if I query this table without date_created in the WHERE clause? It certainly won't perform better, as partition pruning can't happen, but what will the impact be if I query via the uuid_val index, compared to a non-partitioned table with the same schema?
I tried benchmarking with 6 GB of data, querying 500,000 times by the index rather than the partition key, and the overhead of searching multiple partitions seemed very small (0.1 ms) compared to running the same query on the non-partitioned table.
But why is the overhead so small? And can I expect similar numbers with 6 TB of data too, when serving high traffic?
Can I query the partitioned table without the partition key, as I might not have it in the flow?
My MySQL version is 8.
Harshita Mishra
(1 rep)
Oct 7, 2023, 11:34 AM
• Last activity: Jul 1, 2025, 01:07 PM
3
votes
1
answers
181
views
How do distributed databases follow unique constraints?
Let's say I have an application where users can register, and the username has to be a unique value.
Now let's say I have `N partitions`, and for each partition I have `M replicas with multiple leaders`.
Now I have questions regarding these scenarios:
First:
1) User 1 attempts to register with username `user1` - the write request gets routed to `partition1` and to `leader1`
2) User 2 attempts to register with username `user1` - the write request gets routed to the same `partition1` and also to `leader1`
In this scenario the behavior is the same as if we had just one database: the first transaction occurs and the second one fails, since the `user1` value is already there and we are operating on the same replica.
Second:
1) User 1 attempts to register with username `user1` - the write request gets routed to `partition1` and to `leader1`
2) User 2 attempts to register with username `user1` - the write request gets routed to the same `partition1` but to `leader2`
In this case we have a concurrent write. How does the database determine which registration fails and which does not? We can look at this as a no-partition, multiple-leader setup, and as far as I have researched, the typical solution is to either 1) prevent this by doing the first scenario, or 2) merge the values, which is not acceptable in this case, or solve conflicts at the application level, which is also not acceptable. How do DBs deal with this?
Third:
1) User 1 attempts to register with username `user1` - the write request gets routed to `partition1` and to `leader1`
2) User 2 attempts to register with username `user1` - the write request gets routed to a different `partition2` and to `leader3`
In this case the writes go to different partitions (it makes sense to me that this will probably not happen in real life, since both writes carry the same value and thus should be routed to one partition). How would the DB resolve which registration succeeds and which fails? How would it take locks, or check whether the value already exists, and so on?
The more I read about distributed DBs and how they work (even at a high level), the more confused I get.
Thanks for answers!
Johnyb
(131 rep)
May 11, 2023, 01:39 PM
• Last activity: Jun 30, 2025, 03:02 PM
0
votes
2
answers
614
views
Trying to find a way fit an extremely large index in to memory
**Description:**
I am having trouble with the ever-growing data size in my MySQL database.
I am using Ejabberd with the MAM function, which uses an `archive` table to store messages sent between users, and the table keeps growing. It now takes over 10 seconds to run a query like
```sql
SELECT COUNT(*) FROM archive
WHERE username = ''
  AND bare_peer = ''
  AND timestamp >= ''
  AND timestamp <= ''
```
These are very common SQL that would execute thousands of times each day, and since the SQL are executed from within Ejabberd, I cannot change the syntax.
**Current Situation:**
- Instance specification: 8-core CPU, 64 GB RAM
- innodb_buffer_pool_size: 49392123904 bytes (roughly 49 GB)
- With reference to this post, I got a result of 1005383M (roughly 1 TB) as the estimated memory requirement.
- The archive table size: 700 GB of data and ~200 GB of index, around 0.9 billion rows (yes, a lot of rows)
- Here is the table creation SQL:
```sql
CREATE TABLE `archive` (
  `username` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `timestamp` bigint(20) unsigned NOT NULL,
  `peer` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `bare_peer` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `xml` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `txt` text COLLATE utf8mb4_unicode_ci,
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `kind` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `nick` varchar(191) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY `id` (`id`),
  KEY `i_username` (`username`) USING BTREE,
  KEY `i_timestamp` (`timestamp`) USING BTREE,
  KEY `i_peer` (`peer`) USING BTREE,
  KEY `i_bare_peer` (`bare_peer`) USING BTREE,
  FULLTEXT KEY `i_text` (`txt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
```
**Discussion:**
With the above information, one thing I could think of is to:
- Partition the archive table by primary key (RANGE, per 5 million rows), but from my understanding, since MySQL doesn't support fulltext indexes on partitioned tables, I would be required to drop the fulltext index on the txt column, which I think is OK.
- Unfortunately, since MySQL can only partition on the primary key, and I cannot change the SQL, I cannot make the queries use the partitions directly. What I could do is drop whole partitions regularly and keep the remaining index size small enough to fit into memory as much as possible.
I am posting to seek a second opinion on the following:
1. Is this the best I can do given the above limitations?
2. If so, how can I partition such a big table without downtime, possibly by using pt-online-schema-change?
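For what it's worth, a sketch of the layout under discussion (boundary values illustrative; as noted, the FULLTEXT index has to be dropped first, and with pt-online-schema-change this would be the `--alter` payload):

```sql
-- RANGE partitioning on the auto-increment id: retention then becomes a
-- metadata-only DROP PARTITION instead of a huge DELETE. This works here
-- because `id` is the table's only unique key, and MySQL requires every
-- unique key to include the partitioning column.
ALTER TABLE archive
PARTITION BY RANGE (id) (
    PARTITION p0   VALUES LESS THAN (100000000),
    PARTITION p1   VALUES LESS THAN (200000000),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- Periodic cleanup of the oldest rows:
ALTER TABLE archive DROP PARTITION p0;
```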
Thank you all for your time.
Vincent Lam
(3 rep)
Jun 7, 2020, 03:51 AM
• Last activity: Jun 27, 2025, 11:09 AM
1
votes
1
answers
210
views
Partition by hash on two levels
I'm trying to partition a large table by a generated ID (unfortunately it comes from MongoDB, so the first few bytes are not evenly distributed).
I would like to shard it (using a foreign data wrapper) on this ID.
Then within the shard partition again on this ID.
How can I set up the hash so that the data within each shard is evenly distributed?
If we could use a function like RIGHT, I would shard by the 24th char, then partition by the 23rd. Or if it were a UUID, I would do the same but with char 1 and char 2.
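In PostgreSQL this can be sketched with two levels of hash partitioning on the same column (assumed names; `mongo_id` stands for the MongoDB ObjectId string). The built-in hash function already spreads skewed input uniformly, so there is no need to pick particular characters; the only subtlety is that within a shard holding hash remainder r (mod 4), an inner modulus of 8 can only ever see remainders r and r+4:

```sql
CREATE TABLE events (
    mongo_id char(24) NOT NULL,
    payload  text
) PARTITION BY HASH (mongo_id);

-- Shard 0 of 4 (in practice this could be a foreign table on another node),
-- itself hash-partitioned on the same column:
CREATE TABLE events_s0 PARTITION OF events
    FOR VALUES WITH (MODULUS 4, REMAINDER 0)
    PARTITION BY HASH (mongo_id);

-- Within shard 0, only remainders 0 and 4 (mod 8) can occur, so two
-- sub-partitions cover it:
CREATE TABLE events_s0_p0 PARTITION OF events_s0
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE events_s0_p4 PARTITION OF events_s0
    FOR VALUES WITH (MODULUS 8, REMAINDER 4);
```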
KevinA
(111 rep)
Sep 27, 2021, 04:44 PM
• Last activity: Jun 21, 2025, 01:06 PM
0
votes
1
answers
218
views
Using ENUM as partition key when RANGE Partitioning PostgreSQL
I am working with PostgreSQL 14. I have a table called `Order` with a status column:
```sql
CREATE TABLE "Order" (
    "orderId" BIGSERIAL PRIMARY KEY NOT NULL,
    "orderDescription" TEXT NOT NULL,
    "statusId" SMALLINT NOT NULL
) PARTITION BY RANGE ("statusId");
```
As shown, this table is partitioned by `RANGE` based on the `statusId`. We need to place orders that have moved to a closed status in a separate partition. To achieve this, I implemented a simple trick: I defined a table called `OrderStatus` and assigned open statuses in the range of 10 to 20 and closed statuses in the range of 20 to 30:
```sql
CREATE TABLE "OrderStatus" (
    "statusId" SMALLINT PRIMARY KEY NOT NULL,
    "statusName" VARCHAR UNIQUE NOT NULL
);

INSERT INTO "OrderStatus" ("statusId", "statusName") VALUES (11, 'WAITING');   -- open
INSERT INTO "OrderStatus" ("statusId", "statusName") VALUES (12, 'OPEN');      -- open
INSERT INTO "OrderStatus" ("statusId", "statusName") VALUES (13, 'CANCELING'); -- open
INSERT INTO "OrderStatus" ("statusId", "statusName") VALUES (14, 'SENDING');   -- open
INSERT INTO "OrderStatus" ("statusId", "statusName") VALUES (21, 'FINISHED');  -- closed
INSERT INTO "OrderStatus" ("statusId", "statusName") VALUES (22, 'CANCELED');  -- closed
```
Based on this, the partitions are defined as follows:
```sql
CREATE TABLE "Order_Open" PARTITION OF "Order" FOR VALUES FROM (10) TO (20);
CREATE TABLE "Order_Close" PARTITION OF "Order" FOR VALUES FROM (20) TO (30);
```
So far, everything works as expected.
Now, we are undergoing a system refactor, and one of the proposed changes is to convert the order statuses to an enum type. However, if we do this, then since the numeric values of enums (the `enumsortorder` column in the `pg_enum` catalog) are assigned by PostgreSQL, it seems that the trick we used for categorizing open and closed statuses will no longer be applicable.
Here are the potential solutions I have considered, each with its own explanation, but I'm not sure if they are the best approaches:
1. Add a new boolean column `isOpen` to indicate whether the order is open or closed, and partition the table based on this column. I feel that having an additional column for this information is not ideal.
2. Manipulate the `enumsortorder` values of the enum in the `pg_enum` catalog to retain our categorization trick (I haven't tried this yet to see if it's possible). This approach seems risky and not quite right to me.
3. Change the partitioning type to `LIST`. While this does not pose significant issues, if we need to add a new value to the enum in the future, we would have to change the table definitions, which is not appealing.
4. Abandon the idea of using an enum altogether and still use the `statusId` of the `OrderStatus` table. I feel that for cases where we need to partition a table based on a column, an enum might not be suitable, especially for `RANGE` partitioning.
Maybe there's a best practice for this situation that I'm not aware of. If anyone can provide insights or advice on this, I would greatly appreciate it.
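If it helps the discussion, option 3 can be sketched as follows (names assumed; the drawback mentioned above is visible here: a new enum value must also be added to one of the `FOR VALUES IN` lists):

```sql
CREATE TYPE "orderStatus" AS ENUM
    ('WAITING', 'OPEN', 'CANCELING', 'SENDING', 'FINISHED', 'CANCELED');

CREATE TABLE "Order" (
    "orderId" BIGSERIAL NOT NULL,
    "orderDescription" TEXT NOT NULL,
    "status" "orderStatus" NOT NULL,
    PRIMARY KEY ("orderId", "status")   -- the PK must include the partition key
) PARTITION BY LIST ("status");

CREATE TABLE "Order_Open" PARTITION OF "Order"
    FOR VALUES IN ('WAITING', 'OPEN', 'CANCELING', 'SENDING');

CREATE TABLE "Order_Close" PARTITION OF "Order"
    FOR VALUES IN ('FINISHED', 'CANCELED');
```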
Mofarah
(35 rep)
Jun 22, 2024, 01:06 PM
• Last activity: Jun 20, 2025, 10:03 AM
0
votes
1
answers
196
views
Index not used in a partitioned Oracle table
I have a table partitioned by list:
```sql
create table ICT_ENCOURS_BRUT
(
   code_exercice VARCHAR2(4),
   CODE_PRODUIT  VARCHAR2(2),
   .......
)
partition by list (CODE_EXERCICE)
(
   partition CODE_EXERCICE_2018 values ('2018')
      tablespace DATA_BI_PARTITION_2018,
   partition CODE_EXERCICE_2019 values ('2019')
      tablespace DATA_BI_PARTITION_2019,
   partition CODE_EXERCICE_2020 values ('2020')
      tablespace DATA_BI_PARTITION_2020,
   partition CODE_EXERCICE_2021 values ('2021')
      tablespace DATA_BI_PARTITION_2021
);
```
and I have 2 indexes:
```sql
CREATE INDEX EKIPIC.IDX_CODE_EXERCICE ON EKIPIC.ICT_ENCOURS_BRUT
(CODE_EXERCICE)
TABLESPACE INDEX_BI
LOGGING
LOCAL (
    PARTITION CODE_EXERCICE_2018
        NOCOMPRESS
        TABLESPACE INDEX_BI_PARTITION_2018,
    PARTITION CODE_EXERCICE_2019
        NOCOMPRESS
        TABLESPACE INDEX_BI_PARTITION_2019,
    PARTITION CODE_EXERCICE_2020
        NOCOMPRESS
        TABLESPACE INDEX_BI_PARTITION_2020,
    PARTITION CODE_EXERCICE_2021
        NOCOMPRESS
        TABLESPACE INDEX_BI_PARTITION_2021
)
ONLINE;

create index IDX_CODE_PRODUIT on ICT_ENCOURS_BRUT (CODE_PRODUIT)
    tablespace INDEX_BI;
```
The explain plan shows a full scan when I perform:
```sql
SELECT * from ICT_ENCOURS_BRUT
where code_produit = 'LOA'
```
```
-----------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows   | Bytes     | Cost  | Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  | 812259 | 242053182 | 32268 | 00:07:32 |
|   1 |  PARTITION LIST ALL|                  | 812259 | 242053182 | 32268 | 00:07:32 |
| * 2 |   TABLE ACCESS FULL| ICT_ENCOURS_BRUT | 812259 | 242053182 | 32268 | 00:07:32 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
* 2 - filter("CODE_PRODUIT"='LOA')
```
The same happens when I try:
```sql
select * from ICT_ENCOURS_BRUT i where i.code_exercice = '2019'
```
```
Plan Hash Value : 2304474508

--------------------------------------------------------------------------------------------
| Id | Operation             | Name             | Rows    | Bytes     | Cost  | Time     |
--------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT      |                  | 1604226 | 474850896 | 12511 | 00:02:56 |
|  1 |  PARTITION LIST SINGLE|                  | 1604226 | 474850896 | 12511 | 00:02:56 |
|  2 |   TABLE ACCESS FULL   | ICT_ENCOURS_BRUT | 1604226 | 474850896 | 12511 | 00:02:56 |
--------------------------------------------------------------------------------------------
```
Why aren't the indexes used, assuming the statistics are up to date?
Merouan Ball
(1 rep)
Oct 1, 2021, 03:40 PM
• Last activity: Jun 19, 2025, 02:10 PM
0
votes
1
answers
210
views
How do I improve the performance of this particular delete function? Is table Partitioning the answer?
I am taking over a database that is... not great. When this DB was originally built, it was built without too much understanding about the volumes of data that were going to be involved. The initial estimates were off by... a lot.
So, this DB is extremely simple. It has one chunky table that gets written to consistently by a number of application hosts. This DB has one way replication and sits behind a keepalived VIP. We have a BI host that logs in once an hour and reads the data entered in the last hour.
Once a day, the BI host runs a delete on all data that is older than 7 days. We are required to keep the data for 7 days, but not for longer.
The problem is that we were expecting the DB to get thousands of lines per day, instead we are getting millions and millions. Space is not an issue, but the delete is now taking 20-25 minutes, during which time the application hosts are not able to write to the DB as the table is locked for the delete.
I need to improve this DB's performance and I am unsure how to do it PROPERLY. I am not a DBA by trade; we are a small shop and my experience in database usage has kind of landed me this particular task.
I am thinking that some form of table partitioning on this table might work, but I have never really implemented this on a production system (I have a staging host I can use though for testing).
If I were to create a system of partitions, would I make 31 partitions, one for each day of the month, and then use a date function to store the record (which has a datetime field of course) and segregate the data day by day?
Is it reasonable to create 31 partitions on a table in mysql 8 for this kind of utilization? Should I do a mod 10 and 10 partitions instead?
Would this improve the speed of my delete, or at least make the rest of the table unlocked?
Am I missing some super obvious system such as adding a (NOLOCK) hint to my delete?
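A sketch of the partition-by-day idea described above (MySQL 8; table and column names assumed, and note that MySQL requires the partitioning column to be part of the primary key, so the PK may need to become a composite like (id, created_at)):

```sql
-- Daily RANGE partitions on the datetime column: the retention job becomes
-- a metadata-only DROP PARTITION, which does not lock the rest of the table
-- the way a multi-million-row DELETE does.
ALTER TABLE big_table
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p20230309 VALUES LESS THAN (TO_DAYS('2023-03-10')),
    PARTITION p20230310 VALUES LESS THAN (TO_DAYS('2023-03-11')),
    PARTITION pmax      VALUES LESS THAN (MAXVALUE)
);

-- Daily job: drop the partition that fell outside the 7-day window
-- (and add a new partition for the coming day).
ALTER TABLE big_table DROP PARTITION p20230309;
```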
Thank you all in advance for any help.
Mark
M.Barbara
(11 rep)
Mar 9, 2023, 07:53 AM
• Last activity: Jun 15, 2025, 03:04 AM
2
votes
2
answers
455
views
Does the Query Optimizer use Incremental Statistics yet?
I was considering using incremental statistics in my data warehouse, but I read an article by Erin Stellato that says the query optimizer doesn't use incremental statistics. The article was written in May of 2015 and I haven't seen anything recanting her position in the following 6 years. There are a number of articles in different communities that show how to set it up, but if it's not useful, why bother? Does anyone know if subsequent versions of the query optimizer in 2016, 2017, or 2019 support the use of incremental statistics? If not, should we even use them? If they won't help the engine make a good decision about how to query a table with 10 billion records, what good are they? Thanx for any help!
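For reference, the setup being discussed looks like the sketch below; whether the optimizer uses the per-partition histograms is the open question, but the maintenance benefit (refreshing only changed partitions) is independent of that. Table, column, and partition numbers are placeholders:

```sql
-- Sketch: enable incremental statistics (names are hypothetical).
ALTER DATABASE CURRENT SET AUTO_CREATE_STATISTICS ON (INCREMENTAL = ON);

CREATE STATISTICS st_FactSales_OrderDate
ON dbo.FactSales (OrderDate)
WITH INCREMENTAL = ON;

-- Refresh statistics for only the partitions that changed:
UPDATE STATISTICS dbo.FactSales (st_FactSales_OrderDate)
WITH RESAMPLE ON PARTITIONS (42);
</imports>
```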
Mike Lemay
(57 rep)
Oct 22, 2021, 04:32 PM
• Last activity: Jun 8, 2025, 06:49 PM
1
votes
1
answers
756
views
SQL query to get the names of the partitions which are detached from the parent table in postgres?
I used [pg_partman][1] to create and drop partitions from my 2 postgres tables. Since there existed a foreign key relations among the tables, I couldn't drop the tables using partman(does not allow). However, I was able to detach the older partitioned tables(data older than 30 days) from the parent...
I used pg_partman to create and drop partitions from my 2 Postgres tables. Since there were foreign key relations among the tables, I couldn't drop the tables using pg_partman (it does not allow this).
However, I was able to detach the older partitions (data older than 30 days) from the parent table. Now, in order to clean up these tables, I need to get the names of the detached tables.
What SQL query would help me achieve this?
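One possible approach, sketched below: detached partitions usually keep pg_partman's naming pattern but no longer have a row in `pg_inherits`, so they can be listed from the system catalogs. The `parent_table_p%` pattern is a placeholder for the actual parent table's name:

```sql
-- Sketch: list tables that match the partition naming pattern but are
-- no longer attached to any parent (no row in pg_inherits).
SELECT n.nspname AS schema_name,
       c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'                       -- plain tables
  AND c.relname LIKE 'parent_table_p%'      -- placeholder pattern
  AND NOT EXISTS (
        SELECT 1 FROM pg_inherits i WHERE i.inhrelid = c.oid
  );
```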
Kushagra Kumar
(11 rep)
Jun 24, 2024, 07:15 PM
• Last activity: Jun 6, 2025, 07:03 PM
0
votes
1
answers
268
views
Optimizing Cassandra Table Design for Queries Involving Two Partition Keys
I’m designing a Cassandra table and need advice on the best schema design to handle my query patterns efficiently. The primary queries are as follows: 1. 90% of the queries involve both account_id and actor_id. 2. 10% of the queries involve only account_id. Given these requirements, I initially thou...
I’m designing a Cassandra table and need advice on the best schema design to handle my query patterns efficiently. The primary queries are as follows:
1. 90% of the queries involve both account_id and actor_id.
2. 10% of the queries involve only account_id.
Given these requirements, I initially thought of using both account_id and actor_id as partition keys. Here are my specific questions and concerns:
Current Considerations and Schema
**Option 1**: Composite Partition Key
CREATE TABLE search_table (
account_id UUID,
actor_id UUID,
other_columns ...,
PRIMARY KEY ((account_id, actor_id))
);
• Pros: Efficient for queries with both account_id and actor_id.
• Cons: Inefficient for queries with only account_id since it requires scanning multiple partitions.
**Option 2**: Single Partition Key with Clustering Column
CREATE TABLE search_table (
account_id UUID,
actor_id UUID,
other_columns ...,
PRIMARY KEY (account_id, actor_id)
);
- Pros: Efficient for both types of queries without requiring ALLOW FILTERING
- Cons: Risk of very large partitions due to billions of records.
**Considering Bucketing**
To manage partition sizes effectively, I considered using bucketing:
CREATE TABLE search_table (
bucket_id INT,
account_id UUID,
actor_id UUID,
other_columns ...,
PRIMARY KEY ((bucket_id, account_id), actor_id)
);
• Pros: Balanced partition sizes, efficient for both query types.
• Cons: Requires calculating bucket_id in application logic and might add complexity.
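With the bucketed schema above, the two query patterns could look like the sketch below, assuming a fixed bucket count of 8 (the count is a placeholder, and the bucket derivation is one possible choice):

```sql
-- bucket_id is computed in the application, e.g. hash(actor_id) mod 8,
-- and must be derived the same way on write and read.
INSERT INTO search_table (bucket_id, account_id, actor_id)
VALUES (3, ?, ?);

-- A query with both ids hits exactly one partition:
SELECT * FROM search_table
WHERE bucket_id = 3 AND account_id = ? AND actor_id = ?;

-- An account-only query fans out across all 8 buckets:
SELECT * FROM search_table
WHERE bucket_id IN (0, 1, 2, 3, 4, 5, 6, 7) AND account_id = ?;
```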
**Using IN Clause for Efficient Queries**
If both account_id and actor_id are partition keys:
SELECT * FROM search_table
WHERE account_id = ? AND actor_id IN (?, ?, ...);
• Pros: Simplifies querying multiple actor_id values within the same account_id.
• Cons: Performance overhead if the list of actor_id values is large.
Questions
1. What is the best table design to handle my query patterns efficiently?
2. Is using both account_id and actor_id as partition keys recommended, or should I rely on a different approach like bucketing?
3. How do IN queries perform in Cassandra when used with partition keys, and are there practical limits on their usage?
Any insights, recommendations, or alternative strategies to manage partition sizes while maintaining efficient queries would be greatly appreciated!
Chillax
(131 rep)
May 22, 2024, 06:55 PM
• Last activity: Jun 6, 2025, 04:04 AM
0
votes
1
answers
261
views
Error "No catalog entry found for partition ID ..."
When trying to delete disabled index or enable the same disabled index on one heavily utilised table, I am getting an error: No catalog entry found for partition ID 72057594967228416 in database 6. The metadata is inconsistent. Run DBCC CHECKDB to check for a metadata corruption. The database has di...
When trying to delete disabled index or enable the same disabled index on one heavily utilised table, I am getting an error:
No catalog entry found for partition ID 72057594967228416 in database 6. The metadata is inconsistent. Run DBCC CHECKDB to check for a metadata corruption.
The database has different filegroups (one for table, one for indexes, ...) but no partitioning, no partition schemes, no partition functions. I can flawlessly create different index in the same filegroup with no error.
DBCC CHECKDB('myDB') WITH ALL_ERRORMSGS, NO_INFOMSGS
returns exactly **0** errors.
My question: is the metadata damaged, as in this similar DBA Stack Exchange question, or is there a way out?
Thanks!
Sranda
(360 rep)
Jun 28, 2024, 08:28 AM
• Last activity: Jun 5, 2025, 10:04 PM