
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 vote
0 answers
26 views
Postgres Aurora index creation using too many resources
Is there a way to slow down index creation or tune it somehow to use fewer resources? A bit of context: we have a pretty big database with 200M rows in the root table, and some tables probably have billions of rows due to one-to-many relations. Of course, we create indexes with CONCURRENTLY, and recently, without any config changes, index creation behaviour changed a lot. For example, a month ago it took ~12 hours to create an index, but the last 3 indexes (they were not created together; a few days passed between each) were created in 2-5 minutes. Which is great, but during that time CPU spiked to 80% (it is usually around 15-20%) and shared memory went from 2-3 GB to almost 200 GB. The number of maintenance workers is 2, and worker memory is 4 GB. Is there a way to make index creation slower but less resource-hungry?
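A minimal sketch (not from the original post) of the session-level settings that usually bound index-build resource use in PostgreSQL; the table, column, and index names are hypothetical:

SET maintenance_work_mem = '1GB';            -- caps sort memory for this build
SET max_parallel_maintenance_workers = 0;    -- no parallel workers for this session
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);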
Bogdan Dubyk (111 rep)
Jul 21, 2025, 10:22 AM • Last activity: Jul 21, 2025, 11:06 AM
0 votes
1 answer
898 views
How do I truncate performance_schema tables on a read replica?
On occasion you have to truncate events_statements_summary_by_digest when it hits the row limit. On a write master this is easy, but on a read replica there doesn't seem to be a way to do it: the read-only enforcement extends to the performance_schema tables. Since the data are created locally, it seems to me there should be a way to exempt those tables (or even the entire performance_schema DB) from the read-only enforcement. I have already set performance_schema_events_statements_history_long_size to 10000, which is getting pretty large. I could set it larger (well, if it weren't read-only), but the size of that table affects overall system performance, and I'd rather not just crank the limit to something obscene - especially when so many of the entries in the history haven't been used in a month or two. It's better to toss out the garbage.
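A sketch of the statements involved, assuming the goal is just to clear that one summary table; on a replica, the variable check shows whether read_only/super_read_only is what blocks the TRUNCATE:

SHOW VARIABLES LIKE '%read_only%';
TRUNCATE TABLE performance_schema.events_statements_summary_by_digest;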
Sniggerfardimungus (101 rep)
Apr 18, 2018, 08:47 PM • Last activity: Jun 28, 2025, 04:03 PM
0 votes
1 answer
1084 views
What is the difference between Aurora MySQL 5.7.x and Aurora 2.x.x?
I have an RDS instance with Aurora MySQL 5.7.12 and I noticed that there are other versions of Aurora MySQL 5.7, but I don't understand the difference between them. Some of the versions listed are: - Aurora (MySQL)-5.7.12 <- The one that I have. - Aurora (MySQL 5.7)-2.03.2 - Aurora (MySQL 5.7)-2.03.3 - Aurora (MySQL 5.7)-2.03.4 - Aurora (MySQL 5.7)-2.04.1 - Etc... (screenshot of the AWS Console showing the different versions of Aurora MySQL 5.7). I want to choose the right version (the one that has the most bug fixes) in order to have better stability in my system.
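A quick way to see what a running instance actually reports (an assumption worth verifying on your engine: AURORA_VERSION() is an Aurora-MySQL-specific function, not standard MySQL):

SELECT VERSION();          -- the MySQL compatibility version, e.g. 5.7.12
SELECT AURORA_VERSION();   -- the Aurora engine version, e.g. 2.03.x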
Roberto Briones Argüelles (1 rep)
Apr 22, 2020, 12:12 AM • Last activity: Jun 4, 2025, 01:04 AM
0 votes
1 answer
270 views
Amazon Aurora MySQL conditional comment queries
How does Amazon Aurora handle conditional queries based on MySQL version? I was unable to find any documentation on this. For example, the code below inserts into a database for MySQL 5.6.4 or newer. What is the result in Aurora? I could spin up an instance, but I wanted to read the documentation about this to understand what incompatible features there are. /*!50604 REPLACE INTO phppos_app_config (key, value) VALUES ('supports_full_text', '1')*/;
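Not from the original post: a minimal way to see how a version-conditioned comment behaves on a given instance; the literal string is hypothetical, and what matters is the MySQL version the Aurora instance reports, which the /*!NNNNN */ check is compared against:

SELECT VERSION();
/*!50604 SELECT 'executed only when the server reports 5.6.4 or newer' */;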
Chris Muench (711 rep)
Jan 13, 2018, 06:27 PM • Last activity: May 23, 2025, 03:04 PM
1 vote
0 answers
53 views
Aurora PostgreSQL Severe Performance Degradation Under Concurrent Load
**Environment:**
- Database: AWS Aurora PostgreSQL
- ORM: SQLAlchemy
- API Framework: Python FastAPI

**Issue:**
I'm experiencing significant query performance degradation when my API receives concurrent requests. I ran a performance test comparing single execution vs. concurrent execution of the same query, and the results are concerning.

**Real-World Observations:**
When monitoring our production API endpoint during load tests with 100 concurrent users, I've observed concerning behavior:
- When running the same complex query through PGAdmin without concurrent load, it consistently completes in ~60ms
- However, during periods of high concurrency (100 simultaneous users), response times for this same query become wildly inconsistent:
  - Some executions still complete in 60-100ms
  - Others suddenly take up to 2 seconds
  - No clear pattern to which queries are slow

**Test Results:**
- Single query execution time: 0.3098 seconds
- Simulating 100 concurrent clients - all requests starting simultaneously...
- Results Summary:
  - Total execution time: 32.7863 seconds
  - Successful queries: 100 out of 100
  - Failed queries: 0
  - Average query time: 0.5591 seconds (559ms)
  - Min time: 0.2756s, Max time: 1.9853s
  - Queries exceeding 500ms threshold: 21 (21.0%)
  - 50th percentile (median): 0.3114s (311ms)
  - 95th percentile: 1.7712s (1771ms)
  - 99th percentile: 1.9853s (1985ms)
- With 100 concurrent threads:
  - Each query takes ~12.4x longer on average (3.62s vs 0.29s)
  - Huge variance between fastest (0.5s) and slowest (4.8s) query
  - Overall throughput is ~17.2 queries/second (better than sequential, but still concerning)

**Query Details:**
The query is moderately complex, involving: several JOINs across multiple tables, a subquery using EXISTS, ORDER BY and LIMIT clauses.

**My Setup**

**SQLAlchemy Configuration:**
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession

engine = create_async_engine(
    settings.ASYNC_DATABASE_URL,
    echo=settings.SQL_DEBUG,
    pool_pre_ping=True,
    pool_use_lifo=True,
    pool_size=20,
    max_overflow=100,
    pool_timeout=30,
    pool_recycle=30,
)

AsyncSessionLocal = async_sessionmaker(
    bind=engine,
    class_=AsyncSession,
    expire_on_commit=False,
    autocommit=False,
    autoflush=False,
)
**FastAPI Dependency:**
from typing import AsyncGenerator

async def get_db() -> AsyncGenerator[AsyncSession, None]:
    """Get database session"""
    async with AsyncSessionLocal() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise
**Questions:**
- **Connection Pool Settings:** Are my SQLAlchemy pool settings appropriate for handling 100 concurrent requests? What would be optimal?
- **Aurora Configuration:** What Aurora PostgreSQL parameters should I tune to improve concurrent query performance?
- **Query Optimization:** Is there a standard approach to optimize complex queries with JOINs and EXISTS subqueries for better concurrency?
- **ORM vs Raw SQL:** Would bypassing SQLAlchemy ORM help performance?

Any guidance or best practices would be greatly appreciated. I'd be happy to provide additional details if needed.

**Update:**

**Hardware Configuration**
1. Aurora regional cluster with 1 instance
2. Capacity Type: Provisioned (Min: 0.5 ACUs (1 GiB), Max: 16 ACUs (32 GiB))
3. Storage Config: Standard

**Performance Insights**
1. Max ACU utilization: 70%
2. Max CPU Utilization: 45%
3. Max DB connections: 111
4. EBS IO Balance: 100%
5. Buffer Cache Hit Ratio: 100%
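A small diagnostic sketch (not part of the original question) that can be run against Aurora PostgreSQL while the load test is in flight, to see whether the extra latency shows up as lock, IO, or CPU-related waits; no application or database names are assumed beyond the current connection:

SELECT wait_event_type, wait_event, state, count(*) AS sessions
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY 1, 2, 3
ORDER BY sessions DESC;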
Abhishek Tyagi (11 rep)
May 20, 2025, 07:18 PM • Last activity: May 21, 2025, 02:50 PM
0 votes
1 answer
297 views
Should I add my index after the table is populated or before? (MySQL, Aurora)
I know this has been asked in the past, but sometimes things change over time and I wanted to double check. I have a table with about 9 billion rows. Should I add the index before inserting the data or after? I'm using Aurora. Does it matter if I'm adding more than one index? Everything I'm aware of says you should do this after the insert, but one of my colleagues is insisting it's faster to do it on the insert.
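A sketch of the "index after load" variant (not from the original post; table and column names are hypothetical). Listing several secondary indexes in one ALTER typically lets MySQL read the table once and build them together:

ALTER TABLE big_table
  ADD INDEX idx_col_a (col_a),
  ADD INDEX idx_col_b (col_b);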
John (9 rep)
Oct 7, 2019, 05:13 PM • Last activity: May 10, 2025, 02:04 PM
0 votes
3 answers
773 views
Create user on Master does not create it on Slave
We have MySQL replication set up between two Aurora clusters. Everything looks good and the replication is working well. However, a few hours ago I created a user on the master and I still don't see it on the slave. I see that Aurora excludes the mysql schema, which might be the issue, but I don't know if I should create the user myself or what.

MySQL [(none)]> show replica status\G
*************************** 1. row ***************************
Replica_IO_State: Waiting for master to send event
Source_Host: .rds.amazonaws.com
Source_User: user
Source_Port: port
Connect_Retry: 60
Source_Log_File: mysql-bin-changelog.111
Read_Source_Log_Pos: 133216906
Relay_Log_File: relaylog.000548
Relay_Log_Pos: 133217133
Relay_Source_Log_File: mysql-bin-changelog.111
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Source_Log_Pos: 133216906
Relay_Log_Space: 133217380
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Source_SSL_Allowed: No
Source_SSL_CA_File:
Source_SSL_CA_Path:
Source_SSL_Cert:
Source_SSL_Cipher:
Source_SSL_Key:
Seconds_Behind_Source: 0
Source_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Source_Server_Id: 58182133
Source_UUID: cdc9f787-fa44-376f-bceb-a16cd8c88c38
Source_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Replica_SQL_Running_State: Slave has read all relay log; waiting for more updates
Source_Retry_Count: 86400
Source_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Source_SSL_Crl:
Source_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Source_TLS_Version:
Source_public_key_path:
Get_Source_public_key: 0
Network_Namespace:

on slave:

MySQL [(none)]> select user from mysql.user where user='dms_u';
Empty set (0.00 sec)

on master:

MySQL [(none)]> select user from mysql.user where user='dms_u';
+-------+
| user  |
+-------+
| dms_u |
+-------+
1 row in set (0.00 sec)
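A sketch (not from the original post) of the follow-up this output suggests: the Replicate_Wild_Ignore_Table: mysql.% filter means the grant tables are skipped by replication, so the account would have to be created on the replica cluster explicitly; the host pattern, password, and grants below are placeholders:

CREATE USER 'dms_u'@'%' IDENTIFIED BY '...';
-- plus whatever privileges the account holds on the master, for example:
-- GRANT SELECT ON mydb.* TO 'dms_u'@'%';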
Nir (529 rep)
Aug 30, 2022, 01:01 PM • Last activity: Apr 12, 2025, 05:03 AM
34 votes
3 answers
45961 views
Why do I get a PostgreSQL permission error when specifying a tablespace in the "create database" command?
When I create a database in PostgreSQL without explicitly specifying a default tablespace the database is created without issue (I'm logged in as the **pgsys** user):
postgres=> create database rich1;
CREATE DATABASE
postgres=> \l+
                                                                            List of databases
   Name    |  Owner   | Encoding |  Collation  |    Ctype    |          Access privileges          |   Size    | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-------------------------------------+-----------+------------+--------------------------------------------
 postgres  | pgsys    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                                     | 7455 kB   | pg_default | default administrative connection database
 rdsadmin  | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin               | No Access | pg_default |
 rich1     | pgsys    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                                     | 7233 kB   | pg_default |
 template0 | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin                         | 7345 kB   | pg_default | unmodifiable empty database
                                                             : rdsadmin=CTc/rdsadmin
 template1 | pgsys    | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/pgsys                            | 7345 kB   | pg_default | default template for new databases
                                                             : pgsys=CTc/pgsys
(5 rows)
As you can see, the database is put into the pg_default tablespace, but if I specify the default tablespace in the tablespace clause (also still logged in as **pgsys**) I get a permission error:
postgres=> create database rich2 tablespace pg_default;
ERROR:  permission denied for tablespace pg_default
Here are the permissions for that user:
postgres=> \du pgsys
               List of roles
 Role name | Attributes  |    Member of
-----------+-------------+-----------------
 pgsys     | Create role | {rds_superuser}
           : Create DB
This is a PostgreSQL error, but I should mention that this is an AWS Aurora instance, in case that makes a difference.
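For reference, this is the privilege that the explicit TABLESPACE clause checks for; whether RDS/Aurora actually lets you grant it on pg_default is a separate question, so treat this as a sketch rather than a confirmed fix:

GRANT CREATE ON TABLESPACE pg_default TO pgsys;
CREATE DATABASE rich2 TABLESPACE pg_default;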
HuggieRich (441 rep)
Apr 24, 2018, 03:46 PM • Last activity: Mar 30, 2025, 03:20 AM
0 votes
1 answer
1159 views
Cannot connect to Aurora RDS using workbench over ssh
I set up an environment on AWS that consists of a VPC with a private and a public subnet, following the instructions for the Magento Quick Start: http://docs.aws.amazon.com/quickstart/latest/magento/architecture.html The private subnet includes: - an EC2 instance (web server) - an Aurora RDS that connects to the web server. The public subnet includes: - a bastion instance for ssh connections to the EC2 instance in the private subnet. Generally, I can reach the Aurora RDS via ssh by connecting to the bastion host, then connecting to the private EC2 instance, and then to the RDS, however this makes it hard to perform simple tasks on the RDS. Is it possible to connect to the RDS using MySQL Workbench over ssh through the bastion? I've tried everything but nothing seems to work. Thanks.
Dimitris (11 rep)
Dec 5, 2017, 03:33 PM • Last activity: Mar 2, 2025, 07:03 PM
0 votes
1 answer
1452 views
AWS Aurora Mysql different instance types for writer/reader
I have a db.r5.large for the master/writer. I want to add some readers. Do they need to be the same instance type as the master? Or can I add, for example, 3x db.t3.medium?
AFRC (101 rep)
Nov 27, 2020, 11:59 PM • Last activity: Jan 16, 2025, 11:04 PM
0 votes
1 answer
40 views
Aurora3 RDS MySQL Stored function/trigger performance when using information_schema
Since moving to Aurora 3 RDS (MySQL 8 based) from Aurora 2 (MySQL 5.7 based), we noticed that the execution time of certain triggers and stored functions has degraded. The common factor between the affected triggers and functions was a select on information_schema.tables, something like: SELECT `AUTO_INCREMENT` FROM information_schema.tables WHERE table_schema = DATABASE() AND table_name LIKE tablename LIMIT 1. The RDS stack has 50+ schemas with close to 2000+ tables in each. The query can take > 25s (unloaded system, no other users/connections). The same query run outside of a stored function/trigger takes < 0.1s (the same as it did in 5.7 inside a stored function/trigger). It performs best on a single-database stack. We can also get better performance by changing the definer of the function/trigger to be a schema-based user. We were unable to reproduce this locally on Community Edition MySQL 8.
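One MySQL-8-specific angle worth ruling out (an assumption, not a confirmed diagnosis): in 8.0, information_schema columns such as AUTO_INCREMENT are served from cached statistics, controlled by information_schema_stats_expiry. A sketch, with a hypothetical table name:

SELECT @@information_schema_stats_expiry;
SET SESSION information_schema_stats_expiry = 0;  -- 0 = always read current values
SELECT `AUTO_INCREMENT`
FROM information_schema.tables
WHERE table_schema = DATABASE() AND table_name = 'sometable';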
BigKiwiDev (1 rep)
Feb 9, 2024, 02:17 AM • Last activity: Dec 31, 2024, 07:31 AM
0 votes
0 answers
49 views
Getting OOM when trying to change id from int to bigint in MySQL w/ percona
We have a 2.5 TB table with an id column of type int in Aurora 3 (MySQL 8.0.26). We need to change it to bigint as ids will run out soon. We tested it with pt-online-schema-change and the server crashed because of memory. I was a bit surprised to see it crashing on memory (it's not our first time running percona) because the work is done in chunks. Moreover, it crashed in dev, where the activity on that table is low. This table is insert intensive, with some updates and deletes. Approaches that will not work / are not an option: 1. Recreating the table on the side and copying deltas. 2. CTAS. 3. Add column, copy ids, downtime, rename columns. Would love to hear another approach or how to investigate/fix the percona OOM issue.

Table DDL:

CREATE TABLE tab (
  id int unsigned NOT NULL AUTO_INCREMENT,
  a varchar(50) DEFAULT '',
  b int DEFAULT NULL,
  c varchar(255) DEFAULT NULL,
  d varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  e timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  f tinyint NOT NULL DEFAULT 0,
  g int DEFAULT NULL,
  h text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  PRIMARY KEY (id),
  KEY a (a,c),
  KEY b (b),
  KEY c_2 (c,a),
  KEY g (g,c,e),
  KEY e (e),
  KEY g_2 (g,d(191)),
  KEY ix_c_h (c,h(255)),
  KEY d (d(20),c,a),
  KEY ix_c_composite (c,f,e,g,a,id),
  KEY ix_c_e (c,e)
) charset=latin1

Percona command:

./pt-online-schema-change --host $host D=db,t=tab \
  --alter "CHANGE COLUMN id id bigint unsigned NOT NULL AUTO_INCREMENT" \
  --execute --recursion-method=none
Nir (529 rep)
Feb 25, 2024, 03:00 PM • Last activity: Feb 26, 2024, 07:26 AM
1 vote
0 answers
144 views
Force index not working in Aurora MySQL 8
I have recently upgraded to Aurora MySQL engine version 8.0.mysql_aurora.3.05.2 from Aurora MySQL 5.7, and I see a huge performance regression after upgrading. Recently I found queries whose indexes are being ignored, resulting in full table scans. My orders table has 20 million rows. Here is the query:

select * from orders FORCE INDEX (PRIMARY) where orders.id in (/* almost 64000 comma separated ids which are primary keys */)

When I run EXPLAIN on this, it says it is doing a full table scan with type = ALL. show create table orders shows this:

CREATE TABLE orders (
  id bigint unsigned NOT NULL AUTO_INCREMENT,
  ... other columns,
  PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=24105494 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Based on a blog post I set optimizer_switch to index_condition_pushdown=on in the parameter group, but it doesn't help. How can I force the index so that the query is fast?
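One common reason a very long IN() list on the primary key degrades to a scan is the range-optimizer memory cap rather than the index itself; a sketch to test that hypothesis (session scope only, the id list is abbreviated), not a confirmed diagnosis for this case:

SELECT @@range_optimizer_max_mem_size;
SET SESSION range_optimizer_max_mem_size = 0;  -- 0 = unlimited for this session
EXPLAIN SELECT * FROM orders WHERE id IN (1, 2, 3 /* ... */);
SHOW WARNINGS;  -- look for a note about range_optimizer_max_mem_size being exceeded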
Ali Mehdi (11 rep)
Feb 14, 2024, 01:53 PM • Last activity: Feb 14, 2024, 02:02 PM
0 votes
1 answer
435 views
Does Aurora Serverless support Babelfish?
If running Amazon Aurora Serverless on PostgreSQL, is it possible to use the Babelfish for PostgreSQL feature?
SilentSteel (113 rep)
Apr 25, 2022, 08:49 PM • Last activity: Dec 29, 2023, 04:01 PM
4 votes
2 answers
6539 views
Are the MariaDB and MySQL drivers interchangeable when connecting to a MySQL backend Aurora cluster?
I saw a similar question asked before, but it didn't give enough supporting information. Does anyone know any new information on this? Can the MariaDB connector (driver) be used instead of the MySQL connector (driver) when connecting to an AWS Aurora cluster MySQL instance?
Hariharan Thavachelvam (41 rep)
Nov 30, 2018, 05:08 PM • Last activity: Mar 2, 2023, 07:37 PM
3 votes
2 answers
4450 views
What's the advantage of sharding in AWS Aurora?
[AWS' RDS docs on multi-master Aurora](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html#:~:text=In%20an%20Aurora%20multi%2Dmaster%20cluster%2C%20each%20shard%20is%20managed%20by%20a%20specific%20DB%20instance%2C%20and%20a%20DB%20instance%20can%20be%20responsible%20for%20multiple%20shards.) state the following:

> In an Aurora multi-master cluster, each shard is managed by a specific DB instance, and **a DB instance can be responsible for multiple shards.**

[Later in the same document](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html#:~:text=You%20can%20avoid%20resharding%20operations%20because%20all%20DB%20instances%20in%20a%20cluster%20can%20access%20all%20databases%20and%20tables%20through%20the%20shared%20storage%20volume.), we read:

> You can avoid resharding operations because **all DB instances in a cluster can access all databases and tables through the shared storage volume.**

So, a multi-master Aurora instance can be responsible for multiple shards. This is made possible in part because all DB instances access the same shared storage volume.

**If multi-master Aurora instances can manage multiple shards, what's the advantage of using sharding at all? Why not just configure all instances to manage *all* shards?** (Essentially obviating the need for sharding)

# A theory

My suspicion is that not using sharding would lead to more deadlocks within Aurora's internals if different masters write to the same page at the same time. This, in turn, would lead to increased latency while Aurora retries the contentious write. Or, perhaps Aurora would generate an error and the application itself would have to retry the query. (I'm clearly not familiar enough with Aurora to know what happens, hence this question 😁)
rinogo (447 rep)
Oct 12, 2022, 05:12 PM • Last activity: Oct 12, 2022, 09:48 PM
0 votes
0 answers
502 views
Fastest way to migrate largish (15 TB) PostgreSQL 11 instance to Aurora 14 in AWS
We have an AWS RDS PostgreSQL 11 cluster (currently version 11.13), in which we have a few databases and some 1500 tables. The RDS cluster is of size "db.r5.xlarge", with 4 vCPUs, 32 GB RAM and general purpose SSD (gp2) storage. Currently the cluster has about 15 TB of data. We now need to upgrade to the latest database server version, and are planning to take this opportunity to switch from a pure PostgreSQL cluster to an Aurora PostgreSQL instance, in the hope that this will be more efficient. According to AWS documentation, there are four options that can be used:

1. Migrating an RDS for PostgreSQL DB instance using a snapshot
2. Migrating an RDS for PostgreSQL DB instance using an Aurora read replica
3. Importing S3 data into Aurora PostgreSQL
4. Migrating from a database that is not PostgreSQL-compatible

Option 4 does not apply to our case, as our existing database server is PostgreSQL. Option 3 is tedious and time-consuming, as tables need to be imported one at a time, with several steps required per table. The only feasible remaining options for us thus are 1 and 2. As I understand it, the migration process will involve the following steps:

1. Create a new Aurora PostgreSQL instance with version 11.13 (the target Aurora PostgreSQL server must have the same major and the same or higher minor version as the source PostgreSQL server).
2. Import data using either option 1 or 2 above.
3. Upgrade the Aurora PostgreSQL instance from version 11.13 to 14.3.

My questions now are:

1. Is my understanding of all of the above correct?
2. Which of the options for importing data, i.e. either using a snapshot or using an Aurora read replica, is likely to be faster given the size of our database?
Robert (125 rep)
Aug 24, 2022, 08:04 AM • Last activity: Aug 24, 2022, 10:29 AM
8 votes
1 answer
6959 views
Why is Insert-Ignore so expensive in MySQL?
**TL;DR:** Inserting a large number of rows with no collisions is MUCH faster than (not) re-inserting the same rows, **BOTH** using the INSERT IGNORE syntax. Why is this? I would assume that the index lookup cost would be the same for an insert and an "ignored" insert, given that MySQL does not know if the incoming data has repeated/conflicting data (and therefore needs to be ignored)...thus, indexing occurs in both the initial insert and the ignored insert runs. Furthermore, I would assume that an "ignored" row should be CHEAPER in that it does not require any disk writes. But this is most definitely **NOT** the case.

**Long Version:** In this question, we use AWS's Aurora/MySQL and the LOAD DATA FROM S3 FILE syntax to remove any transfer or performance variables. We load a 4-megarow CSV file corresponding to the schema below, and load it in twice, both times with LOAD ... IGNORE. Please note that the issue also occurred with standard INSERT ... IGNORE using batched row inserts. The use of LOAD ... IGNORE here is to steer the discussion towards the counter-intuitive nature of the measured results, and not "how to perform a large number of ignored inserts". That is not the issue here, as a domain-specific method was worked out.

In the model being tested, there is a three-tier index: the first two tiers are enumerable categorization columns with very low cardinality, and the third is a column that is essentially "actual" data. For ease of qualifying this question, I am just sticking to my current setup. Assume the following trivial schema:

our_table:
  id: basic auto-increment bigint, primary key
  constkey1: varchar(50) -- this is a constant in the insert
  constkey2: varchar(50) -- this is a constant in the insert
  datakey: varchar(50)   -- this is pulled in from the CSV file
  datafield: varchar(50) -- this is pulled in from the CSV file
  consttime: datetime    -- this is a constant in the insert
  Unique Index: (constkey1, constkey2, datakey)

And the following query, run twice:

LOAD DATA FROM S3 FILE 's3://...' IGNORE
  INTO TABLE our_table
  COLUMNS TERMINATED BY ','
  (datakey, datafield)
  SET constkey1 = 'some string',
      constkey2 = 'some other string',
      consttime = CURRENT_TIMESTAMP -- so that I can verify changes

Also assume that our_table already contains several million rows, NONE of which have **EITHER** the same constkey1 **OR** constkey2 as that given in the query.

- The first time I run the query above, the 4 megarows take 3 to 5 minutes to insert (presumably based on the vagaries of the current database environment at the time of the run), and all 4 megarows are properly inserted. This is an acceptable baseline.
- The SECOND time I run the query, the 4 megarows take an insane amount of time to (not) insert, and as expected, all 4 megarows are properly **NOT** updated. The time is long enough to not even be worth qualifying here (one run took nearly an hour). This test was run a number of times, always yielding the same result.

**UPDATE:** While modifying this question, the following question was suggested by S.E.: mysql insert into indexed table taking long time after few million records

While I do not think it is completely generic here, perhaps it has something to do with the memory size of the index range? In the example here, the "sub index" range of ('some string', 'some other string', *) held ZERO entries in run one, but was full with 4 megaentries in run 2. Perhaps that is an issue?

Please also note that the eventual solution was of the form given in the question below.
This is immaterial to the question at hand, as I wish to know why the original solution did not work. This is merely given for those curious. Can I identify duplicates in a large MySQL insert?
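Not part of the original question: one way to compare the first and second runs more precisely is the per-session handler counters, reset immediately before each LOAD:

FLUSH STATUS;
-- run the LOAD DATA FROM S3 FILE ... IGNORE statement here
SHOW SESSION STATUS LIKE 'Handler_%';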
Mark Gerolimatos (247 rep)
Jun 29, 2018, 10:55 PM • Last activity: Mar 25, 2022, 07:51 PM
1 vote
1 answer
1370 views
How to tell if I'm connected to an Amazon Aurora PostgreSQL cluster?
I've created a few different instances; one instance is Amazon Aurora PostgreSQL and the other instances are PostgreSQL, MySQL, etc. How do I find out whether a Postgres instance is running within an AWS Aurora cluster or not, from the CLI? When I say CLI I mean the terminal, not the AWS CLI.
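A sketch of a terminal-side check: aurora_version() exists on Aurora PostgreSQL but not on community PostgreSQL, so an "undefined function" error from the second statement effectively answers the question:

SELECT version();
SELECT aurora_version();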
I've created few different instances, one instance is of Amazon Aurora PostgreSQL and other instances are of PostgreSQL, MySQL etc. How do I find our the Postgres instance is running within an AWS Aurora cluster or not, from CLI? When I say from CLI this means terminal not AWS CLI.
Dharmik Gadhiya (29 rep)
Mar 12, 2018, 06:55 AM • Last activity: Dec 8, 2021, 12:03 PM
3 votes
1 answer
4048 views
Aurora MySQL 5.7 randomly fails
This is the 5th time. It happens once a week (Tuesday or Wednesday within 03:00-07:00 UTC+0). On the console the instance shows as available, but it is inaccessible. We waited to see if the instance would recover itself; after ~30 min nothing happened, so I rebooted it manually, and it came online again after rebooting (~5 min). It would be helpful to know what actually went wrong. This is only a dev server with few users and records.

Engine: Aurora MySQL 5.7.12
DB instance class: db.t2.small
Backup time: 16:00-16:30 UTC+0
Maintenance time: sun:17:00-sun:17:30 UTC+0

Below is the only list of available logs after rebooting the instance:

error/mysql-error-running.log.2018-07-24.03 Tue Jul 24 11:14:06 GMT+800 2018 11.8 kB
error/mysql-error-running.log.2018-07-24.04 Tue Jul 24 11:30:00 GMT+800 2018 285.5 kB
error/mysql-error-running.log.2018-07-24.05 Tue Jul 24 12:30:00 GMT+800 2018 31.1 kB
error/mysql-error-running.log.2018-07-24.06 Tue Jul 24 13:30:00 GMT+800 2018 31.8 kB
error/mysql-error-running.log.2018-07-24.07 Tue Jul 24 14:30:00 GMT+800 2018 32.9 kB
error/mysql-error-running.log.2018-07-24.08 Tue Jul 24 15:30:00 GMT+800 2018 29 kB
error/mysql-error-running.log.2018-07-24.09 Tue Jul 24 16:30:00 GMT+800 2018 32.1 kB
error/mysql-error-running.log.2018-07-24.10 Tue Jul 24 17:30:00 GMT+800 2018 27.5 kB
error/mysql-error-running.log.2018-07-24.11 Tue Jul 24 18:30:00 GMT+800 2018 31.7 kB
error/mysql-error-running.log.2018-07-24.12 Tue Jul 24 19:30:00 GMT+800 2018 27.1 kB
error/mysql-error-running.log.2018-07-24.13 Tue Jul 24 20:30:00 GMT+800 2018 22.4 kB
error/mysql-error-running.log.2018-07-24.14 Tue Jul 24 21:30:00 GMT+800 2018 22.8 kB
error/mysql-error-running.log.2018-07-24.15 Tue Jul 24 22:30:00 GMT+800 2018 24.7 kB
error/mysql-error-running.log.2018-07-24.16 Tue Jul 24 23:30:00 GMT+800 2018 24.7 kB
error/mysql-error.log Wed Jul 25 00:34:45 GMT+800 2018 2.6 kB
external/mysql-external.log Wed Jul 25 00:30:00 GMT+800 2018 7.6 kB

*external/mysql-external.log*:

/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
----------------------- END OF LOG ----------------------

*error/mysql-error-running.log.2018-07-24.03* shows: https://pastebin.com/ywmXLR5g
*error/mysql-error-running.log.2018-07-24.04* shows: https://pastebin.com/g1dkR6rj
*error/mysql-error-running.log.2018-07-24.18* shows: https://pastebin.com/g0aAXfaT

All other logs show nothing (see screenshot).

**Event Logs**

July 24, 2018 at 11:14:14 AM UTC+8 - DB instance restarted
July 24, 2018 at 11:13:31 AM UTC+8 - Error restarting mysql: Engine bootstrap failed with no mysqld process running...
July 24, 2018 at 11:12:01 AM UTC+8 - Recovery of the DB instance is complete.
July 24, 2018 at 11:04:26 AM UTC+8 - Recovery of the DB instance has started. Recovery time will vary with the amount of data to be recovered.

(Screenshots: CPU Utilization on 07-24-2018, and CPU Utilization from 07-11-2018 to 07-24-2018.)
John Paulo Rodriguez (151 rep)
Jul 24, 2018, 03:42 AM • Last activity: Oct 28, 2021, 01:45 AM