
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 vote
0 answers
26 views
Postgres Aurora index creation using too many resources
Is there a way to slow down index creation or tune it somehow to use fewer resources? A bit of context: we have a pretty big database with 200M rows in the root table, and some tables probably have billions of rows due to one-to-many relations. Of course, we create indexes with CONCURRENTLY, and recently, without any config changes, index creation behaviour changed a lot. For example, a month ago it took ~12 hours to create an index, but the last 3 indexes (they were not created together; a few days passed between each) were created in 2-5 minutes. Which is great, but during that time CPU spiked to 80% (it is usually around 15-20%) and shared memory went from 2-3 GB to almost 200 GB. The number of maintenance workers is 2, and worker memory is 4 GB. Is there a way to make index creation slower but less resource-hungry?
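A minimal sketch (not from the original post) of the session-level settings that usually bound index-build resource use in PostgreSQL; the table, column, and index names are hypothetical:

SET maintenance_work_mem = '1GB';            -- caps sort memory for this build
SET max_parallel_maintenance_workers = 0;    -- no parallel workers for this session
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);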
Bogdan Dubyk (111 rep)
Jul 21, 2025, 10:22 AM • Last activity: Jul 21, 2025, 11:06 AM
0 votes
1 answer
898 views
How do I truncate performance_schema tables on a read replica?
On occasion you have to truncate events_statements_summary_by_digest when it hits the row limit. On a write master this is easy, but on a read replica there doesn't seem to be a way to do it: the read-only enforcement extends to the performance_schema tables. Since the data are created locally, it seems to me there should be a way to exempt those tables (or even the entire performance_schema DB) from the read-only enforcement. I have already set performance_schema_events_statements_history_long_size to 10000, which is getting pretty large. I could set it larger (well, if it weren't read-only), but the size of that table affects overall system performance, and I'd rather not just crank the limit to something obscene - especially when so many of the entries in the history haven't been used in a month or two. It's better to toss out the garbage.
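A sketch of the statements involved, assuming the goal is just to clear that one summary table; on a replica, the variable check shows whether read_only/super_read_only is what blocks the TRUNCATE:

SHOW VARIABLES LIKE '%read_only%';
TRUNCATE TABLE performance_schema.events_statements_summary_by_digest;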
Sniggerfardimungus (101 rep)
Apr 18, 2018, 08:47 PM • Last activity: Jun 28, 2025, 04:03 PM
0 votes
1 answer
1084 views
What is the difference between Aurora MySQL 5.7.x and Aurora 2.x.x?
I have an RDS instance with Aurora MySQL 5.7.12 and I noticed that there are other versions of Aurora MySQL 5.7, but I don't understand the difference between them. Some of the versions listed are: - Aurora (MySQL)-5.7.12 <- The one that I have. - Aurora (MySQL 5.7)-2.03.2 - Aurora (MySQL 5.7)-2.03.3 - Aurora (MySQL 5.7)-2.03.4 - Aurora (MySQL 5.7)-2.04.1 - Etc... (screenshot of the AWS Console showing the different versions of Aurora MySQL 5.7). I want to choose the right version (the one that has the most bug fixes) in order to have better stability in my system.
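A quick way to see what a running instance actually reports (an assumption worth verifying on your engine: AURORA_VERSION() is an Aurora-MySQL-specific function, not standard MySQL):

SELECT VERSION();          -- the MySQL compatibility version, e.g. 5.7.12
SELECT AURORA_VERSION();   -- the Aurora engine version, e.g. 2.03.x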
Roberto Briones Argüelles (1 rep)
Apr 22, 2020, 12:12 AM • Last activity: Jun 4, 2025, 01:04 AM
0 votes
1 answer
270 views
Amazon Aurora MySQL conditional comment queries
How does Amazon Aurora handle conditional queries based on MySQL version? I was unable to find any documentation on this. For example, the code below inserts into a database for MySQL 5.6.4 or newer. What is the result in Aurora? I could spin up an instance, but I wanted to read the documentation about this to understand what incompatible features there are. /*!50604 REPLACE INTO phppos_app_config (key, value) VALUES ('supports_full_text', '1')*/;
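Not from the original post: a minimal way to see how a version-conditioned comment behaves on a given instance; the literal string is hypothetical, and what matters is the MySQL version the Aurora instance reports, which the /*!NNNNN */ check is compared against:

SELECT VERSION();
/*!50604 SELECT 'executed only when the server reports 5.6.4 or newer' */;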
Chris Muench (711 rep)
Jan 13, 2018, 06:27 PM • Last activity: May 23, 2025, 03:04 PM
1 vote
0 answers
53 views
Aurora PostgreSQL Severe Performance Degradation Under Concurrent Load
**Environment:**
- Database: AWS Aurora PostgreSQL
- ORM: SQLAlchemy
- API Framework: Python FastAPI

**Issue:**
I'm experiencing significant query performance degradation when my API receives concurrent requests. I ran a performance test comparing single execution vs. concurrent execution of the same query, and the results are concerning.

**Real-World Observations:**
When monitoring our production API endpoint during load tests with 100 concurrent users, I've observed concerning behavior:
- When running the same complex query through PGAdmin without concurrent load, it consistently completes in ~60ms
- However, during periods of high concurrency (100 simultaneous users), response times for this same query become wildly inconsistent:
  - Some executions still complete in 60-100ms
  - Others suddenly take up to 2 seconds
  - No clear pattern to which queries are slow

**Test Results:**
- Single query execution time: 0.3098 seconds
- Simulating 100 concurrent clients - all requests starting simultaneously...
- Results Summary:
  - Total execution time: 32.7863 seconds
  - Successful queries: 100 out of 100
  - Failed queries: 0
  - Average query time: 0.5591 seconds (559ms)
  - Min time: 0.2756s, Max time: 1.9853s
  - Queries exceeding 500ms threshold: 21 (21.0%)
  - 50th percentile (median): 0.3114s (311ms)
  - 95th percentile: 1.7712s (1771ms)
  - 99th percentile: 1.9853s (1985ms)
- With 100 concurrent threads:
  - Each query takes ~12.4x longer on average (3.62s vs 0.29s)
  - Huge variance between fastest (0.5s) and slowest (4.8s) query
  - Overall throughput is ~17.2 queries/second (better than sequential, but still concerning)

**Query Details:**
The query is moderately complex, involving: several JOINs across multiple tables, a subquery using EXISTS, ORDER BY and LIMIT clauses.

**My Setup**

**SQLAlchemy Configuration:**
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession

engine = create_async_engine(
    settings.ASYNC_DATABASE_URL,
    echo=settings.SQL_DEBUG,
    pool_pre_ping=True,
    pool_use_lifo=True,
    pool_size=20,
    max_overflow=100,
    pool_timeout=30,
    pool_recycle=30,
)

AsyncSessionLocal = async_sessionmaker(
    bind=engine,
    class_=AsyncSession,
    expire_on_commit=False,
    autocommit=False,
    autoflush=False,
)
**FastAPI Dependency:**
from typing import AsyncGenerator

async def get_db() -> AsyncGenerator[AsyncSession, None]:
    """Get database session"""
    async with AsyncSessionLocal() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise
**Questions:**
- **Connection Pool Settings:** Are my SQLAlchemy pool settings appropriate for handling 100 concurrent requests? What would be optimal?
- **Aurora Configuration:** What Aurora PostgreSQL parameters should I tune to improve concurrent query performance?
- **Query Optimization:** Is there a standard approach to optimize complex queries with JOINs and EXISTS subqueries for better concurrency?
- **ORM vs Raw SQL:** Would bypassing SQLAlchemy ORM help performance?

Any guidance or best practices would be greatly appreciated. I'd be happy to provide additional details if needed.

**Update:**

**Hardware Configuration**
1. Aurora regional cluster with 1 instance
2. Capacity Type: Provisioned (Min: 0.5 ACUs (1 GiB), Max: 16 ACUs (32 GiB))
3. Storage Config: Standard

**Performance Insights**
1. Max ACU utilization: 70%
2. Max CPU Utilization: 45%
3. Max DB connections: 111
4. EBS IO Balance: 100%
5. Buffer Cache Hit Ratio: 100%
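A small diagnostic sketch (not part of the original question) that can be run against Aurora PostgreSQL while the load test is in flight, to see whether the extra latency shows up as lock, IO, or CPU-related waits; no application or database names are assumed beyond the current connection:

SELECT wait_event_type, wait_event, state, count(*) AS sessions
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY 1, 2, 3
ORDER BY sessions DESC;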
Abhishek Tyagi (11 rep)
May 20, 2025, 07:18 PM • Last activity: May 21, 2025, 02:50 PM
0 votes
1 answer
297 views
Should I add my index after the table is populated or before? (MySQL, Aurora)
I know this has been asked in the past, but sometimes things change over time and I wanted to double check. I have a table with about 9 billion rows. Should I add the index before inserting the data or after? I'm using Aurora. Does it matter if I'm adding more than one index? Everything I'm aware of says you should do this after the insert, but one of my colleagues is insisting it's faster to do it on the insert.
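A sketch of the "index after load" variant (not from the original post; table and column names are hypothetical). Listing several secondary indexes in one ALTER typically lets MySQL read the table once and build them together:

ALTER TABLE big_table
  ADD INDEX idx_col_a (col_a),
  ADD INDEX idx_col_b (col_b);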
John (9 rep)
Oct 7, 2019, 05:13 PM • Last activity: May 10, 2025, 02:04 PM
0 votes
3 answers
773 views
Create user on Master does not create it on Slave
We have MySQL replication set up between two Aurora clusters. Everything looks good and the replication is working well. However, a few hours ago I created a user on the master and I still don't see it on the slave. I see that Aurora excludes the mysql schema, which might be the issue, but I don't know if I should create the user myself or what.

MySQL [(none)]> show replica status\G
*************************** 1. row ***************************
Replica_IO_State: Waiting for master to send event
Source_Host: .rds.amazonaws.com
Source_User: user
Source_Port: port
Connect_Retry: 60
Source_Log_File: mysql-bin-changelog.111
Read_Source_Log_Pos: 133216906
Relay_Log_File: relaylog.000548
Relay_Log_Pos: 133217133
Relay_Source_Log_File: mysql-bin-changelog.111
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Source_Log_Pos: 133216906
Relay_Log_Space: 133217380
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Source_SSL_Allowed: No
Source_SSL_CA_File:
Source_SSL_CA_Path:
Source_SSL_Cert:
Source_SSL_Cipher:
Source_SSL_Key:
Seconds_Behind_Source: 0
Source_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Source_Server_Id: 58182133
Source_UUID: cdc9f787-fa44-376f-bceb-a16cd8c88c38
Source_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Replica_SQL_Running_State: Slave has read all relay log; waiting for more updates
Source_Retry_Count: 86400
Source_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Source_SSL_Crl:
Source_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Source_TLS_Version:
Source_public_key_path:
Get_Source_public_key: 0
Network_Namespace:

on slave:

MySQL [(none)]> select user from mysql.user where user='dms_u';
Empty set (0.00 sec)

on master:

MySQL [(none)]> select user from mysql.user where user='dms_u';
+-------+
| user  |
+-------+
| dms_u |
+-------+
1 row in set (0.00 sec)
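A sketch (not from the original post) of the follow-up this output suggests: the Replicate_Wild_Ignore_Table: mysql.% filter means the grant tables are skipped by replication, so the account would have to be created on the replica cluster explicitly; the host pattern, password, and grants below are placeholders:

CREATE USER 'dms_u'@'%' IDENTIFIED BY '...';
-- plus whatever privileges the account holds on the master, for example:
-- GRANT SELECT ON mydb.* TO 'dms_u'@'%';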
Nir (529 rep)
Aug 30, 2022, 01:01 PM • Last activity: Apr 12, 2025, 05:03 AM
34 votes
3 answers
45961 views
Why do I get a PostgreSQL permission error when specifying a tablespace in the "create database" command?
When I create a database in PostgreSQL without explicitly specifying a default tablespace the database is created without issue (I'm logged in as the **pgsys** user):
postgres=> create database rich1;
CREATE DATABASE
postgres=> \l+
                                                                            List of databases
   Name    |  Owner   | Encoding |  Collation  |    Ctype    |          Access privileges          |   Size    | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-------------------------------------+-----------+------------+--------------------------------------------
 postgres  | pgsys    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                                     | 7455 kB   | pg_default | default administrative connection database
 rdsadmin  | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin               | No Access | pg_default |
 rich1     | pgsys    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                                     | 7233 kB   | pg_default |
 template0 | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin                         | 7345 kB   | pg_default | unmodifiable empty database
                                                             : rdsadmin=CTc/rdsadmin
 template1 | pgsys    | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/pgsys                            | 7345 kB   | pg_default | default template for new databases
                                                             : pgsys=CTc/pgsys
(5 rows)
As you can see, the database is put into the pg_default tablespace, but if I specify the default tablespace in the tablespace clause (also still logged in as **pgsys**) I get a permission error:
postgres=> create database rich2 tablespace pg_default;
ERROR:  permission denied for tablespace pg_default
Here are the permissions for that user:
postgres=> \du pgsys
               List of roles
 Role name | Attributes  |    Member of
-----------+-------------+-----------------
 pgsys     | Create role | {rds_superuser}
           : Create DB
This is a PostgreSQL error, but I should mention that this is an AWS Aurora instance, in case that makes a difference.
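For reference, this is the privilege that the explicit TABLESPACE clause checks for; whether RDS/Aurora actually lets you grant it on pg_default is a separate question, so treat this as a sketch rather than a confirmed fix:

GRANT CREATE ON TABLESPACE pg_default TO pgsys;
CREATE DATABASE rich2 TABLESPACE pg_default;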
HuggieRich (441 rep)
Apr 24, 2018, 03:46 PM • Last activity: Mar 30, 2025, 03:20 AM
0 votes
1 answer
1159 views
Cannot connect to Aurora RDS using workbench over ssh
I set up an environment on AWS that consists of a VPC with a private and a public subnet, following the instructions for the Magento Quick Start: http://docs.aws.amazon.com/quickstart/latest/magento/architecture.html The private subnet includes: - an EC2 instance (web server) - an Aurora RDS that connects to the web server. The public subnet includes: - a bastion instance for ssh connections to the EC2 instance in the private subnet. Generally, I can reach the Aurora RDS via ssh by connecting to the bastion host, then connecting to the private EC2 instance, and then to the RDS, however this makes it hard to perform simple tasks on the RDS. Is it possible to connect to the RDS using MySQL Workbench over ssh through the bastion? I've tried everything but nothing seems to work. Thanks.
Dimitris (11 rep)
Dec 5, 2017, 03:33 PM • Last activity: Mar 2, 2025, 07:03 PM
0 votes
1 answer
1452 views
AWS Aurora Mysql different instance types for writer/reader
I have a db.r5.large for the master/writer. I want to add some readers. Do they need to be the same instance type as the master? Or can I add, for example, 3x db.t3.medium?
AFRC (101 rep)
Nov 27, 2020, 11:59 PM • Last activity: Jan 16, 2025, 11:04 PM
0 votes
1 answer
40 views
Aurora3 RDS MySQL Stored function/trigger performance when using information_schema
Since moving to Aurora 3 RDS (MySQL 8 based) from Aurora 2 (MySQL 5.7 based), we noticed that the execution time of certain triggers and stored functions has degraded. The common factor between the affected triggers and functions was a select on information_schema.tables, something like: SELECT `AUTO_INCREMENT` FROM information_schema.tables WHERE table_schema = DATABASE() AND table_name LIKE tablename LIMIT 1. The RDS stack has 50+ schemas with close to 2000+ tables in each. The query can take > 25s (unloaded system, no other users/connections). The same query run outside of a stored function/trigger takes < 0.1s (the same as it did in 5.7 inside a stored function/trigger). It performs best on a single-database stack. We can also get better performance by changing the definer of the function/trigger to be a schema-based user. We were unable to reproduce this locally on Community Edition MySQL 8.
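One MySQL-8-specific angle worth ruling out (an assumption, not a confirmed diagnosis): in 8.0, information_schema columns such as AUTO_INCREMENT are served from cached statistics, controlled by information_schema_stats_expiry. A sketch, with a hypothetical table name:

SELECT @@information_schema_stats_expiry;
SET SESSION information_schema_stats_expiry = 0;  -- 0 = always read current values
SELECT `AUTO_INCREMENT`
FROM information_schema.tables
WHERE table_schema = DATABASE() AND table_name = 'sometable';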
BigKiwiDev (1 rep)
Feb 9, 2024, 02:17 AM • Last activity: Dec 31, 2024, 07:31 AM
0 votes
0 answers
49 views
Getting OOM when trying to change id from int to bigint in MySQL w/ percona
We have a 2.5 TB table with an id column of type int in Aurora 3 (MySQL 8.0.26). We need to change it to bigint as ids will run out soon. We tested it with pt-online-schema-change and the server crashed because of memory. I was a bit surprised to see it crashing on memory (it's not our first time running percona) because the work is done in chunks. Moreover, it crashed in dev, where the activity on that table is low. This table is insert intensive, with some updates and deletes. Approaches that will not work / are not an option: 1. Recreating the table on the side and copying deltas. 2. CTAS. 3. Add column, copy ids, downtime, rename columns. Would love to hear another approach or how to investigate/fix the percona OOM issue.

Table DDL:

CREATE TABLE tab (
  id int unsigned NOT NULL AUTO_INCREMENT,
  a varchar(50) DEFAULT '',
  b int DEFAULT NULL,
  c varchar(255) DEFAULT NULL,
  d varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  e timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  f tinyint NOT NULL DEFAULT 0,
  g int DEFAULT NULL,
  h text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  PRIMARY KEY (id),
  KEY a (a,c),
  KEY b (b),
  KEY c_2 (c,a),
  KEY g (g,c,e),
  KEY e (e),
  KEY g_2 (g,d(191)),
  KEY ix_c_h (c,h(255)),
  KEY d (d(20),c,a),
  KEY ix_c_composite (c,f,e,g,a,id),
  KEY ix_c_e (c,e)
) charset=latin1

Percona command:

./pt-online-schema-change --host $host D=db,t=tab \
  --alter "CHANGE COLUMN id id bigint unsigned NOT NULL AUTO_INCREMENT" \
  --execute --recursion-method=none
Nir (529 rep)
Feb 25, 2024, 03:00 PM • Last activity: Feb 26, 2024, 07:26 AM
1 vote
0 answers
144 views
Force index not working in Aurora MySQL 8
I have recently upgraded to Aurora MySQL engine version 8.0.mysql_aurora.3.05.2 from Aurora MySQL 5.7, and I see a huge performance regression after upgrading. Recently I found queries whose indexes are being ignored, resulting in full table scans. My orders table has 20 million rows. Here is the query:

select * from orders FORCE INDEX (PRIMARY) where orders.id in (/* almost 64000 comma separated ids which are primary keys */)

When I run EXPLAIN on this, it says it is doing a full table scan with type = ALL. show create table orders shows this:

CREATE TABLE orders (
  id bigint unsigned NOT NULL AUTO_INCREMENT,
  ... other columns,
  PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=24105494 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Based on a blog post I set optimizer_switch to index_condition_pushdown=on in the parameter group, but it doesn't help. How can I force the index so that the query is fast?
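One common reason a very long IN() list on the primary key degrades to a scan is the range-optimizer memory cap rather than the index itself; a sketch to test that hypothesis (session scope only, the id list is abbreviated), not a confirmed diagnosis for this case:

SELECT @@range_optimizer_max_mem_size;
SET SESSION range_optimizer_max_mem_size = 0;  -- 0 = unlimited for this session
EXPLAIN SELECT * FROM orders WHERE id IN (1, 2, 3 /* ... */);
SHOW WARNINGS;  -- look for a note about range_optimizer_max_mem_size being exceeded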
Ali Mehdi (11 rep)
Feb 14, 2024, 01:53 PM • Last activity: Feb 14, 2024, 02:02 PM
0 votes
1 answer
435 views
Does Aurora Serverless support Babelfish?
If running Amazon Aurora Serverless on PostgreSQL, is it possible to use the Babelfish for PostgreSQL feature?
SilentSteel (113 rep)
Apr 25, 2022, 08:49 PM • Last activity: Dec 29, 2023, 04:01 PM
4 votes
2 answers
6539 views
Are the MariaDB and MySQL drivers interchangeable when connecting to a MySQL backend Aurora cluster?
I saw a similar question asked before, but it didn't give enough supporting information. Does anyone know any new information on this? Can the MariaDB connector (driver) be used instead of the MySQL connector (driver) when connecting to an AWS Aurora cluster MySQL instance?
Hariharan Thavachelvam (41 rep)
Nov 30, 2018, 05:08 PM • Last activity: Mar 2, 2023, 07:37 PM
3 votes
2 answers
4450 views
What's the advantage of sharding in AWS Aurora?
[AWS' RDS docs on multi-master Aurora](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html#:~:text=In%20an%20Aurora%20multi%2Dmaster%20cluster%2C%20each%20shard%20is%20managed%20by%20a%20specific%20DB%20instance%2C%20and%20a%20DB%20instance%20can%20be%20responsible%20for%20multiple%20shards.) state the following:

> In an Aurora multi-master cluster, each shard is managed by a specific DB instance, and **a DB instance can be responsible for multiple shards.**

[Later in the same document](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html#:~:text=You%20can%20avoid%20resharding%20operations%20because%20all%20DB%20instances%20in%20a%20cluster%20can%20access%20all%20databases%20and%20tables%20through%20the%20shared%20storage%20volume.), we read:

> You can avoid resharding operations because **all DB instances in a cluster can access all databases and tables through the shared storage volume.**

So, a multi-master Aurora instance can be responsible for multiple shards. This is made possible in part because all DB instances access the same shared storage volume.

**If multi-master Aurora instances can manage multiple shards, what's the advantage of using sharding at all? Why not just configure all instances to manage *all* shards?** (Essentially obviating the need for sharding)

# A theory

My suspicion is that not using sharding would lead to more deadlocks within Aurora's internals if different masters write to the same page at the same time. This, in turn, would lead to increased latency while Aurora retries the contentious write. Or, perhaps Aurora would generate an error and the application itself would have to retry the query. (I'm clearly not familiar enough with Aurora to know what happens, hence this question 😁)
rinogo (447 rep)
Oct 12, 2022, 05:12 PM • Last activity: Oct 12, 2022, 09:48 PM
0 votes
0 answers
502 views
Fastest way to migrate largish (15 TB) PostgreSQL 11 instance to Aurora 14 in AWS
We have an AWS RDS PostgreSQL 11 cluster (currently version 11.13), in which we have a few databases and some 1500 tables. The RDS cluster is of size "db.r5.xlarge", with 4 vCPUs, 32 GB RAM and general purpose SSD (gp2) storage. Currently the cluster has about 15 TB of data. We now need to upgrade to the latest database server version, and are planning to take this opportunity to switch from a pure PostgreSQL cluster to an Aurora PostgreSQL instance, in the hope that this will be more efficient. According to AWS documentation, there are four options that can be used:

1. Migrating an RDS for PostgreSQL DB instance using a snapshot
2. Migrating an RDS for PostgreSQL DB instance using an Aurora read replica
3. Importing S3 data into Aurora PostgreSQL
4. Migrating from a database that is not PostgreSQL-compatible

Option 4 does not apply to our case, as our existing database server is PostgreSQL. Option 3 is tedious and time-consuming, as tables need to be imported one at a time, with several steps required per table. The only feasible remaining options for us thus are 1 and 2. As I understand it, the migration process will involve the following steps:

1. Create a new Aurora PostgreSQL instance with version 11.13 (the target Aurora PostgreSQL server must have the same major and the same or higher minor version as the source PostgreSQL server).
2. Import data using either option 1 or 2 above.
3. Upgrade the Aurora PostgreSQL instance from version 11.13 to 14.3.

My questions now are:

1. Is my understanding of all of the above correct?
2. Which of the options for importing data, i.e. either using a snapshot or using an Aurora read replica, is likely to be faster given the size of our database?
Robert (125 rep)
Aug 24, 2022, 08:04 AM • Last activity: Aug 24, 2022, 10:29 AM
8 votes
1 answer
6959 views
Why is Insert-Ignore so expensive in MySQL?
**TL;DR:** Inserting a large number of rows with no collisions is MUCH faster than (not) re-inserting the same rows, **BOTH** using the INSERT IGNORE syntax. Why is this? I would assume that the index lookup cost would be the same for an insert and an "ignored" insert, given that MySQL does not know if the incoming data has repeated/conflicting data (and therefore needs to be ignored)...thus, indexing occurs in both the initial insert and the ignored insert runs. Furthermore, I would assume that an "ignored" row should be CHEAPER in that it does not require any disk writes. But this is most definitely **NOT** the case.

**Long Version:** In this question, we use AWS's Aurora/MySQL and the LOAD DATA FROM S3 FILE syntax to remove any transfer or performance variables. We load a 4-megarow CSV file corresponding to the schema below, and load it in twice, both times with LOAD ... IGNORE. Please note that the issue also occurred with standard INSERT ... IGNORE using batched row inserts. The use of LOAD ... IGNORE here is to steer the discussion towards the counter-intuitive nature of the measured results, and not "how to perform a large number of ignored inserts". That is not the issue here, as a domain-specific method was worked out.

In the model being tested, there is a three-tier index: the first two tiers are enumerable categorization columns with very low cardinality, and the third is a column that is essentially "actual" data. For ease of qualifying this question, I am just sticking to my current setup. Assume the following trivial schema:

our_table:
  id: basic auto-increment bigint, primary key
  constkey1: varchar(50) -- this is a constant in the insert
  constkey2: varchar(50) -- this is a constant in the insert
  datakey: varchar(50)   -- this is pulled in from the CSV file
  datafield: varchar(50) -- this is pulled in from the CSV file
  consttime: datetime    -- this is a constant in the insert
  Unique Index: (constkey1, constkey2, datakey)

And the following query, run twice:

LOAD DATA FROM S3 FILE 's3://...' IGNORE
  INTO TABLE our_table
  COLUMNS TERMINATED BY ','
  (datakey, datafield)
  SET constkey1 = 'some string',
      constkey2 = 'some other string',
      consttime = CURRENT_TIMESTAMP -- so that I can verify changes

Also assume that our_table already contains several million rows, NONE of which have **EITHER** the same constkey1 **OR** constkey2 as that given in the query.

- The first time I run the query above, the 4 megarows take 3 to 5 minutes to insert (presumably based on the vagaries of the current database environment at the time of the run), and all 4 megarows are properly inserted. This is an acceptable baseline.
- The SECOND time I run the query, the 4 megarows take an insane amount of time to (not) insert, and as expected, all 4 megarows are properly **NOT** updated. The time is long enough to not even be worth qualifying here (one run took nearly an hour). This test was run a number of times, always yielding the same result.

**UPDATE:** While modifying this question, the following question was suggested by S.E.: mysql insert into indexed table taking long time after few million records

While I do not think it is completely generic here, perhaps it has something to do with the memory size of the index range? In the example here, the "sub index" range of ('some string', 'some other string', *) held ZERO entries in run one, but was full with 4 megaentries in run 2. Perhaps that is an issue?

Please also note that the eventual solution was of the form given in the question below.
This is immaterial to the question at hand, as I wish to know why the original solution did not work. This is merely given for those curious. Can I identify duplicates in a large MySQL insert?
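Not part of the original question: one way to compare the first and second runs more precisely is the per-session handler counters, reset immediately before each LOAD:

FLUSH STATUS;
-- run the LOAD DATA FROM S3 FILE ... IGNORE statement here
SHOW SESSION STATUS LIKE 'Handler_%';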
Mark Gerolimatos (247 rep)
Jun 29, 2018, 10:55 PM • Last activity: Mar 25, 2022, 07:51 PM
1 vote
1 answer
1370 views
How to tell if I'm connected to an Amazon Aurora PostgreSQL cluster?
I've created a few different instances; one instance is Amazon Aurora PostgreSQL and the other instances are PostgreSQL, MySQL, etc. How do I find out whether a Postgres instance is running within an AWS Aurora cluster or not, from the CLI? When I say CLI I mean the terminal, not the AWS CLI.
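A sketch of a terminal-side check: aurora_version() exists on Aurora PostgreSQL but not on community PostgreSQL, so an "undefined function" error from the second statement effectively answers the question:

SELECT version();
SELECT aurora_version();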
I've created few different instances, one instance is of Amazon Aurora PostgreSQL and other instances are of PostgreSQL, MySQL etc. How do I find our the Postgres instance is running within an AWS Aurora cluster or not, from CLI? When I say from CLI this means terminal not AWS CLI.
Dharmik Gadhiya (29 rep)
Mar 12, 2018, 06:55 AM • Last activity: Dec 8, 2021, 12:03 PM
3 votes
1 answer
4048 views
Aurora MySQL 5.7 randomly fails
This is the 5th time. It happens once a week (Tuesday or Wednesday within 03:00-07:00 UTC+0). On the console the instance shows as available, but it is inaccessible. We waited to see if the instance would recover itself; after ~30 min nothing happened, so I rebooted it manually, and it came online again after rebooting (~5 min). It would be helpful to know what actually went wrong. This is only a dev server with few users and records.

Engine: Aurora MySQL 5.7.12
DB instance class: db.t2.small
Backup time: 16:00-16:30 UTC+0
Maintenance time: sun:17:00-sun:17:30 UTC+0

Below is the only list of available logs after rebooting the instance:

error/mysql-error-running.log.2018-07-24.03 Tue Jul 24 11:14:06 GMT+800 2018 11.8 kB
error/mysql-error-running.log.2018-07-24.04 Tue Jul 24 11:30:00 GMT+800 2018 285.5 kB
error/mysql-error-running.log.2018-07-24.05 Tue Jul 24 12:30:00 GMT+800 2018 31.1 kB
error/mysql-error-running.log.2018-07-24.06 Tue Jul 24 13:30:00 GMT+800 2018 31.8 kB
error/mysql-error-running.log.2018-07-24.07 Tue Jul 24 14:30:00 GMT+800 2018 32.9 kB
error/mysql-error-running.log.2018-07-24.08 Tue Jul 24 15:30:00 GMT+800 2018 29 kB
error/mysql-error-running.log.2018-07-24.09 Tue Jul 24 16:30:00 GMT+800 2018 32.1 kB
error/mysql-error-running.log.2018-07-24.10 Tue Jul 24 17:30:00 GMT+800 2018 27.5 kB
error/mysql-error-running.log.2018-07-24.11 Tue Jul 24 18:30:00 GMT+800 2018 31.7 kB
error/mysql-error-running.log.2018-07-24.12 Tue Jul 24 19:30:00 GMT+800 2018 27.1 kB
error/mysql-error-running.log.2018-07-24.13 Tue Jul 24 20:30:00 GMT+800 2018 22.4 kB
error/mysql-error-running.log.2018-07-24.14 Tue Jul 24 21:30:00 GMT+800 2018 22.8 kB
error/mysql-error-running.log.2018-07-24.15 Tue Jul 24 22:30:00 GMT+800 2018 24.7 kB
error/mysql-error-running.log.2018-07-24.16 Tue Jul 24 23:30:00 GMT+800 2018 24.7 kB
error/mysql-error.log Wed Jul 25 00:34:45 GMT+800 2018 2.6 kB
external/mysql-external.log Wed Jul 25 00:30:00 GMT+800 2018 7.6 kB

*external/mysql-external.log*:

/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
/rdsdbbin/oscar/bin/mysqld, Version: 5.7.12 (MySQL Community Server (GPL)). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time,ServerHost,User,UserHost,Command,Payload
----------------------- END OF LOG ----------------------

*error/mysql-error-running.log.2018-07-24.03* shows: https://pastebin.com/ywmXLR5g
*error/mysql-error-running.log.2018-07-24.04* shows: https://pastebin.com/g1dkR6rj
*error/mysql-error-running.log.2018-07-24.18* shows: https://pastebin.com/g0aAXfaT

All other logs show nothing (see screenshot).

**Event Logs**

July 24, 2018 at 11:14:14 AM UTC+8 - DB instance restarted
July 24, 2018 at 11:13:31 AM UTC+8 - Error restarting mysql: Engine bootstrap failed with no mysqld process running...
July 24, 2018 at 11:12:01 AM UTC+8 - Recovery of the DB instance is complete.
July 24, 2018 at 11:04:26 AM UTC+8 - Recovery of the DB instance has started. Recovery time will vary with the amount of data to be recovered.

(Screenshots: CPU Utilization on 07-24-2018, and CPU Utilization from 07-11-2018 to 07-24-2018.)
John Paulo Rodriguez (151 rep)
Jul 24, 2018, 03:42 AM • Last activity: Oct 28, 2021, 01:45 AM