
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

-2 votes
2 answers
72 views
Should I break a large user table into smaller tables for specific roles and information?
I am designing a database for a system that has a `users` table. Currently, the table has around 50 columns, which include:

- **Personal information** (e.g., `name`, `email`, `phone_number`, `address`, etc.)
- **Work-related information** (e.g., `job_title`, `company_name`, `years_experience`, etc.)
- **Education-related information** (e.g., `degree`, `institution`, `graduation_year`, etc.)

Some of these columns are only applicable to specific roles, like employees or students, while others are common to all users. I'm considering breaking the `users` table into smaller tables, such as:

- A main `users` table for general information.
- A `work_experiences` table for job-related details.
- An `education` table for academic details.

### My questions are:

1. Is splitting the `users` table into smaller tables based on their context (work, education, etc.) a good practice?
2. Will this approach impact performance negatively, or is it better for maintainability?
3. Should I consider alternatives like using JSON columns for role-specific data or polymorphic relationships for related models?

I'm using MySQL, and performance is a concern since the system will scale to thousands of users. Thank you for your advice!
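A minimal MySQL sketch of the split the question describes (table and column names come from the question; key types, sizes, and foreign keys are assumptions):

```sql
CREATE TABLE users (
    id           BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name         VARCHAR(255) NOT NULL,
    email        VARCHAR(255) NOT NULL UNIQUE,
    phone_number VARCHAR(32),
    address      VARCHAR(255)
);

CREATE TABLE work_experiences (
    id               BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    user_id          BIGINT UNSIGNED NOT NULL,
    job_title        VARCHAR(255),
    company_name     VARCHAR(255),
    years_experience INT,
    FOREIGN KEY (user_id) REFERENCES users(id)
);

CREATE TABLE education (
    id              BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    user_id         BIGINT UNSIGNED NOT NULL,
    degree          VARCHAR(255),
    institution     VARCHAR(255),
    graduation_year SMALLINT,
    FOREIGN KEY (user_id) REFERENCES users(id)
);
```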
JayDev95 (97 rep)
Dec 9, 2024, 09:09 PM • Last activity: May 5, 2025, 07:55 AM
0 votes
1 answer
524 views
NoSQL vs SQL during sharding when a single read operation only hits one shard
I have a use case where I'm stuck figuring out whether to use a SQL or NoSQL database. The db has 2 fields - `A (PK), B`. A is 8 chars long, while B is 100-400 chars long. The operations are simply:

1. Write: Given strings `A`, `B` - store both in the db.
2. Read: Given input `A`, return the `B` associated with it.

The database is queried by multiple application servers for reads and writes, implying parallel read/write transactions (operations). Now comes the scalability part:

**SQL:** Scaling writes - (assuming partitioning is already done) enable sharding (shard based on some hash function). As every write operation comes in, write to the appropriate shard. This scales the writes. Scaling reads - enable replication (not sure if this is enabled by default for SQL dbs).

The assumption is that we are scaled for both writes and reads, so sharding and replication are both enabled. As you can infer from the above, the read operations are such that the db only needs to hit a single shard, so basically no SQL JOIN is required, and no combining of results from multiple shards is required either.

**1. Will using a SQL db perform better in the above case than NoSQL if we are scaling for a high number of reads and writes (#reads is equivalent to #writes)?** I see that SQL dbs have to maintain ACID properties, and maintaining strong consistency slows down writes and reads because holding locks across shards is a time-consuming process, but our case doesn't require cross-shard locks.

**2. Does the above decision change when reads >>> writes OR writes >>> reads?**

**4. If there's no difference between using SQL and NoSQL dbs, then does it basically boil down to whether eventual consistency is permitted, in which case NoSQL should be the better choice (I hope that's because writes will be much faster in NoSQL, given that the strong consistency SQL dbs provide is not a must)?**

**A very basic question: given that the db grows to terabyte/petabyte limits, does it make sense to use a SQL db with a single server instance which holds this much data on a single piece of hardware, with replication turned ON?**
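To make the workload concrete, a rough sketch of the two-column store described above (the table name `kv_store` and the exact column types are assumptions):

```sql
-- The whole schema is a single key/value table.
CREATE TABLE kv_store (
    a CHAR(8)      PRIMARY KEY,   -- the 8-character key A
    b VARCHAR(400) NOT NULL       -- the 100-400 character value B
);

-- Write: given A and B, store both.
INSERT INTO kv_store (a, b) VALUES ('abcd1234', 'some 100-400 character payload ...');

-- Read: given A, return the associated B.
-- If shards are chosen by hash(a), this always touches exactly one shard.
SELECT b FROM kv_store WHERE a = 'abcd1234';
```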
asn (127 rep)
Jul 2, 2023, 10:57 AM • Last activity: Apr 10, 2025, 03:05 PM
0 votes
1 answer
2042 views
Does a read replica help if the problem are heavy queries?
The application that our team is responsible for is having DB performance problems more and more often. The application querying it doesn't throw lots of concurrent queries at it; the problem we have is rather some heavy queries joining a few tables. These queries are executed just every few minutes, and executions seldom overlap, but they sometimes make the DB choke, and as the system accumulates more data, it's a growing problem. The solution so far has been optimizing these queries, by trying e.g. to fetch fewer fields, and scaling up the DB.

I am relatively new to the team, and today I asked if a read replica had been considered. I know little about DBs, so that felt natural to me, but a senior engineer told me that it would barely help, because the read replica would have the same problems as the master: it would still need to do the same writes as the master, the heavy queries would be equally heavy for it, and the chances of it timing out would be the same. The only gain would be the number of queries each of the instances has to serve. Put like that, it sounds very reasonable.

My questions are:

1. Is a read replica really not of any help in this situation?
2. What are some alternatives?

PS: in case it matters, it's PostgreSQL.
antonro (101 rep)
Jul 29, 2019, 03:48 PM • Last activity: Mar 29, 2025, 03:09 PM
1 vote
1 answer
588 views
What makes many NoSQL databases non-ACID compliant when both SQL and NoSQL can scale horizontally
A lot of the difference talked about between SQL and NoSQL focuses on the fact that SQL databases support ACID properties while in many NoSQL dbs ACID is compromised. I'm unsure of the reason, but scalability seems to be the culprit from what I've read (do correct me here).

A lot of discussions wrongly associate the consistency in CAP with the consistency in ACID. They say that NoSQL doesn't provide ACID but provides BASE (Basically Available, Soft state, Eventual consistency). E stands for eventual consistency. But SQL dbs too can provide eventual consistency (master-slave architecture - MySQL by default performs asynchronous replication).

Apart from the fact that SQL has a rigid schema and NoSQL has a flexible schema, what other differences do they have? Is there a difference on the scalability side? I know NoSQL is an umbrella term, but I'm referring to most of the NoSQL dbs here. If both SQL and NoSQL can scale (I found that some NoSQL dbs do provide ACID), then why can't some NoSQL dbs provide ACID?

And when it is said that some NoSQL dbs (pointing to the dbs that don't provide ACID properties) can't provide ACID, does it mean that a transaction can leave a database table in an inconsistent state? This inconsistency is wrongly interpreted to be the CAP inconsistency when in fact it actually means the ACID inconsistency. Can concurrent transactions put a NoSQL table into an inconsistent state - for example, is it possible that dirty reads can be performed by some transaction that is running concurrently with another parallel transaction?

One of the discussions that seems to mix up ACID and CAP consistency (I'll add a few if I find more):

1. https://dba.stackexchange.com/questions/18435/cap-theorem-vs-base-nosql
2. https://stackoverflow.com/a/3423750/7547722
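As a concrete illustration of the dirty read asked about above, a two-session sketch using MySQL/InnoDB syntax (any engine whose READ UNCOMMITTED level really exposes uncommitted rows behaves this way); the `accounts` table is purely hypothetical:

```sql
-- Session 1: change a row but do not commit yet.
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Session 2: at READ UNCOMMITTED, the uncommitted change is visible.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;  -- may return the reduced balance
COMMIT;

-- Session 1: undo the change.
ROLLBACK;  -- session 2 has read a value that was never committed
```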
asn (127 rep)
Jul 6, 2023, 02:29 PM • Last activity: Feb 13, 2025, 05:00 AM
0 votes
2 answers
100 views
All possible ways to Scale a Database
I recently began studying **Database Scaling** and have a few questions. Please correct me if I'm mistaken on the following points:

1. A *database (db)* is a collection of *tables* (in relational databases) or *collections* (in non-relational databases).
2. A software system that manages databases (and performs various other tasks) is called a *Database Management System*.
3. A *database server* is a machine that stores the actual data (the "database") and runs an instance of a DBMS.
4. Types of databases include: relational, document-based, graph-based, etc. Examples of DBMSs include: MySQL, Postgres, MongoDB, DynamoDB, etc.

Assuming the above are correct, I believe the following are the primary ways to scale a database server horizontally:

A. **Monolithic Architecture** One MySQL (or Postgres, MongoDB, etc.) instance managing 5 databases. Each of these 5 databases contains 3 tables of its own. All 5 databases are part of the same application (for example, Uber).

B. **Database Replication** 5 MySQL instances on 5 different servers. Each instance contains all 5 databases, and each database has 3 tables. This can involve either master-slave or master-master replication setups.

C. **Microservices Architecture** 5 MySQL instances on 5 servers, with each instance managing only one of the 5 databases. Each database contains its own 3 tables, and each instance serves a specific microservice.

D. **Sharding** 5 MySQL instances on 5 servers. Each instance has all 5 databases, and each database contains 3 tables. However, each instance's user.db's "user" table only holds a subset of the total user data.

E. **Sharding + Partitioning** In the scenario described above, each "user.db" (spread across different instances) implements database partitioning (for example, based on user IDs).

F. **Sharding + Replication** Each MySQL instance has a master-slave or master-master replication setup while also handling its own shard of the overall data.

G. **Sharding + Partitioning + Replication**

**Questions:**

1. What are your thoughts on the scalability methods described above?
2. In a monolithic application (such as Uber on its first day), would it make sense to "avoid" placing all tables in a single database? Instead, should separate databases be created (e.g., "user.db" for user-related tables and "vehicle.db" for vehicle-related tables) while still storing them on the same database server under a single DBMS instance?
3. Can a single DBMS instance manage two databases located on different servers?
4. Is it possible to run two DBMS instances on the same server?
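As a small illustration of option E's intra-instance partitioning by user ID, a hypothetical MySQL table definition (not taken from the question):

```sql
-- Hash partitioning of a user table by user_id; each partition holds a
-- slice of this instance's shard of the overall data.
CREATE TABLE user (
    user_id BIGINT NOT NULL,
    name    VARCHAR(255),
    PRIMARY KEY (user_id)
)
PARTITION BY HASH (user_id)
PARTITIONS 8;
```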
Intangible _pg18 (11 rep)
Oct 12, 2024, 04:54 AM • Last activity: Oct 20, 2024, 04:57 AM
1 vote
0 answers
39 views
Is it even possible to create a scalable rhyming dictionary for 10 million words in a single language like English?
I'm going in circles brainstorming ideas and TypeScript or SQL code to implement basically a "rhyming database". The goal of the rhyming database is to find rhymes for all words, not just exact rhymes but "nearby or close rhymes" too (like Rap music, etc.). Here are some of the facts and features:

1. Estimate 10 million English words for now (but realistically I'm thinking about doing this for ~30 languages).
2. I think rhymes would follow something like a reverse exponential curve (let's just imagine), so many short words rhyme with long words, but it tapers down as words get longer.
3. We will only support up to 3 syllables of word-end rhyming.
4. Don't worry about the system for capturing the phonetic information of words; we can use something like the [CMU pronunciation format/dictionary](https://en.wikipedia.org/wiki/CMU_Pronouncing_Dictionary#Database_format). I have a system for computing phonetic information (too involved to describe for this post).
5. In a not-worst-but-bad case, let's say there are 1000 rhymes for every word; that is 10m x 1k = 10 billion relationships between words. At 10,000 rhymes, that is 100 billion, so the database might start running into scaling problems.
6. Ideally, we compute a "similarity score", _comparing each word to every other word_ (cross-product), and have a threshold things must score higher than to count as a rhyme.
7. We then sort by the similarity score.
8. We allow _pagination_ based on an input pronunciation text, and you can jump to specific pages in the rhyme query results.

Well, all these features together seem like an impossible ask so far: **pagination**, **complex/robust similarity scoring** (not just hacky, extremely simplified SQL functions for calculating basic scores, but advanced cosineSimilarity scoring, or even more custom stuff taking into account sound sequences in each word), **10 million words**, **up to 3 syllables of rhyming**, **fast query time**, and ideally not requiring a huge memory-intensive server.

I have been essentially cycling through 5 or 10 solutions to this problem with ClaudeAI (either backed by SQL, or just in-memory), but it can't seem to solve all those problems at once; it leaves one key problem unsolved, so everything won't work.

- The first solution was in-memory: for every word, compute a robust vector similarity score based on the pronunciation/phonemes/sounds of each word, cross-product style. This seems like the ideal solution (which would give you 100% accurate results), but it won't scale, because 10m x 10m is trillions and beyond that. Especially not possible in the DB. By precomputing all similarities between every pair of words, search is easy, as there is a map from input to an array of rhymes, already sorted by score. Pagination is easy too. But it won't scale.
- The next "solution" was a SQL version, with an **extremely primitive** phoneme_similarity SQL function. Then a query for all rhymes would be something like:

      const query = `
        WITH scored_rhymes AS (
          SELECT
            w.word,
            (
              phoneme_similarity(w.last_vowel, ?) * 3 +
              phoneme_similarity(w.penultimate_vowel, ?) * 1.5 +
              CASE WHEN w.final_consonant_cluster = ? THEN 2 ELSE 0 END +
              CASE WHEN substr(w.stress_pattern, -1) = substr(?, -1) THEN 1 ELSE 0 END +
              CASE WHEN substr(w.stress_pattern, -2) = substr(?, -2) THEN 0.5 ELSE 0 END
            ) AS score
          FROM words w
          WHERE w.word != ?
            AND w.last_vowel = ?
        )
        SELECT word, score
        FROM scored_rhymes
        WHERE score > 0
        ORDER BY score DESC, word
        LIMIT ? OFFSET ?
      `;

  While it seems to handle pagination, the scoring logic is severely lacking. This won't give quality rhyme results; we need much more advanced phonetic sequence clustering and scoring logic. But it would scale, as there is just a single words table with some phonetic columns. It's just not going to be accurate/robust enough scoring/rhyming-wise.
- A third solution it came up with did the advanced scoring, but _after_ it made a DB query (DB-level pagination). This will not result in quality pagination, because a page's worth of words is fetched based on non-scored data, then scores are computed on that subset in memory, and then they are sorted. This is completely inaccurate.
- Then the fourth solution, after saying how the previous ones didn't meet all the constraints/criteria: a SQL version storing the cross product of every word pair, precomputing the score! Again, we did that already in memory, and it definitely won't scale storing 10m x 10m links in the DB.

So then it is basically cycling through these answers with small variations that don't have a large effect or improvement on the solution.

_BTW using AI to help think through this has gotten me way deeper into the weeds of solving this problem and making it a reality. I can think for days and weeks about a problem like this on my own, reading a couple papers, browsing a few GitHub repos, ... but then I think in my head "oh yeah I got something that is fast, scalable, and quality". Yeah right haha. Learning through AI back and forth helps getting working data structures and algorithms, and brings new insights and pros/cons lists to my attention which I would otherwise not have figured out in a timely manner._

So my question for you now, after a few days of working on this rhyming dictionary idea, is: is there a way to solve this and get all the constraints of the system satisfied (pagination/scoring/10m-words/3-syllables/fast-query/scalable)?

An [answer to my StackOverflow question](https://stackoverflow.com/questions/79101873/how-to-build-a-trie-for-finding-exact-phonetic-matches-sorted-globally-by-weigh/79102113?noredirect=1#comment139481345_79102113) about finding phonetic matches in detail suggested I use a [succinct indexable dictionary](https://en.wikipedia.org/wiki/Succinct_data_structure#Succinct_indexable_dictionaries), or even the [Roaring Compressed Bitmap](https://roaringbitmap.org/) data structure. But from my understanding so far, this still requires computing the cross product and scoring; it just might save on some memory. I don't know, though, whether it would efficiently store trillions and quadrillions of associations (even in memory, on a large machine).

So I'm at a loss. Is it impossible to solve my problem as described? If so, what should I cut out to make this solvable? Either what constraints/desires should I cut out, or what other corners can I cut?

_I tagged this as PostgreSQL because that's what I'm using for the app in general, if that helps._
Lance Pollard (221 rep)
Oct 19, 2024, 06:08 AM • Last activity: Oct 19, 2024, 06:14 AM
0 votes
3 answers
367 views
does sharding affect querying speed
I am implementing horizontal scaling of my SQL database. I am spreading my table across multiple data servers with a sharding design. I know that this technique is very good for handling large volumes of data and/or big data analysis. Also, if spread out with a geographical design, it could serve some users' data faster in specific regions. Besides all this, does sharding increase query speed by much, given that indexes are already in place? What other horizontal scaling options are there besides sharding, or tips on that?

To sum up my question: imagine 2 identical table structures, one with 100 million rows and the other with 1 million rows. They both have indexes set up; will the 1 million row one run queries faster than the other? Does horizontal scaling help with that? (Also, any more tips?)
umarkaa (47 rep)
Apr 11, 2024, 11:05 AM • Last activity: Jul 24, 2024, 03:02 PM
0 votes
1 answer
116 views
How Can I Efficiently Structure a Data Model for Handling Both One-Time and Recurring Tasks in a Task Management app?
Fellow developers and architects! I'm in the process of designing a webapp aimed at managing tasks and recurring tasks, which can be seen as habits. The unique challenge I'm facing revolves around how to best structure my data model to efficiently handle both one-time tasks and recurring tasks, without differentiating between them during data retrieval.

The current design splits the "task" concept into two entities:

1. **Task**: Holds shared task data such as title, description, category, type, recurrence rule, etc.
2. **State**: Contains specific instance data like state (new, in-progress, done), start date, due date, etc.

This structure necessitates that both tables/entities be joined and fetched together at all times, considering that task entity attributes (like description and title) are often updated.

**To provide more context, here are a few use cases to consider:**

**Use case 1**: Imagine a monthly calendar view, and today is January 1st, with each user having hundreds of tasks, each recurring many times a week. The user wants to see all the tasks they have to do in the following December and change the due date/time of some of them.

**Use case 2**: With the same setup as above, a user has a task with a daily recurrence. They now want to change the recurrence to every second day and change the due time.

**Use case 3**: Again, the same calendar setup, but now the user wants to change the description and title of some tasks on a regular basis. Of course, this change should apply to every single recurrence of that particular task.

**Given these requirements, I have several questions for the community:**

1. **Performance & Scalability**: What are your thoughts on the proposed data model in terms of performance and scalability?
2. **Database Selection**: Which type of database would be best suited for this application (SQL vs. NoSQL) and why?
3. **Alternative Models**: Are there more efficient data models or structures that could handle these scenarios better?
4. **State Entity Creation**: For recurring tasks, should "state" entities be generated in advance or on demand? How can this be optimized for both the user experience and system performance? Considering the data model I described, how and when should the "state" entities be created for a task? In advance? On demand? If I create state entities in advance, I risk that the user picks a time frame where those states are still missing, so I have to check anyway whether new state entities need to be created. On the other hand, if I do it on demand, it will make all the queries very slow, as with each fetch operation I first need to check and, if needed, create the state entities.

Your insights, especially if you've tackled similar challenges, would be invaluable to me. I'm particularly interested in any architectural advice, design patterns, or technology recommendations that could enhance the app's functionality and user experience.
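A rough relational sketch of the Task/State split described above, to make the ever-present join concrete (PostgreSQL-style DDL; all table and column names here are assumptions, not from the question):

```sql
CREATE TABLE tasks (
    id              BIGSERIAL PRIMARY KEY,
    title           TEXT NOT NULL,
    description     TEXT,
    category        TEXT,
    recurrence_rule TEXT            -- e.g. an iCalendar RRULE string; NULL for one-time tasks
);

CREATE TABLE task_states (
    id         BIGSERIAL PRIMARY KEY,
    task_id    BIGINT NOT NULL REFERENCES tasks(id),
    state      TEXT NOT NULL DEFAULT 'new',   -- new / in-progress / done
    start_date TIMESTAMPTZ,
    due_date   TIMESTAMPTZ
);

-- Every calendar fetch joins the shared task data with its instance states:
SELECT t.title, t.description, s.state, s.due_date
FROM task_states s
JOIN tasks t ON t.id = s.task_id
WHERE s.due_date >= DATE '2024-12-01'
  AND s.due_date <  DATE '2025-01-01';
```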
nanobot (1 rep)
Apr 6, 2024, 12:57 PM • Last activity: Apr 7, 2024, 12:38 PM
1 vote
4 answers
1087 views
Scalability of Postgres for table with large number of indexed columns
I have a Postgres table with a large number of indexed columns (roughly 100 indexed columns total, and yes, I need them all, and yes, they all need to be separately indexed). Any row update causes all indexes to be updated, which is a lot of work for the DB engine.

I want to understand the concurrency implications of the discussion on the Postgres documentation page titled Index Locking Considerations, and also of the fact that Postgres is single-threaded (multi-process), in terms of how the current design affects reader and writer performance for a large number of concurrent queries, given that I have so many column indices.

My interpretation of these things is the following (please correct any that are wrong):

* Writers that are updating individual rows don't block readers, unless the reader is running a query that produces a result set that would include the row that is being updated.
* Writers only block each other if they are trying to update the same row at the same time.
* Concurrent updates to btree-based indices from multiple writers get merged according to a set of rules that generally does the right thing (so updating the same indexes at the same time does not cause writers to block, unless they are updating the same row).

My questions are:

* How can there even be multiple concurrent readers or writers, if Postgres is single-threaded? If you have multiple processes running, do they simply rely on the inter-process consistency of disk caches (or have to manually flush contents to disk) to coordinate concurrent updates?
* What, if anything, can get blocked while a large number of indexes are being updated due to a row update? If anything can get blocked during an update, is it possible to turn a dial on the consistency-vs-availability tradeoff so that, for example, a row update is not atomic (i.e. so that the indexes are updated one at a time, but the update to all indexes doesn't have to happen atomically)? I'm OK with a lack of consistency in the name of higher concurrency.
Luke Hutchison (141 rep)
Jan 25, 2024, 06:56 PM • Last activity: Jan 28, 2024, 09:49 AM
0 votes
1 answer
48 views
Why use $group / group_by instead of multiple requests?
Let's suppose that there are 100 types of dogs scattered across the documents of my collection. If I need to group by type of dog and then compute some aggregate statistic about each type, why perform a query involving `$group`, which in principle processes each document sequentially, versus sending 100 separate, simultaneous queries to the database, where each query filters on one type of dog? Wouldn't the 100 simultaneous, separate queries be faster? If they are faster, at scale, what is the drawback of not doing it that way, if any?
Bear Bile Farming is Torture (119 rep)
Sep 29, 2023, 01:56 AM • Last activity: Sep 29, 2023, 12:21 PM
1 vote
2 answers
183 views
Scaling from Multiple Database to Single Database Architecture in SQL Server
My application is centered around self-contained "workspaces". For many really good reasons (everything from management to security), we have always had a one-database-per-workspace architecture. Each database has identical schema, stored procedures, triggers, etc. There is a "database of databases" that coordinates all of this. Works great.

The problem: scalability. It was recently proposed that a customer might want to have 100,000 workspaces. Obviously this is a non-starter for one SQL instance. Plus, each workspace might be rather small, but there'd also be a very wide size distribution - the biggest workspace could be 100x the size of the _median_. The top 1% of workspaces could easily constitute 90+% of the rows across all workspaces.

I'm looking for options for rearchitecting things to support this scenario, and here are some things I've considered and the issues I see with each.

- Keep the multi-database architecture but spread across multiple SQL instances. The problem is cost (both administrative and infrastructure). If we stick to a limit of 1,000 DBs on each instance, that's still 100 instances, spread across who knows how many actual VMs. But since so many of the workspaces will be small (much smaller than our current average), the revenue won't nearly scale accordingly. So I think this is probably out of the question, and I'm focusing now on single-database architectures.
- Every workspace shares the same tables, indexed by workspace ID. So every table would need a new workspace ID column, and every query needs to add the workspace condition in the WHERE clause (or more likely every real table is wrapped in an inline table-valued function that takes the WorkspaceID; anyway...). The primary key of every table would also have to be redefined to include the workspace ID, since not every PK now is globally unique. Programming-wise this is all fine, but even with proper indexing and perfect query design (and no, not all our queries are perfect - the dreaded row scan still happens on occasion), is there any conceivable way this could perform as well - for everyone - as separate databases? More specifically, can we guarantee that small projects won't suffer from the presence of big projects, which could be taking up 100x more rows than the small ones? And what specific steps would need to be taken, whether it be the type of index to use or how to write queries, to guarantee that the optimizer always narrows things down by workspace ID before it does literally anything else?
- Partitioning - from what I've read, this doesn't help with query performance, and it appears MS recommends limiting tables or indexes to 1,000 partitions, so this also won't help.
- Create the same set of tables but with a new schema for each workspace. I thought of this because there are no limits to the number of tables a database can have other than the overall 2G object limit. But I haven't explored this idea much. I'm wondering if there would be performance concerns with 100,000 schemas and millions of tables, views, stored procs, etc.

With all that, here is the specific question: what specific features of SQL Server, and/or general strategies, including but not limited to things I've considered, would be most useful for maintaining a large number of self-contained data sets with identical schemas in a single giant database? To reiterate, maintaining performance as close as possible to a multi-database architecture is of top priority.

And needless to say, if any part of my assessment above seems incorrect or misguided I'd be glad to be corrected. Many thanks.
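A hypothetical T-SQL sketch of the second option above (shared tables keyed by workspace ID, wrapped in an inline table-valued function); the table `dbo.Orders` and its columns are made-up placeholders:

```sql
-- Add the workspace key to an existing table (the default only backfills existing rows).
ALTER TABLE dbo.Orders ADD WorkspaceID int NOT NULL DEFAULT 0;

-- Index leading on WorkspaceID so the optimizer can narrow by workspace first.
CREATE INDEX IX_Orders_WorkspaceID ON dbo.Orders (WorkspaceID, OrderDate);
GO

-- Inline TVF wrapper so callers can never forget the workspace filter.
CREATE FUNCTION dbo.fn_Orders (@WorkspaceID int)
RETURNS TABLE
AS
RETURN
    SELECT OrderID, OrderDate, Total, WorkspaceID
    FROM dbo.Orders
    WHERE WorkspaceID = @WorkspaceID;
GO

-- Usage: every query goes through the wrapper.
SELECT * FROM dbo.fn_Orders(42) WHERE OrderDate >= '20230101';
```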
Peter Moore (113 rep)
Aug 17, 2023, 05:30 PM • Last activity: Aug 20, 2023, 06:23 PM
1 vote
1 answer
2884 views
What are the pros and cons of using CLIENT_RESULT_CACHE_SIZE and RESULT_CACHE_MODE
I have multiple questions. For Oracle Database 11g Release 11.2.0.1.0:

- What are the benefits of using CLIENT_RESULT_CACHE_SIZE and RESULT_CACHE_MODE?
- How will the client-side cache be kept in sync if data changes on the server side?
- Do we have an AUTO mode for RESULT_CACHE_MODE? What is the recommended mode?
- If I execute a single query 10 times with the client result cache enabled, will the query be run 10 times on the server?

Please point me towards any documentation regarding this. Thanks in advance.
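For context, a minimal sketch of how these two settings are typically used (the 64 MB size and the query against the sample `employees` table are only examples):

```sql
-- Server-side result cache policy: MANUAL (opt in via hint) or FORCE.
ALTER SYSTEM SET RESULT_CACHE_MODE = MANUAL;

-- Client-side (OCI) result cache size is a static parameter: SPFILE change + restart.
ALTER SYSTEM SET CLIENT_RESULT_CACHE_SIZE = 67108864 SCOPE = SPFILE;

-- With MANUAL mode, individual statements opt in via the RESULT_CACHE hint.
SELECT /*+ RESULT_CACHE */ department_id, COUNT(*)
FROM   employees
GROUP  BY department_id;
```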
Phani (113 rep)
Oct 19, 2012, 06:09 PM • Last activity: Jun 19, 2023, 09:07 PM
2 votes
0 answers
576 views
Postgres: One-to-many relationship with WHERE, ORDER BY and LIMIT not scalable?
I'm currently using Postgres 12. A few years ago, I set up the following model: a lot table with a one-to-many relationship to a line table. My idea was to factor data that is common to multiple rows of the line table out into the lot table. The lot table contains metadata stored in a JSONB column; it felt right at the time not to duplicate the data, since a lot can contain 1 to 10 lines.

Users can ask for lines through an API in a paginated manner, using seek pagination on a sort_index column in the line table. Users can also filter on multiple columns. I'm currently having issues scaling: I have ~40M rows in the line table for ~8M rows in the lot table. Here is a simplified view of the model:
\d+ public.lot
                                                   Table "public.lot"
     Column      |           Type           | Collation | Nullable | Default | Storage  | Stats target | Description
-----------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
 id              | uuid                     |           | not null |         | plain    |              |
 metadata        | jsonb                    |           | not null |         | extended |              |
 discriminant_id | bigint                   |           | not null |         | plain    |              |
 date_created    | timestamp with time zone |           | not null |         | plain    |              |
 last_updated    | timestamp with time zone |           | not null |         | plain    |              |
Indexes:
    "lot_pkey" PRIMARY KEY, btree (id)
    "idx_lot_discriminant_id" btree (discriminant_id)
Foreign-key constraints:
    "fk_lot_discriminant" FOREIGN KEY (discriminant_id) REFERENCES public.discriminant(id)
Referenced by:
    TABLE "public.line" CONSTRAINT "fk_line_lot" FOREIGN KEY (lot_id) REFERENCES public.lot(id)
Access method: heap
Options: autovacuum_enabled=true, toast.autovacuum_enabled=true
\d+ public.line
                                       Table "public.line"
   Column   |  Type  | Collation | Nullable | Default | Storage  | Stats target | Description
------------+--------+-----------+----------+---------+----------+--------------+-------------
 id         | uuid   |           | not null |         | plain    |              |
 type       | text   |           | not null |         | extended |              |
 sort_index | bigint |           | not null |         | plain    |              |
 lot_id     | uuid   |           | not null |         | plain    |              |
Indexes:
    "line_pkey" PRIMARY KEY, btree (id)
    "idx_line_lot" btree (lot_id)
    "idx_line_sort_index" btree (sort_index)
    "idx_line_type" btree (type)
Foreign-key constraints:
    "fk_line_lot" FOREIGN KEY (lot_id) REFERENCES public.lot(id)
Access method: heap
and a simplified query that I have problems with:
EXPLAIN ANALYZE
SELECT *
FROM public.lot lot
         JOIN public.line line ON lot.id = line.lot_id
WHERE lot.discriminant_id = $discriminant_id
ORDER BY line.sort_index DESC
LIMIT 10;
From what I understand of how the planner should behave, it should choose between:

- Plan 1: Scanning the index idx_line_sort_index to retrieve lines already sorted correctly, then filtering rows on the condition discriminant_id = $discriminant_id
- Plan 2: Scanning the index idx_lot_discriminant_id to filter on lots with discriminant_id = $discriminant_id, then sorting the matching lines

From what I know, the planner should choose between the two plans based on statistics:

- Plan 1 will be chosen for slightly selective $discriminant_id values, since it's better to sort first, then filter.
- Plan 2 will be chosen for highly selective $discriminant_id values, since it's better to filter first, then sort.

However, I always get the same plan. For $discriminant_id=2003, a non-selective filter matching ~30% of the total lot table:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..19.96 rows=10 width=923) (actual time=189.045..189.397 rows=10 loops=1)
   ->  Nested Loop  (cost=1.00..25037693.73 rows=13206387 width=923) (actual time=189.042..189.393 rows=10 loops=1)
         ->  Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2881081.84 rows=40287948 width=56) (actual time=0.139..8.513 rows=2500 loops=1)
         ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=859) (actual time=0.072..0.072 rows=0 loops=2500)
               Index Cond: (id = line.lot_id)
               Filter: (discriminant_id = 2003)
               Rows Removed by Filter: 1
 Planning Time: 8.149 ms
 Execution Time: 189.787 ms
(9 rows)
For $discriminant_id=2173, a very selective filter matching only a few rows in the lot table, I get the same plan and very bad performance:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.02..31878.16 rows=10 width=923) (actual time=178184.893..178185.317 rows=10 loops=1)
   ->  Gather Merge  (cost=1001.02..11879434.51 rows=3847 width=923) (actual time=178184.890..178185.312 rows=10 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop  (cost=1.00..11877990.44 rows=1603 width=923) (actual time=73048.846..73050.343 rows=7 loops=3)
               ->  Parallel Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2646068.81 rows=16786645 width=56) (actual time=0.121..26016.348 rows=13428898 loops=3)
               ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=859) (actual time=0.003..0.003 rows=0 loops=40286694)
                     Index Cond: (id = line.lot_id)
                     Filter: (discriminant_id = 2173)
                     Rows Removed by Filter: 1
 Planning Time: 0.904 ms
 Execution Time: 178185.392 ms
(12 rows)
At first, I thought it was a statistics problem, so I checked the pg_stats table:
SELECT *
FROM pg_stats
WHERE tablename = 'lot' AND attname = 'discriminant_id';
-[ RECORD 1 ]----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | pub
tablename              | lot
attname                | discriminant_id
inherited              | f
null_frac              | 0
avg_width              | 8
n_distinct             | 204
most_common_vals       | {2003,2349,2063,2199,2256,2002,2104,2215,2060,2042,2113,2004,2066,2101,2006,2049,2058,2156,2177,2007,2052,2043,2046,2064,2100,2050,2367,2165,2055,2164,2213,2216,2260,2200,2154,2190,2191,2158,2044,2001,2179,2201,2189,2105,2174,2187,2205,2161,2162,2309,2196,2108,2183,2312,2188,2194,2182,2197,2198,2008,2263,2181,2267,2048,2195,2062,2379,2180,2059,2112,2163,2171,2271,2106,2057,2206,2348,2381,2061,2120,2212,2005,2053,2259,2274,2115,2313,2121,2193,2251,2264,2099,2184,2262,2398,2419,2207,2409,2413,2422}
most_common_freqs      | {0.3278,0.1092,0.029466666,0.029466666,0.0294,0.0278,0.026333334,0.026333334,0.0224,0.020533333,0.0203,0.0194,0.018033333,0.0156,0.014733333,0.010433333,0.0093,0.0091,0.0091,0.008466667,0.0079666665,0.0073,0.0069333334,0.0068,0.0067,0.0064333333,0.0064333333,0.006366667,0.0062666666,0.005866667,0.0058,0.005433333,0.0047,0.0046666665,0.0046,0.0042333333,0.0042333333,0.0040666666,0.0039333333,0.0039,0.0037333334,0.0037,0.0032666668,0.0032,0.0031666667,0.0028,0.0028,0.0027,0.0026666666,0.0026333334,0.0024,0.0023333333,0.0022666666,0.0022666666,0.0021666666,0.0021,0.0020333333,0.0019666667,0.0019666667,0.0019,0.0019,0.0018,0.0018,0.0017,0.0017,0.0016,0.0016,0.0015333333,0.0014666667,0.0014333334,0.0014,0.0013666666,0.0013333333,0.0013,0.0012666667,0.0012666667,0.0012333334,0.0012,0.0010333334,0.00093333336,0.00093333336,0.0009,0.0009,0.00086666667,0.0008,0.00073333335,0.00073333335,0.0007,0.0007,0.0007,0.0007,0.0006,0.0006,0.00056666665,0.00053333334,0.0005,0.00046666668,0.00046666668,0.00046666668,0.00043333333}
histogram_bounds       | {2041,2047,2047,2054,2056,2056,2065,2065,2103,2103,2109,2109,2109,2114,2117,2119,2160,2160,2160,2168,2170,2175,2178,2178,2202,2202,2202,2203,2204,2211,2211,2250,2250,2252,2252,2253,2254,2254,2254,2255,2258,2258,2266,2266,2266,2268,2268,2269,2272,2273,2273,2359,2362,2362,2363,2366,2370,2370,2377,2378,2380,2383,2390,2390,2392,2397,2400,2402,2405,2426}
correlation            | 0.17503545
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
Everything looks fine to me, but I could not understand the plan chosen by the planner, so I tried to bump the stats sampling for the column discriminant_id:
ALTER TABLE public.lot ALTER COLUMN discriminant_id SET STATISTICS 500;
ANALYZE public.lot;
I got more sampling in the pg_stats table:
-[ RECORD 1 ]----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | pub
tablename              | lot
attname                | discriminant_id
inherited              | f
null_frac              | 0
avg_width              | 8
n_distinct             | 204
most_common_vals       | {2003,2349,2256,2002,2199,2063,2104,2215,2060,2042,2113,2004,2066,2101,2006,2049,2058,2177,2007,2156,2052,2046,2064,2043,2050,2055,2100,2213,2367,2260,2165,2191,2154,2200,2164,2216,2190,2158,2179,2044,2001,2105,2205,2201,2189,2174,2161,2162,2187,2188,2196,2267,2309,2183,2194,2312,2263,2198,2181,2182,2163,2197,2379,2108,2057,2195,2106,2381,2062,2048,2059,2112,2348,2271,2180,2274,2008,2120,2005,2171,2061,2212,2206,2259,2313,2053,2121,2115,2099,2054,2184,2193,2251,2419,2264,2258,2422,2262,2254,2160,2398,2397,2409,2103,2065,2250,2252,2413,2362,2366,2268,2056,2380,2202,2109,2170,2204,2370,2207,2047,2114,2266,2211,2178,2390,2116,2118,2377,2272,2363,2217,2175,2176,2255,2400,2408,2378,2394,2405,2373,2384,2402,2155,2392,2117,2385,2275}
most_common_freqs      | {0.32298,0.108333334,0.029793333,0.029746667,0.0289,0.02848,0.02712,0.02536,0.02328,0.020573333,0.02038,0.019706666,0.018566666,0.01624,0.015613333,0.011046667,0.00908,0.009073333,0.008833333,0.008486667,0.007833334,0.0072333333,0.00708,0.00702,0.006853333,0.006726667,0.0063866666,0.0062066666,0.0061533335,0.0057666665,0.00576,0.004833333,0.0047533335,0.0047533335,0.0046066665,0.004526667,0.0042933333,0.00394,0.0038466668,0.0038066667,0.00356,0.0034933332,0.00344,0.00336,0.0031733334,0.00304,0.00276,0.00276,0.00276,0.0026333334,0.0025066666,0.0024533332,0.0022866668,0.00224,0.0021333334,0.0020933333,0.0020333333,0.0019933332,0.0019066667,0.00188,0.0018666667,0.0018133334,0.00178,0.0017533334,0.0017266667,0.0016066667,0.00158,0.0015333333,0.0015266667,0.0014866666,0.0014866666,0.0014333334,0.0013333333,0.00132,0.00124,0.00124,0.0011133334,0.0010666667,0.0010533333,0.0010533333,0.00098,0.00098,0.0009733333,0.00088,0.00086,0.00082666666,0.00077333336,0.00076666666,0.0007533333,0.0006466667,0.00064,0.00060666667,0.00056666665,0.00052666664,0.00047333332,0.00046666668,0.00045333334,0.00044,0.00041333333,0.0004,0.0004,0.00038666668,0.00038666668,0.00038,0.00033333333,0.00031333332,0.00031333332,0.00031333332,0.0003,0.0003,0.00029333335,0.00028666668,0.00028,0.00027333334,0.00025333333,0.00025333333,0.00025333333,0.00024,0.00023333334,0.00022,0.00022,0.00022,0.00020666666,0.0002,0.0002,0.00018666667,0.00018666667,0.00018,0.00017333333,0.00017333333,0.00015333333,0.00013333333,0.00012666667,0.00012666667,0.00012,0.000113333335,0.00010666667,0.00010666667,0.00010666667,0.0001,0.0001,0.0001,9.3333336e-05,9.3333336e-05,8.666667e-05,8.666667e-05,8e-05}
histogram_bounds       | {2041,2051,2051,2119,2157,2167,2168,2168,2169,2192,2208,2253,2253,2269,2270,2273,2273,2310,2311,2314,2357,2358,2359,2359,2361,2361,2361,2364,2365,2369,2369,2371,2383,2395,2395,2399,2401,2401,2407,2415,2418,2426,2432,2432}
correlation            | 0.17252338
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
Still, the value 2173 is not in the most_common_vals array. At this point, I wouldn't be surprised to see the same execution plan. However, for $discriminant_id=2173, after the stats inflation, my query plan changed for the better:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=26989.83..26989.86 rows=10 width=923) (actual time=21.925..21.930 rows=10 loops=1)
   ->  Sort  (cost=26989.83..26992.10 rows=908 width=923) (actual time=21.920..21.922 rows=10 loops=1)
         Sort Key: line.sort_index DESC
         Sort Method: quicksort  Memory: 65kB
         ->  Nested Loop  (cost=1.00..26970.21 rows=908 width=923) (actual time=13.881..21.228 rows=20 loops=1)
               ->  Index Scan using idx_lot_discriminant_id on lot  (cost=0.43..718.59 rows=183 width=859) (actual time=9.429..9.688 rows=4 loops=1)
                     Index Cond: (discriminant_id = 2173)
               ->  Index Scan using idx_line_lot on line  (cost=0.56..143.10 rows=35 width=56) (actual time=2.159..2.865 rows=5 loops=4)
                     Index Cond: (lot_id = lot.id)
 Planning Time: 3.137 ms
 Execution Time: 22.082 ms
(11 rows)
For $discriminant_id=2003, after the stats inflation, the planner still chooses to go through idx_line_sort_index (which is indeed the right plan here):
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..20.24 rows=10 width=923) (actual time=20.670..20.711 rows=10 loops=1)
   ->  Nested Loop  (cost=1.00..25037693.73 rows=13012200 width=923) (actual time=20.665..20.704 rows=10 loops=1)
         ->  Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2881081.84 rows=40287948 width=56) (actual time=0.039..3.832 rows=2500 loops=1)
         ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=859) (actual time=0.006..0.006 rows=0 loops=2500)
               Index Cond: (id = line.lot_id)
               Filter: (discriminant_id = 2003)
               Rows Removed by Filter: 1
 Planning Time: 0.888 ms
 Execution Time: 20.807 ms
(9 rows)
I don't know why, but it seems that altering the stats sampling helps the planner choose the plans I was expecting, according to the selectivity of the discriminant_id column, even though pg_stats did not really change for the specific values 2003 and 2173. I ran some tests, and it seems my problems did not end here. For $discriminant_id=2191, matching ~0.5% of the lot table, before the stats inflation:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..1060.32 rows=10 width=921) (actual time=7011.736..42354.274 rows=10 loops=1)
   ->  Nested Loop  (cost=1.00..25037693.73 rows=236355 width=921) (actual time=7011.720..42354.255 rows=10 loops=1)
         ->  Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2881081.84 rows=40287948 width=56) (actual time=0.273..16201.866 rows=10036046 loops=1)
         ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=857) (actual time=0.002..0.002 rows=0 loops=10036046)
               Index Cond: (id = line.lot_id)
               Filter: (discriminant_id = 2191)
               Rows Removed by Filter: 1
 Planning Time: 12.324 ms
 Execution Time: 42354.448 ms
(9 rows)
After the stats inflation:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..1392.35 rows=10 width=924) (actual time=5870.174..40331.055 rows=10 loops=1)
   ->  Nested Loop  (cost=1.00..25037693.73 rows=179952 width=924) (actual time=5870.166..40331.044 rows=10 loops=1)
         ->  Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2881081.84 rows=40287948 width=56) (actual time=0.583..15596.227 rows=10036046 loops=1)
         ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=860) (actual time=0.002..0.002 rows=0 loops=10036046)
               Index Cond: (id = line.lot_id)
               Filter: (discriminant_id = 2191)
               Rows Removed by Filter: 1
 Planning Time: 3.152 ms
 Execution Time: 40331.218 ms
(9 rows)
The plans are slightly different, but the idea is the same: the query seems far too slow to me. Bumping the stats to the maximum (10000) does not solve this issue; the query is still slow. My guess is that the planner thinks it will be faster to scan sort_index and then filter, because it assumes the values to return will be very "contiguous" on the index, and it underestimates the cost of the filter on the discriminant.
SELECT line.sort_index FROM public.line line
JOIN public.lot lot ON line.lot_id = lot.id
WHERE lot.discriminant_id = 2191
ORDER BY line.sort_index DESC
LIMIT 10;
sort_index
------------
39224084
39224083
39224082
39224081
39223063
30288652

 Nested Loop  (cost=1.00..25037693.73 rows=179952 width=924) (actual time=3506.306..3512.158 rows=5 loops=1)
         ->  Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2881081.84 rows=40287948 width=56) (actual time=0.159..1212.901 rows=1012824 loops=1)
         ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=860) (actual time=0.002..0.002 rows=0 loops=1012824)
               Index Cond: (id = line.lot_id)
               Filter: (discriminant_id = 2191)
               Rows Removed by Filter: 1
 Planning Time: 1.565 ms
 Execution Time: 3512.199 ms
(9 rows)
In the case of limiting to 6 rows, the scan of idx_line_sort_index goes through 10M (x10) rows:
EXPLAIN ANALYSE
SELECT * FROM public.line line
JOIN public.lot lot ON line.lot_id = lot.id
WHERE lot.discriminant_id = 2191
ORDER BY line.sort_index DESC
LIMIT 6;
                                                                               QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..835.81 rows=6 width=924) (actual time=3539.010..38743.824 rows=6 loops=1)
   ->  Nested Loop  (cost=1.00..25037693.73 rows=179952 width=924) (actual time=3539.008..38743.818 rows=6 loops=1)
         ->  Index Scan Backward using idx_line_sort_index on line  (cost=0.56..2881081.84 rows=40287948 width=56) (actual time=0.107..14643.842 rows=10036002 loops=1)
         ->  Index Scan using lot_pkey on lot  (cost=0.43..0.55 rows=1 width=860) (actual time=0.002..0.002 rows=0 loops=10036002)
               Index Cond: (id = line.lot_id)
               Filter: (discriminant_id = 2191)
               Rows Removed by Filter: 1
 Planning Time: 1.077 ms
 Execution Time: 38744.039 ms
(9 rows)
**Questions:**

**Question 1:** It seems that boosting the stats cannot solve the issue for every discriminant value. But I don't understand why it did help the planner choose the correct query plan in some cases, for some discriminant values. Can someone explain this behaviour?

**Question 2:** Is there a way for me to make the planner not underestimate the discriminant filter? Is there a way to boost the usage of idx_lot_discriminant_id instead of idx_line_sort_index when it should be used?

**Question 3:** Is my model scalable? My intuition tells me I made a bad move by splitting the model into two tables, and that I should denormalize the column discriminant_id into the line table to create an index (discriminant_id, sort_index). But I have more possible filters on the lot table, so that means denormalizing almost everything that is indexed in the line table...

The other thoughts I had:

- Using a CTE (or a subquery) to filter on the lot first, but since the filters are controlled by the user API, it's not efficient when the user is filtering by nothing. There are also possible filters on the line table, which makes the pagination hard to do with the CTE.
- Using a materialized view: since I want real time, I would use too much resource on refreshing the view.
- Switching my data store to something more appropriate to my usage (Cassandra? Mongo?). I'm a bit surprised Postgres cannot handle this use case, and I have the feeling that I am missing something important here.

**What are your thoughts?**
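A minimal sketch of the denormalization mentioned under Question 3 (copying discriminant_id onto line and indexing it together with sort_index); the index name is invented and the backfill is only illustrative:

```sql
ALTER TABLE public.line ADD COLUMN discriminant_id bigint;

-- Backfill from the parent lot rows.
UPDATE public.line l
SET    discriminant_id = lot.discriminant_id
FROM   public.lot lot
WHERE  lot.id = l.lot_id;

-- Composite index matching the WHERE + ORDER BY of the slow query.
CREATE INDEX idx_line_discriminant_sort
    ON public.line (discriminant_id, sort_index DESC);
```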
K.Banks (21 rep)
Mar 2, 2023, 02:55 PM • Last activity: Mar 2, 2023, 03:08 PM
-1 votes
1 answer
149 views
Real Time Analytics: Which database?
We are currently using MongoDB, and it is performing well for our needs. However, we are looking to support better real-time analytics and aggregations, which MongoDB doesn't handle effectively. Therefore, we are exploring other possibilities.

The biggest problem is that my data is very flexible. We store:

- contacts
- events

Every contact has different attributes. For example, besides standard attributes such as first_name, last_name, email, phone, gender, etc., there could be additional/custom attributes that users can create. Also, we have events such as order, add to cart, coupon applied, etc. Every user can push different events - like in Google Analytics, for example.

We want to support the following queries:

- Select all users from London
- Select all users from London who created between 3 and 5 orders whose minimum value is $100, in the last 90 days.
- Select all users from Germany who purchased T-Shirt in the last X days

and so on... Currently, we don't have such aggregations (count, average, etc.).

As you can see, our data is quite unstructured. We are exploring several options to improve this, including MongoDB + Spark, ClickHouse, MariaDB ColumnStore, Apache Druid, Apache Pinot, and others. Since different attributes have different input types, we could create two tables:

**contacts**

- workspace_id
- attribute_string_1
- attribute_string_2
- attribute_string_3
- ...
- attribute_integer_1
- attribute_integer_2
- attribute_integer_3
- ...
- created_at
- updated_at

and have a few hundred such columns. The same goes for **events**:

- workspace_id
- event
- event_attribute_string_1
- event_attribute_string_2
- event_attribute_string_3
- ...
- event_attribute_integer_1
- event_attribute_integer_2
- event_attribute_integer_3
- ...

etc. This means every column can be used in the WHERE clause for filtering... plus we need to join them (or use dictionaries in ClickHouse, for example). Currently, we have a nested array in MongoDB for easier handling.

Questions:

1. What do you think might be the best choice for such a problem?
2. How should indexes be organized to support such dynamic data?
3. What about time series databases?

Thanks
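For concreteness, one possible reading of the second example query above, written as a ClickHouse-flavored SQL sketch against hypothetical normalized contacts/events tables (all table and column names are assumptions):

```sql
SELECT c.contact_id
FROM contacts AS c
JOIN events AS e ON e.contact_id = c.contact_id
WHERE c.city = 'London'
  AND e.event = 'order'
  AND e.value >= 100                          -- every counted order is at least $100
  AND e.created_at >= now() - INTERVAL 90 DAY
GROUP BY c.contact_id
HAVING count() BETWEEN 3 AND 5;               -- between 3 and 5 such orders
```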
Nedim (99 rep)
Feb 19, 2023, 12:12 PM • Last activity: Feb 19, 2023, 02:03 PM
0 votes
1 answers
682 views
Galera cluster performance, node count, and where to write?
As far I understand Galera does not scale **write** performance, best case it does not degrade it, compared to one single server, so we use Galera for **HA**. (read performance may benefit and it can be load balanced across every nodes) In Galera the **write** performance will be the performance of...
As far as I understand, Galera does not scale **write** performance; at best it does not degrade it compared to a single server, so we use Galera for **HA**. (Read performance may benefit, and reads can be load balanced across all nodes.) In Galera, **write** performance will be the performance of the weakest node.

In light of that, who needs more HA than 3-4 nodes can provide? If one or two nodes fail, **write** performance will not degrade -- it may even increase... so going beyond 4 nodes seems to over-provision HA. (I am aware of the WAN use case with multiple datacenters, but in that case we would have two or more Galera clusters.)

It is not clear to me whether, for a given write transaction, there is any significant load difference on the node the client actually writes to, compared to the load on all the other nodes, which also have to complete the very same transaction synchronously.

I am trying to decide whether I should have a dedicated write node (which is a single point of failure, so I would need some infrastructure to react to its loss) and use the others only for reads, or try to load balance the writes across all nodes as well. How many nodes are optimal? Does it depend on my application's read/write load ratio?
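A minimal way to observe the "weakest node" effect on each node is through Galera's standard flow-control status counters, for example:

```
-- Standard Galera status variables; run on each node.
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';  -- fraction of time writes were paused by flow control
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue_avg'; -- average backlog of write-sets waiting to apply here
SHOW GLOBAL STATUS LIKE 'wsrep_local_send_queue_avg'; -- average backlog waiting to be sent from here
```

If the receive queue grows on one node while the others pause, that node is the one throttling cluster-wide writes.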
g.pickardou (187 rep)
Jan 21, 2023, 07:46 PM • Last activity: Jan 27, 2023, 07:08 AM
2 votes
3 answers
3808 views
Is it OK to create hundreds of databases in SQL Azure versus one big one and run the risk of deadlocking
I need to create highly scalable solution - where field devices in thousands of sites are delivering data in real time to a back end system, and SQL Azure seems to fit the bill nicely in terms of adding sql databases and application servers. Each field device is effectively sending 400 sensor values...
I need to create a highly scalable solution where field devices at thousands of sites deliver data in real time to a back-end system, and SQL Azure seems to fit the bill nicely in terms of adding SQL databases and application servers. Each field device effectively sends 400 sensor values every second for about two hours a day, and the same 400 sensor values every 5 minutes the rest of the time, indefinitely. Additionally, when an error occurs on a field device, it also sends up the last minute's data for all 400 sensors (400 * 60 readings), causing a flood of data whenever anything goes wrong.

I really want to design the system so that the independent field devices, and the data they store, cannot affect other devices -- so that no field device can degrade the performance of another.

I started the design thinking of a single database holding all the devices' data, but started getting deadlocks when simulating multiple site devices. Hence, I am in the process of moving to a multiple-database solution, where a master database holds a lookup table for all the devices and returns a connection string to the real database (a sketch of that lookup follows below).

At this stage of the project, it is most important that I can pass the data back to user interfaces running in web browsers in real time, updating their screens every second. In future stages it will be necessary to start aggregating data across multiple devices, showing statistics such as the sum of sensor X in region Y. I can see this will be hard to do with the multiple-database approach.

So I would value any advice, e.g.:

- Do you think it is sensible to use SQL Azure to host potentially thousands of databases and to use this master database to indirectly point to the real ones?
- Will I have a problem with connections to the databases from the applications -- for example, issues with connection pooling?
- How will I be able to aggregate data from all these different databases in SQL Azure?

Would be interested in all your comments. Regards, Chris.
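The master-database lookup described above could look roughly like this; the table, column and example names are only illustrative (Azure SQL's elastic database tools later added a managed shard map that serves the same purpose):

```
-- Master database: one row per device, pointing at the database that holds its data.
CREATE TABLE dbo.DeviceShardMap (
    DeviceId     INT            NOT NULL PRIMARY KEY,
    ServerName   NVARCHAR(128)  NOT NULL,  -- e.g. 'myserver.database.windows.net' (placeholder)
    DatabaseName SYSNAME        NOT NULL   -- e.g. 'DeviceData_0042' (placeholder)
);

-- Application-side lookup before opening the per-device connection:
SELECT ServerName, DatabaseName
FROM   dbo.DeviceShardMap
WHERE  DeviceId = @DeviceId;
```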
ChrisI (31 rep)
May 4, 2012, 06:48 AM • Last activity: Oct 26, 2022, 08:02 AM
0 votes
0 answers
227 views
Is Postgres appropriate for this application? (1 billion inserts a year)
I am currently working on drug record management system which needs to record and retain insurance billing information for patients. So far, all the typical information (patient profile, prescriber profile, drug information etc.) can easily be handled by a single postgres server as they don't really...
I am currently working on a drug record management system which needs to record and retain insurance billing information for patients. So far, all the typical information (patient profile, prescriber profile, drug information, etc.) can easily be handled by a single Postgres server, as it doesn't really scale past tens of millions of rows, but I am worried about the billing information, which gets into the billions.

Some back-of-the-napkin calculation: each location (~2000 of them) performs about 300 billing transactions a day, which translates to about 1000 inserts. So that is about 2000 * 1000 * 365 ≈ 730 million inserts a year. There will also be a migration of historic billing data going back about a decade, so that is another ~7 billion records already there. Since this is medical information, **we're not allowed to delete any of it**, including all the new insertions going forward.

Each billing row has a patient id, a billing number, a date, and other insurance-related information. The table only needs to support three types of lookups:

1. Finding all the billing rows for a particular patient
2. Finding all the billing rows for a particular billing number
3. Finding all the rows within a certain time period

Is Postgres appropriate for this type of workload?
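A minimal sketch of how such a table could be laid out in Postgres with declarative range partitioning; all column names here are assumptions, not the real schema:

```
-- Yearly range partitions keep indexes and maintenance manageable at billions of rows.
CREATE TABLE billing (
    billing_id     bigserial,
    patient_id     bigint      NOT NULL,
    billing_number text        NOT NULL,
    billed_at      timestamptz NOT NULL,
    details        jsonb,
    PRIMARY KEY (billing_id, billed_at)   -- the partition key must be part of the PK
) PARTITION BY RANGE (billed_at);

CREATE TABLE billing_2022 PARTITION OF billing
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

-- Lookups 1 and 2; lookup 3 is served by partition pruning on billed_at.
CREATE INDEX ON billing (patient_id, billed_at);
CREATE INDEX ON billing (billing_number);
```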
d124y (1 rep)
Aug 27, 2022, 12:18 AM
8 votes
3 answers
4962 views
Database design to handle millions of rows in MySQL
We are running an application that is collecting data much faster than we anticipated. Trying to addapt to that, we are doing a redesig of the database. After reading [this][1], [this][2] and [this][3], I am not sure what the best approach for our design is... considering our HW is very humble. Ther...
We are running an application that is collecting data much faster than we anticipated. Trying to adapt to that, we are doing a redesign of the database. After reading this, this and this, I am not sure what the best approach for our design is... considering our HW is very humble. These are the main tables causing problems:

- SCANS
- DOMAINS
- DOCUMENTS
- VALUES

Currently we have one single table to store the data. The relation between them is:

- 1 **SCAN** -> (avg 4x) **DOMAINS** -> (avg 3000) MANY **DOCUMENTS** -> (avg 51000) MANY **VALUES**
- 1 SCAN points to an average of 4 entries in DOMAINS.
- 4 entries in DOMAINS point to an average of 12,000 entries in DOCUMENTS.
- 12,000 entries in DOCUMENTS point to an average of 204,000 entries in VALUES.

We are currently performing around 100 scans/day. That inserts around 20,400,000 items per day into VALUES. We are considering splitting the VALUES table into per-period tables:

- **VALUES_year_month**, with the intention of distributing the load between them. But if we multiply the number of scanners, this mechanism is not scalable.
- **VALUES_year_month_day**, but then we end up with a huge number of tables in the same DB.

In both cases, if we increase the number of scans per day, neither solution seems scalable. At this point, keeping all the data in a centralized DB does not seem the best option for scalability reasons... but at the same time, a distributed system would increase the load time significantly. What would be a reasonable approach? I am sure we are not the first team to run into this issue! :P

**EDIT**

*How much data do we read per query?* That depends on the SCAN. Not all scans have the same amount of data. The range varies between:

- 1 SCAN --> 200 VALUES
- 1 SCAN --> 200,000 VALUES

The information is presented on a front end to the end user. So we have split how the queries are requested from the backend to avoid overloading the server, but in some cases it is not enough due to the high number of VALUES.

*When is the data read?* That entirely depends on the end users. Some days they read tens of SCANS, other days none, other days hundreds.

**EDIT II**

EXPLAIN ANALYZE results from two queries. The first one is quick and the second one is slow.

```
EXPLAIN ANALYZE
SELECT value, url, filetype, severity, COUNT(id_value) AS data_count
FROM VALUES
WHERE (weigth = 150 OR weigth = 100)
  AND id_analysis = 23
  AND is_hidden = 0
  AND is_hidden_by_user = 0
GROUP BY value
ORDER BY data_count DESC
```

**Result 1:**

```
-> Sort row IDs: data_count DESC (actual time=34.016..34.016 rows=0 loops=1)
    -> Table scan on  (actual time=34.006..34.006 rows=0 loops=1)
        -> Aggregate using temporary table (actual time=34.005..34.005 rows=0 loops=1)
            -> Filter: ((VALUES.is_hidden_by_user = 0) and (VALUES.is_hidden = 0) and ((VALUES.weigth = 150) or (VALUES.weigth = 100))) (cost=1.00 rows=0.05) (actual time=0.024..0.024 rows=0 loops=1)
                -> Index lookup on VALUES using id_analysis (id_analysis=23) (cost=1.00 rows=1) (actual time=0.024..0.024 rows=0 loops=1)
```

**Result 2:**

```
-> Sort row IDs: data_count DESC (actual time=187172.159..187172.173 rows=136 loops=1)
    -> Table scan on  (actual time=187172.079..187172.111 rows=136 loops=1)
        -> Aggregate using temporary table (actual time=187172.077..187172.077 rows=136 loops=1)
            -> Filter: ((VALUES.is_hidden_by_user = 0) and (VALUES.is_hidden = 0) and ((VALUES.weigth = 150) or (VALUES.weigth = 100))) (cost=264956.35 rows=695) (actual time=249.030..186775.012 rows=52289 loops=1)
                -> Index lookup on VALUES using id_analysis (id_analysis=8950) (cost=264956.35 rows=265154) (actual time=248.979..186696.529 rows=134236 loops=1)
```

**EDIT III**

> Consider PARTITIONing

This is a great suggestion, kudos! From what I have read now, that is the native equivalent of splitting tables the way we were considering (a sketch of what that could look like is at the end of this post).

> (weigth = 150 OR weigth = 100) is a rather strange test.

Removing the OR clause improves the timing:

```
-> Sort row IDs: data_count DESC (actual time=101261.260..101261.271 rows=113 loops=1)
    -> Table scan on  (actual time=101261.187..101261.216 rows=113 loops=1)
        -> Aggregate using temporary table (actual time=101261.185..101261.185 rows=113 loops=1)
            -> Filter: ((VALUES.is_hidden_by_user = 0) and (VALUES.is_hidden = 0) and (VALUES.id_analysis = 8950) and (VALUES.weigth = 150)) (cost=79965.29 rows=623) (actual time=83848.835..100942.179 rows=52259 loops=1)
                -> Intersect rows sorted by row ID (cost=79965.29 rows=62292) (actual time=83848.830..100908.758 rows=52259 loops=1)
                    -> Index range scan on VALUES using id_analysis over (id_analysis = 8950) (cost=291.66 rows=265154) (actual time=0.100..443.145 rows=134236 loops=1)
                    -> Index range scan on VALUES using weigth over (weigth = 150) (cost=13492.63 rows=12380386) (actual time=0.043..83511.686 rows=7822871 loops=1)
```

> Please elaborate on value versus id_value

I believe it might just be bad naming. The table definition is:

```
+-------------------+-------------+------+-----+---------+----------------+
| Field             | Type        | Null | Key | Default | Extra          |
+-------------------+-------------+------+-----+---------+----------------+
| id_value          | int         | NO   | PRI | NULL    | auto_increment |
| id_document       | int         | NO   | MUL | NULL    |                |
| id_tag            | int         | YES  | MUL | NULL    |                |
| value             | mediumtext  | YES  |     | NULL    |                |
| weigth            | int         | YES  | MUL | NULL    |                |
| id_analysis       | int         | YES  | MUL | NULL    |                |
| url               | text        | YES  |     | NULL    |                |
| domain            | varchar(64) | YES  |     | NULL    |                |
| filetype          | varchar(16) | YES  |     | NULL    |                |
| severity_name     | varchar(16) | YES  |     | NULL    |                |
| id_domain         | int         | YES  | MUL | NULL    |                |
| id_city           | int         | YES  | MUL | NULL    |                |
| city_name         | varchar(32) | YES  |     | NULL    |                |
| is_hidden         | tinyint     | NO   |     | 0       |                |
| id_company        | int         | YES  |     | NULL    |                |
| is_hidden_by_user | tinyint(1)  | NO   |     | 0       |                |
+-------------------+-------------+------+-----+---------+----------------+
```
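To make the PARTITIONing idea from EDIT III concrete, the native equivalent of the VALUES_year_month split could look roughly like this. It assumes a date/datetime column to partition on, which the table above does not currently have, and MySQL would also require that column to be added to the primary key:

```
-- Sketch only; created_at is an assumed column, and the PRIMARY KEY would have
-- to become (id_value, created_at) because every unique key must include the
-- partitioning column.
ALTER TABLE `VALUES`
  PARTITION BY RANGE COLUMNS (created_at) (
    PARTITION p2022_08 VALUES LESS THAN ('2022-09-01'),
    PARTITION p2022_09 VALUES LESS THAN ('2022-10-01'),
    PARTITION pmax     VALUES LESS THAN (MAXVALUE)
  );
```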
Javi M (61 rep)
Aug 23, 2022, 07:08 PM • Last activity: Aug 24, 2022, 07:36 PM
0 votes
0 answers
18 views
Keeping up-to-date values aggregated based on some time period
I'm considering a few solutions for a problem involving some sort of a time series data. Imagine: you have a system that keeps track of financial transactions in multiple currencies. To keep it simple, let's say it just keeps track of money flowing through the system; it doesn't care about which way...
I'm considering a few solutions for a problem involving time series data. Imagine a system that keeps track of financial transactions in multiple currencies. To keep it simple, let's say it just keeps track of money flowing through the system; it doesn't care which way it goes (no debit, credit, etc.). What we want to do is find the total amount of money that flowed through the system in some recent time period (e.g. the last hour, 30 minutes, 20 seconds, etc.).

The system processes a sizeable number of transactions, ranging from tens of thousands to millions per minute, and the solution needs to scale. The use case is really more about sampling, so the value isn't exactly time sensitive; for example, the sum can be for a 5-minute window starting 6 minutes ago (i.e. it's stale by about a minute), although we'd love for it to be as close to real time as possible. Transactions are stored in a relational database. I don't have a lot of experience with systems at this scale, but I'm considering the options below.

### (Distributed) Caching

My first thought is to use a cache (e.g. Redis) and set the TTL on the data based on the desired window. The cache can be updated asynchronously by a regularly running job that queries the database, so that clients never have to wait when the cached value goes stale (like I said, we can tolerate a bit of staleness). A sketch of such a refresh query is at the end of this question.

### Views? Triggers?

I don't have a lot of experience with database views and triggers, so I'm not actually sure whether this is a feasible option. My understanding is that select queries can be turned into views, and views can be indexed. However, I don't know whether it's possible to set this up so that it maintains a sliding window based on the time period. Is this even feasible?

### Time Series Database

Another approach I'm considering is a time series database into which transactions are asynchronously copied. Transactions would be written to this TSDB mainly to fulfil this use case. I think one of the main downsides is that the transactions written to this DB are only eventually consistent.

As mentioned, I don't have a lot of hands-on experience running systems at this scale, so I would appreciate any insights into the pros and cons of these options, or any alternative suggestions. If you know of any product-specific features that would work well for use cases like this, I'd also love to hear about them. Thanks!
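For the caching option, the refresh job's query could be as simple as the following; the table and column names are assumptions, and the interval would match the desired window:

```
-- Hypothetical schema: transactions(currency, amount, created_at).
-- The job runs every few seconds and writes the result to the cache with a short TTL.
SELECT currency, SUM(amount) AS total_flow
FROM   transactions
WHERE  created_at >= now() - interval '5 minutes'
GROUP  BY currency;
```

An index on (created_at), or on (currency, created_at), keeps this scan bounded to the window rather than the whole table.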
Psycho Punch (101 rep)
Jul 27, 2022, 08:47 PM
1 votes
0 answers
656 views
Multiple databases vs partitioning in PostgreSQL
Consider following database schema for *PostgreSQL* (I'm using 13.x.x) [![my current database structure][1]][1] **Users table** - just place to store users and do authentication **Projects** - projects which are created by specific user **Driver/Event/Vehicle types** - "handbook" tables defined in s...
Consider the following database schema for *PostgreSQL* (I'm using 13.x.x) *(image: my current database structure)*:

- **Users table** - just a place to store users and do authentication
- **Projects** - projects created by a specific user
- **Driver/Event/Vehicle types** - "handbook" tables defined per project (e.g. project #1 could have only 10 vehicle types, but project #2 could have 1000)
- **Really big table** - 30 to 70 million rows **per specific project**
- **Event reactions** - an abstraction around the "really big table" (the number of rows can be just as big)

I want to partition driver/event/vehicle/really_big_data/event_reactions by the *project_id* field, which would give me something like this:

```
driver_types_some_uuid1
driver_types_some_uuid2
...
event_types_some_uuid1
event_types_some_uuid2
...
really_big_data_some_uuid1
really_big_data_some_uuid2
...
and etc.
```

The reason I want to make partitions is faster per-project search in my API, an easier way to locate data visually, and the fact that users want to be able to "back up" their projects as some kind of dump. But I have a feeling this is a bad approach, since I'll face foreign keys between partitioned tables, which as far as I know is not easy. So I came up with the next idea:

- Create database "A" for users and their project lists *(image: database to store users and projects)*
- Create databases named *database_project_id*, one per project *(images: database per project (1) and (2))*

As I see it, I could then easily select data per project, do dumps, and visually locate data without a headache. But this leads me to creating some abstractions in my API code: open one DB connection for users/projects and manage N DB connections, one per "project" that is requested (if a project has not been requested for a certain amount of time, close its connection).

**So the question is**: Is my idea of creating databases instead of partitions any good? Are there better options for a case like this? (A sketch of the partitioning variant is at the end of this question.)

**UPD**: I do not need any relations between different users/projects.
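For reference, the declarative-partitioning variant of the first idea could look roughly like this; the column names and types are assumptions, not the real schema. Note also that since PostgreSQL 12, foreign keys may reference partitioned tables:

```
-- Hypothetical columns; the real layout is in the diagram above.
CREATE TABLE really_big_data (
    project_id uuid   NOT NULL,
    id         bigint NOT NULL,         -- however ids are generated
    payload    jsonb,
    PRIMARY KEY (project_id, id)        -- the partition key must be part of the PK
) PARTITION BY LIST (project_id);

-- One partition per project, created when the project itself is created:
CREATE TABLE really_big_data_some_uuid1
    PARTITION OF really_big_data
    FOR VALUES IN ('00000000-0000-0000-0000-000000000001');  -- placeholder UUID
```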
DocC (111 rep)
Jul 23, 2022, 09:03 AM