Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
1
votes
1
answers
170
views
Keys of a relation
I have a relation:
> R = { (ABCDEF) , (A B F -> C, B C -> F, F -> A, F -> B, D E -> E, E -> D)}
The keys of this relation are:
> {BCE}, {EF}
But I don't get it... why? Why not only {EF}? It has fewer attributes than {BCE}.
Another example, relation:
> R2 = {(ABCDEF), (A -> B C , B -> D , E -> F)}
and the key is:
> {AE}
why only one? Why not:
> {AE}, {ABE}
ABE also determines all the other attributes, so why is it not a key?
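A key is a *minimal* superkey: ABE determines everything, but its proper subset AE already does too, so ABE is only a superkey, not a candidate key. This can be checked mechanically with attribute closures; below is a small sketch (the `closure` helper is ours, not from any particular textbook), run against R2 from the question:

```python
# Sketch: attribute-closure computation for checking superkeys and keys.
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# R2 = {(ABCDEF), (A -> BC, B -> D, E -> F)}
fds = [({'A'}, {'B', 'C'}), ({'B'}, {'D'}), ({'E'}, {'F'})]

print(sorted(closure({'A', 'E'}, fds)))       # ['A', 'B', 'C', 'D', 'E', 'F']
print(sorted(closure({'A', 'B', 'E'}, fds)))  # same closure, but ABE is not minimal
print(sorted(closure({'A'}, fds)))            # ['A', 'B', 'C', 'D'] - A alone is no key
```

Since AE is already a key, any proper superset of it (ABE, ACE, ...) is a superkey but never a candidate key.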
Ariel Grabijas
(125 rep)
Jan 19, 2014, 04:29 PM
• Last activity: Jul 14, 2025, 07:02 PM
3
votes
1
answers
157
views
Table Structure: 2 Related Entities Share the Same Entity
I was hoping I could get some advice on my table structure. I'm writing a simple workout tracker for me and my buddies. These are the requirements:
- Multiple users
- A user can have multiple workouts
- A workout has multiple sets
- A set can have multiple users
I have a way to accomplish what I need, but I feel like there's a much simpler way that I'm missing.
tbl_user
| UserId | Name |
-----------------
| 1      | John |
| 2      | Greg |
tbl_workout
| WorkoutId |
-------------
| 1         |
| 2         |
tbl_user_workout
| WorkoutId | UserId |
----------------------
| 1         | 1      |
| 2         | 1      |
tbl_set
| SetId | WorkoutId |
---------------------
| 1     | 1         |
| 2     | 1         |
tbl_user_set
| UserSetId | SetId | UserId | Reps | Weight |
----------------------------------------------
| 1         | 1     | 1      | 20   | 50     |
| 2         | 1     | 2      | 15   | 60     |
I figure this way I could get:
- A user's sets with user -> userset
- A user's workouts with user -> userworkout
- A workout's sets with workout -> set
Is this structure logical?
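As a sanity check, the proposed structure can be stood up in SQLite (column types are assumed; the question names no DBMS), and the "user's sets" lookup works as described:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_user (UserId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE tbl_workout (WorkoutId INTEGER PRIMARY KEY);
CREATE TABLE tbl_user_workout (WorkoutId INTEGER, UserId INTEGER);
CREATE TABLE tbl_set (SetId INTEGER PRIMARY KEY, WorkoutId INTEGER);
CREATE TABLE tbl_user_set (UserSetId INTEGER PRIMARY KEY, SetId INTEGER,
                           UserId INTEGER, Reps INTEGER, Weight INTEGER);
INSERT INTO tbl_user VALUES (1, 'John'), (2, 'Greg');
INSERT INTO tbl_workout VALUES (1), (2);
INSERT INTO tbl_user_workout VALUES (1, 1), (2, 1);
INSERT INTO tbl_set VALUES (1, 1), (2, 1);
INSERT INTO tbl_user_set VALUES (1, 1, 1, 20, 50), (2, 1, 2, 15, 60);
""")

# "A user's sets": user -> tbl_user_set -> tbl_set
rows = conn.execute("""
    SELECT s.SetId, us.Reps, us.Weight
    FROM tbl_user_set AS us
    JOIN tbl_set AS s ON s.SetId = us.SetId
    WHERE us.UserId = 1
""").fetchall()
print(rows)  # [(1, 20, 50)]
```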
Jonathan Eckman
(131 rep)
Aug 2, 2013, 03:55 AM
• Last activity: Jul 10, 2025, 12:01 PM
1
votes
1
answers
164
views
Transaction Table in Personal Finance App
I'm a noob designing a Personal Finance App to teach myself Rails. This is just for learning purposes, but I actually intend to use it to replace my current system in Google Spreadsheets. I want to apply best practice.
I created a transactions table with the following columns:
- user_id: integer
- type: string (income or expense)
- date: date
- account_group_id: integer
- account_id: integer
- amount: integer (my currency does not need decimals)
- comment: string

Now, the way this is designed, all users' transactions would be recorded in the same table. Is this a good idea? Or should I create a different transactions table for each user? Does it matter if I do it either way?
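One table shared by all users, with an index on user_id, is the conventional shape; a SQLite sketch (types assumed, since the question describes a Rails migration rather than concrete DDL) shows per-user lookups stay simple:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (
    id               INTEGER PRIMARY KEY,
    user_id          INTEGER NOT NULL,
    type             TEXT CHECK (type IN ('income', 'expense')),
    date             TEXT,
    account_group_id INTEGER,
    account_id       INTEGER,
    amount           INTEGER,   -- whole currency units, no decimals
    comment          TEXT);
-- the index keeps per-user queries fast even as the shared table grows
CREATE INDEX idx_transactions_user ON transactions (user_id);
INSERT INTO transactions VALUES
    (1, 42, 'income',  '2015-10-01', 1, 1, 500, 'salary'),
    (2, 43, 'expense', '2015-10-02', 1, 2, 120, 'groceries');
""")
rows = conn.execute(
    "SELECT amount FROM transactions WHERE user_id = 42").fetchall()
print(rows)  # [(500,)]
```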
David
(11 rep)
Oct 20, 2015, 08:02 PM
• Last activity: Jul 5, 2025, 09:03 PM
2
votes
2
answers
246
views
Is sql_variant compatible with First Normal Form (1NF)?
1NF rule says that we should not keep data of different types in one single column. Does it mean that sql_variant is not compatible with first normal form hence should not be used?
igelr
(2162 rep)
Jul 27, 2018, 03:20 PM
• Last activity: Apr 19, 2025, 09:17 PM
2
votes
1
answers
1366
views
Finding all possible minimal covers
I have a relation schema R = {A, B, C} and the following functional dependencies:
- A → B
- A → C
- B → A
- B → C
- C → A
- C → B
How many different minimal covers can I derive from this relation schema? I have found the following, but I am not really sure if those are all:
- A → C
- B → C
- C → A
- C → B
Also, I am not sure if there is some rule on how to know if one has found all possible minimal covers.
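Two FD sets are equivalent (each is a cover of the other) exactly when every attribute set has the same closure under both; since every left-hand side here is a single attribute, comparing single-attribute closures suffices. A sketch (helper names are ours) that checks the candidate set from the question:

```python
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def equivalent(f, g, attrs="ABC"):
    """Cover check for FD sets whose left-hand sides are all singletons."""
    return all(closure({a}, f) == closure({a}, g) for a in attrs)

original = [({'A'}, {'B'}), ({'A'}, {'C'}), ({'B'}, {'A'}),
            ({'B'}, {'C'}), ({'C'}, {'A'}), ({'C'}, {'B'})]
candidate = [({'A'}, {'C'}), ({'B'}, {'C'}), ({'C'}, {'A'}), ({'C'}, {'B'})]

print(equivalent(original, candidate))                         # True: it is a cover
print(equivalent(original, [({'A'}, {'C'}), ({'C'}, {'A'})]))  # False: B is stranded
```

This only verifies that a proposed set is a cover; minimality (no redundant FD, no redundant LHS attribute) still has to be checked separately.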
Thanks a lot for any help!
BlackPearl
(39 rep)
Oct 10, 2019, 06:54 AM
• Last activity: Apr 4, 2025, 04:00 PM
-1
votes
1
answers
2335
views
Data Schema For Stock Control / Multi Source Inventory
I'm working on a project that involves stock control with multiple stock sources and sales channels. The overall hierarchy I've got so far looks like this;
Sales Channels

> The system must have one or more sales channels; each sales channel
> must have 1 or more stock sources; a stock source must have 1 or more
> stock locations (warehouses / buildings / distribution centres); a
> stock location may have 1 or more Bin/Shelf locations.
>
> A product may have 1 or more stock locations, may have one or more sub
> locations in those stock locations, and may have one or more Shelf / Bin
> locations.
First off, is this a solved problem where some reference schema exists I could utilise and save myself some headaches?
If there isn't a reference design for this situation, am I best to build a one-to-many relationship to assign stock to sub-location(s), and another one-to-many relationship for shelf locations (where they exist)?
This is the ERD I've come up with so far;

Matthew
(99 rep)
Oct 29, 2020, 12:32 AM
• Last activity: Mar 17, 2025, 10:09 AM
0
votes
3
answers
596
views
Database: Splitting tables based on logically grouped optional columns
I keep having this argument with my management about splitting tables. I prefer splitting a database table into multiple tables based on attributes that can be grouped together: the columns which are often not filled in and are almost always Null (optional).
Here is an example:
appointment
- id (PK)
- time
- category
- status
- employee_id
- visitor_id
- optional_registration_time
- optional_registration_message
- optional_registration_feedback
- optional_registration_image
All the columns named optional_registration_* are optional fields. They might or might not be filled in for every row.
And since they are optional and can logically be grouped together, my approach was to split them off to a different table, like this:

appointment_registration:
- appointment_id (FK)
- time
- message
- feedback
- image
Although this is just an example, it pretty much explains my dilemma. I prefer splitting it off, but my management insists that because these columns belong together, they should be in one table. I disagree: just because they belong together doesn't mean they can't live in a separate table.
By splitting them off, the new table appointment_registration will only have entries where appropriate, and the table stays lean.
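For what it's worth, the split design can be sketched in SQLite (column types are assumed), with a LEFT JOIN reassembling the wide row on demand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE appointment (
    id INTEGER PRIMARY KEY, time TEXT, category TEXT, status TEXT,
    employee_id INTEGER, visitor_id INTEGER);
CREATE TABLE appointment_registration (
    appointment_id INTEGER PRIMARY KEY REFERENCES appointment (id),
    time TEXT, message TEXT, feedback TEXT, image BLOB);
INSERT INTO appointment VALUES (1, '09:00', 'intake', 'open', 7, 3);
INSERT INTO appointment VALUES (2, '10:00', 'intake', 'open', 7, 4);
INSERT INTO appointment_registration VALUES (1, '09:05', 'hi', NULL, NULL);
""")
# A LEFT JOIN reassembles the "wide" row; appointments without a
# registration simply come back with NULLs on the right-hand side.
rows = conn.execute("""
    SELECT a.id, r.message
    FROM appointment AS a
    LEFT JOIN appointment_registration AS r ON r.appointment_id = a.id
    ORDER BY a.id
""").fetchall()
print(rows)  # [(1, 'hi'), (2, None)]
```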
Which is the better approach?
user2354302
(101 rep)
Oct 29, 2021, 04:03 PM
• Last activity: Feb 19, 2025, 07:05 AM
0
votes
1
answers
112
views
Is that table in BCNF?
In the book I am currently reading, this table is mentioned as an example of what we can do if a table fulfils BCNF but still has redundancies. It then mentions that a solution is arrays. But is that table even in BCNF? Aren't there functional dependencies Language -> PersNr and Programming -> PersNr? Neither attribute is a superkey, since they do not identify the rows uniquely. So they break BCNF in my view. Or am I misunderstanding something?

Marlon Brando
(141 rep)
Jan 22, 2025, 01:48 PM
• Last activity: Jan 22, 2025, 02:08 PM
2
votes
1
answers
463
views
Does R ⋈ (S ∪ T) = (R ⋈ S) ∪ (R ⋈ T) hold for bag semantics?
Does R ⋈ (S ∪ T) = (R ⋈ S) ∪ (R ⋈ T) hold for relational algebra bag semantics?
I don't know whether [this holds](https://stackoverflow.com/a/33704297) for bag semantics for join and union.
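The identity does hold when ∪ is the additive bag union (SQL's UNION ALL). One worked instance is evidence rather than a proof, but the multiplicity bookkeeping can at least be exercised with multisets:

```python
from collections import Counter

def bag_union(s, t):
    """Additive bag union (UNION ALL): multiplicities add."""
    return s + t

def bag_join(r, s):
    """Natural join of R(a, b) and S(b, c) on b: multiplicities multiply."""
    out = Counter()
    for (a, b1), m in r.items():
        for (b2, c), n in s.items():
            if b1 == b2:
                out[(a, b1, c)] += m * n
    return out

R = Counter({(1, 'x'): 2})
S = Counter({('x', 10): 1})
T = Counter({('x', 10): 1, ('x', 20): 3})

lhs = bag_join(R, bag_union(S, T))               # R ⋈ (S ∪ T)
rhs = bag_union(bag_join(R, S), bag_join(R, T))  # (R ⋈ S) ∪ (R ⋈ T)
print(lhs == rhs)  # True
```

Intuitively, both sides count each tuple m·(n₁+n₂) = m·n₁ + m·n₂ times; the distributive law of numbers carries over to multiplicities. Note this is specific to additive union: it does not follow for max-based bag union (SQL's UNION DISTINCT-style semantics).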
Marlon Brando
(141 rep)
Jan 19, 2025, 02:53 PM
• Last activity: Jan 20, 2025, 08:35 AM
1
votes
0
answers
84
views
Is there a formalism for the kinds of constraints we can implement with indexed-views? Or at least a rule-of-thumb?
SQL Server’s Indexed Views provide a performant way to implement _certain classes_ of database-scoped constraints - such as ensuring a value is unique over multiple tables - or other non-key, cross-table value constraints.
As far as I understand it: provided you can reduce your desired constraint to a single SELECT query using some combination of INNER JOIN, CROSS JOIN, and/or COUNT_BIG/SUM such that invalid data causes duplicate output rows, then using an indexed-view is doable.
…but how can I know (or ideally: prove) that some proposed constraint can be implemented this way? And how can I know that some other kind of constraint is impossible to implement?
Another consideration is that the limitation of only having INNER JOIN and CROSS JOIN becomes less of a restriction with the help of utility-tables containing data-ranges (either manually maintained or using GENERATE_SERIES) - this enables other classes of constraints to be implemented - but I struggle to conceptualise those classes (do they have names?)
Dai
(632 rep)
Oct 15, 2022, 01:26 AM
• Last activity: Oct 19, 2024, 09:54 PM
0
votes
0
answers
109
views
Consistent transitive relationships
I have a database with three tables: City, Region, and Country. These tables have some straightforward FK constraints: a city is in one region and in one country, and a region is in one country.
I would like to look up what country a given city is in. For example, San Francisco. I could do this in two ways:
1. City -> Country tells me that San Francisco is in the US
2. City -> Region -> Country tells me that San Francisco is in California, which is in the US
But for a disputed territory, it's not unreasonable to worry that these two ways of answering the question disagree. For example, Sevastopol might give the following result:
1. City -> Country tells me that Sevastopol is in Ukraine
2. City -> Region -> Country tells me that Sevastopol is in Crimea, which is in Russia
Clearly, if I want to guarantee a consistent answer, no matter how I ask the question, I'll need some additional constraint on the transitive relationship. In this example, I could add a mapping table with multiple composite foreign key constraints:
CREATE TABLE city_region_country_mapping (
city_id INTEGER NOT NULL REFERENCES city(city_id),
region_id INTEGER NOT NULL REFERENCES region(region_id),
country_id INTEGER NOT NULL REFERENCES country(country_id),
PRIMARY KEY (city_id, region_id, country_id),
FOREIGN KEY (city_id, region_id) REFERENCES city(city_id, region_id),
FOREIGN KEY (city_id, country_id) REFERENCES city(city_id, country_id),
FOREIGN KEY (region_id, country_id) REFERENCES region(region_id, country_id)
)
That's fine for this simple example, but what if I make a much bigger database, where I could go CityDistrict -> City -> County -> Region -> Country -> Continent. Now there is a lot more scope for answering a question by traversing one of several paths through the database. Ensuring that all the possible paths in the database are consistent with one another quickly becomes very cumbersome, if you have to rely on constraints like the one above.
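The mapping-table idea can be exercised in SQLite (composite FKs need matching UNIQUE constraints on the parent tables; all IDs here are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE country (country_id INTEGER PRIMARY KEY);
CREATE TABLE region  (region_id  INTEGER PRIMARY KEY,
                      country_id INTEGER NOT NULL REFERENCES country,
                      UNIQUE (region_id, country_id));
CREATE TABLE city    (city_id    INTEGER PRIMARY KEY,
                      region_id  INTEGER NOT NULL REFERENCES region,
                      country_id INTEGER NOT NULL REFERENCES country,
                      UNIQUE (city_id, region_id),
                      UNIQUE (city_id, country_id));
CREATE TABLE city_region_country_mapping (
    city_id INTEGER NOT NULL, region_id INTEGER NOT NULL,
    country_id INTEGER NOT NULL,
    PRIMARY KEY (city_id, region_id, country_id),
    FOREIGN KEY (city_id, region_id)    REFERENCES city (city_id, region_id),
    FOREIGN KEY (city_id, country_id)   REFERENCES city (city_id, country_id),
    FOREIGN KEY (region_id, country_id) REFERENCES region (region_id, country_id));
INSERT INTO country VALUES (1), (2);      -- 1 = US, 2 = RU
INSERT INTO region  VALUES (10, 1);       -- California, in the US
INSERT INTO city    VALUES (100, 10, 1);  -- San Francisco
INSERT INTO city_region_country_mapping VALUES (100, 10, 1);  -- consistent: accepted
""")
try:
    conn.execute(
        "INSERT INTO city_region_country_mapping VALUES (100, 10, 2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the (city, country) and (region, country) pairs don't exist
print(rejected)  # True
```

As the question notes, this pattern works but scales poorly: each extra level multiplies the number of pairwise constraints needed.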
Is there a standardised and/or better approach for addressing this kind of problem?
Some other observations, in no particular order:
- This problem appears related to normalisation, in the sense that you can get different answers depending _how_ you ask a question. Up to 3NF this is also a problem with missing normalisation: there are multiple sources of truth, causing insertion/update/deletion anomalies.
- That said, it seems to me that my initial city/region/country example doesn't break any of the classical (up to 5NF) normal forms...I might be wrong though. Or there might be more exotic normal forms that address this issue?
- I'm asking this primarily in the context of relational databases, but I imagine this is exacerbated in graph databases...do such databases have a better way of addressing this problem?
funklute
(109 rep)
Sep 30, 2024, 02:29 PM
0
votes
1
answers
52
views
primary key referened by only single foreign key of another table among multiple tables that can reference it
Let's say I have a table Account that contains the username and password of users to log in:

    account (
        account_id BIGINT PRIMARY KEY,
        username   VARCHAR(30),
        password   VARCHAR(30)
    )

    -- I have two more tables that store their credentials in the above table:

    customer (
        delivery_address VARCHAR(100),
        account_id BIGINT REFERENCES account (account_id),
        ..30 more customer columns
    )

    admin (
        sp_admin_code INT,
        account_id BIGINT REFERENCES account (account_id),
        ..30 more admin columns
    )
An admin cannot be a customer and use the same username and password to log in as a customer.
My questions:
1. Is this the proper way to implement this kind of relationship? If not, what is?
2. Do I need to add some constraint on the primary key of the account table so that a single entry can be referenced either by the customer table or by the admin table, but not both at once?
3. We also know that an entry in account can only exist if there is an entry for it in either the customer table or the admin table, and vice versa. So if we want to add some cascade/constraint for that, how should we do it, or should we do it at all? Sorry, I'm confused.
Thank you!
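On question 2, one common pattern (not the only answer) is a type discriminator on account, pinned by composite foreign keys, so each account row can be claimed by at most one subtype. A SQLite sketch with invented names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    username   TEXT,
    kind       TEXT NOT NULL CHECK (kind IN ('customer', 'admin')),
    UNIQUE (account_id, kind));
CREATE TABLE customer (
    account_id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'customer' CHECK (kind = 'customer'),
    FOREIGN KEY (account_id, kind) REFERENCES account (account_id, kind));
CREATE TABLE admin (
    account_id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL DEFAULT 'admin' CHECK (kind = 'admin'),
    FOREIGN KEY (account_id, kind) REFERENCES account (account_id, kind));
INSERT INTO account VALUES (1, 'alice', 'admin');
""")
conn.execute("INSERT INTO admin (account_id) VALUES (1)")  # matches kind: accepted
try:
    conn.execute("INSERT INTO customer (account_id) VALUES (1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # account 1 is not a 'customer'
print(rejected)  # True
```

The reverse direction in question 3 (every account must have a subtype row) is harder to enforce declaratively; it is usually handled in application code or with triggers.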
Sidharth Bajpai
(103 rep)
Jun 15, 2024, 08:53 AM
• Last activity: Jun 15, 2024, 12:56 PM
0
votes
1
answers
127
views
Calculating memory size required for an index
Assume I have the following table:
CREATE TABLE t1 (
id int NOT NULL PRIMARY KEY,
rank int NOT NULL,
phrase varchar(100) NOT NULL
);
with 10 billion records in a relational database like PostgreSQL.
It will take about 1TB of disk space, right? (each record is about 100B)
As far as I can guess, PostgreSQL operates by pages of size 8KB, so I have about 134 217 728 pages (2^40 / (8*1024))
And if I have 64 GB RAM I can easily create an index on id or rank in memory, because I have 512 bytes per page (64*2^30 / 134217728), and handle a query like
SELECT * FROM t1 WHERE id=12435678;
with a single disk read operation, and so handle about 50 000 queries per second on a modern SSD drive, right?
Are the above calculations correct?
How do I calculate the more accurate memory size required for the index on id or rank?
What if I create a full text search index on phrase? What memory size do I need?
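On sizing the rank index: a rough per-entry estimate gives a back-of-envelope answer. The ~18 bytes per leaf entry below (index tuple header, aligned 4-byte key, line pointer) is an assumption, not an authoritative PostgreSQL figure:

```python
# Back-of-envelope B-tree size estimate for an int column (rough assumptions).
rows = 10_000_000_000
bytes_per_entry = 18   # assumed: tuple header + aligned int4 key + line pointer
fillfactor = 0.9       # default leaf fill factor for PostgreSQL B-trees
leaf_bytes = rows * bytes_per_entry / fillfactor
print(f"~{leaf_bytes / 2**40:.2f} TiB of leaf pages")  # ~0.18 TiB
```

Under these assumptions the index alone is on the order of 200 GB, so it does not fit in 64 GB of RAM; only the upper B-tree levels (a small fraction of the leaf volume) can stay cached.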
**EDIT1:**
I am not sure about the index on id, because it is a primary key; probably it is a clustered index.
Index on rank:
CREATE INDEX rank_index ON t1(rank);
Also I am not sure how to create a full text index, but a quick Google search gives me something like this:
ALTER TABLE t1 ADD COLUMN search_vector tsvector;
UPDATE t1 SET search_vector = to_tsvector('english', phrase);
CREATE INDEX t1_search_vector_idx ON t1 USING gin(search_vector);
and then I search for search terms with a query like this (not sure about the syntax):
SELECT id, phrase, rank * ts_rank(search_vector, to_tsquery('english', 'search terms')) AS final_rank
FROM t1
ORDER BY final_rank DESC
LIMIT 50;
Alexey Starinsky
(113 rep)
Feb 27, 2024, 12:23 AM
• Last activity: Feb 27, 2024, 12:33 PM
6
votes
2
answers
25569
views
How does normalization fix the three types of update anomalies?
I have been reading 'Database Systems: A practical approach to design, implementation and management' by Connolly and Begg in order to understand database normalization (chapter 14).
The three DB normalization forms:
- First normal form (1NF)
- Second normal form (2NF)
- Third normal form (3NF)
The three DB update anomalies:
- Insertion anomaly
- Deletion anomaly
- Update / Modification anomaly
How do I link the two together?
Which of the different forms help(s) to fix each kind of anomaly?
What is the mapping relationship between the two groups?
Why does each form fix its anomalies?
the_butler
(71 rep)
Jan 4, 2018, 03:31 PM
• Last activity: Jan 16, 2024, 09:07 PM
0
votes
2
answers
91
views
How do I identify the SQL server brand via SQL queries
A 3rd party API exposes an endpoint which accepts "SQL" queries. Their examples are simple SELECT statements; however, I want to write more complicated queries.
They do not specify which SQL brand they are using, which means I don't know the exact syntax I should be following.
How can I identify the SQL brand (SQL Server, MySQL, Azure SQL, Postgres, etc) with one or more SQL queries (no command line access, file permissions, etc)?
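One probing trick is to try dialect-specific version queries, ordered from most to least specific; whichever succeeds hints at the product. Sketched here against an in-memory SQLite stand-in for the opaque endpoint:

```python
import sqlite3

# Each probe is valid on (roughly) one product; order matters because some
# overlap (e.g. SELECT version() also works on MySQL).
PROBES = [
    ("SQLite",     "SELECT sqlite_version()"),
    ("MySQL",      "SELECT @@version_comment"),
    ("SQL Server", "SELECT @@VERSION"),
    ("PostgreSQL", "SELECT version()"),
]

def identify(run_query):
    """run_query(sql) -> rows; must raise on queries the backend rejects."""
    for name, sql in PROBES:
        try:
            run_query(sql)
            return name
        except Exception:
            continue
    return "unknown"

conn = sqlite3.connect(":memory:")
print(identify(lambda sql: conn.execute(sql).fetchall()))  # SQLite
```

Caveat: an API layer may rewrite or restrict queries before they reach the database, so a failed probe can mean "blocked by the API" rather than "wrong dialect".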
yeerk
(101 rep)
Dec 12, 2023, 02:31 PM
• Last activity: Dec 12, 2023, 04:14 PM
0
votes
0
answers
53
views
How to create a schema for a combination of values that needs to be unique
How would I create a schema for the following data:
Dimension A
Attribute X, Attribute Y
positive , positive => Description 1
positive , negative => Description 2
negative , positive => Description 3
negative , negative => Description 4
Dimension B
Attribute Z, Attribute W
positive , positive => Description 5
positive , negative => Description 6
negative , positive => Description 7
negative , negative => Description 8
It should not be possible for a combination of attributes to result in multiple outcomes.
Is there a way to structure this that would allow for more dimensions to be created without creating new tables? The created dimensions would have different attributes with different names but those attributes would always have as values either positive or negative.
I'm hoping to accomplish this in postgres.
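One possible shape (a sketch with invented table and column names) keys the outcome table on the dimension plus the two attribute values, so a duplicate combination is rejected and a new dimension is just a row, not a new table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimension (dim_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
-- per-dimension attribute names (always two, each valued positive/negative)
CREATE TABLE attribute (dim_id   INTEGER REFERENCES dimension,
                        position INTEGER CHECK (position IN (1, 2)),
                        name     TEXT,
                        PRIMARY KEY (dim_id, position));
CREATE TABLE outcome (dim_id INTEGER REFERENCES dimension,
                      value1 TEXT CHECK (value1 IN ('positive', 'negative')),
                      value2 TEXT CHECK (value2 IN ('positive', 'negative')),
                      description TEXT,
                      PRIMARY KEY (dim_id, value1, value2));
INSERT INTO dimension VALUES (1, 'A');
INSERT INTO attribute VALUES (1, 1, 'Attribute X'), (1, 2, 'Attribute Y');
INSERT INTO outcome VALUES (1, 'positive', 'positive', 'Description 1');
""")
try:
    conn.execute(
        "INSERT INTO outcome VALUES (1, 'positive', 'positive', 'Description X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # same combination cannot map to a second outcome
print(rejected)  # True
```

The same layout translates directly to Postgres; the primary key on (dim_id, value1, value2) is what guarantees one outcome per combination.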
Filipe Giusti
(256 rep)
Nov 24, 2023, 10:33 PM
• Last activity: Nov 27, 2023, 05:55 PM
6
votes
3
answers
5292
views
Database Design - One table for different types of transactions
Considering a procurement system as an example. Would it make sense to have one single table for the following: Quote, Order, Invoice and Credit Memo?
There will be some details that are different. For example, each type will have its 5 special fields, but the other, say, 50 fields would be exactly the same. Would it make sense in this case to merge all the tables into a single one, with some fields unused for some types?
I can see that the Navision system does something like that in its database. It uses one table, but then splits into separate tables once the items are posted. I think this may be for performance reasons, because the posted items tend to become very large, and splitting into separate tables would speed up queries. But this is just my thinking.
So my main dilemma is: what is the recommended approach in such a case, one table or separate tables? What are the pros and cons of each? What are the business considerations to take into account?
Thanks for your help!
Joseph Caruana
(207 rep)
Nov 20, 2012, 09:29 AM
• Last activity: Nov 18, 2023, 07:15 PM
1
votes
1
answers
79
views
building this not-so-necessary relationship is a good approach?
I created a database for an app, where I have several networks, which have several nodes and elements and also several scenarios, all related to the networks. [see picture below]
My question is: since the sce_results and sce_paramaters tables will have result and parameter information for the nodes and elements of a network, should I also build this relationship between the nodes & elements tables and the sce_results & sce_paramaters tables via foreign keys?
Or, since it gets complicated, is the way I created the tables fine? A more general question: when is it good to build these relationships via foreign keys?
So far I thought that since nodes & elements have their names, I can match them with the name columns in the sce_results & sce_paramaters tables after I download all the necessary data for doing further calculations or creating result plots, etc. Therefore I do not really need this relationship from my point of view. What do you think?
Example:
Network1: Nodes: [Node1, Node2, Node3]
Elements: [Element1, Element2]
Scenarios: [ScenarioX]
ScenarioX: Paramater: [Pressure_Node1 = 5, Pressure_Node2 = 10, ...]
Result: [Flow_Node1 = 100, ...]
# In this case nodes table looks like:
id | net_id | name | ... | x | y
1 1 Node1
2 1 Node2
...
# In this case sce_parameter table looks like:
id | sce_id | name | extension | value | unit
1 1 Node1 Pressure 5
2 1 Node2 Pressure 10
...

oakca
(113 rep)
Nov 15, 2023, 06:43 PM
• Last activity: Nov 16, 2023, 04:47 AM
0
votes
3
answers
39
views
Which would be best for a scalable environment?
Which is more advantageous, speaking of performance and best practices?
A table of customers, a table of suppliers, and a table of sellers?
Or
A table that covers the three entities, with a field that defines whether the record is a customer, supplier, or seller?
Remember that the same person can be a customer, a supplier, and a seller at the same time, and all have a complete registration with common personal data.
Gabriel Simeoni
(1 rep)
Oct 9, 2023, 04:08 AM
• Last activity: Oct 22, 2023, 02:58 AM
9
votes
2
answers
1248
views
How can I losslessly decompose this table?
I have a table with the following PK (primary key):
child_id integer
parent_id integer
date datetime
child_id and parent_id are FKs (foreign keys) to entity tables. The child table also contains a FK to the parent table, and each child_id references the same parent_id as expected by the table above. Code keeps the two in sync.
To remove redundancy I decompose to the following:
Table_1 PK:
child_id integer
date datetime
Table_2 PK:
parent_id integer
date datetime
Table_3: (already exists)
child_id integer PRIMARY KEY
parent_id integer FOREIGN KEY
When I join these I recover the original table. It's my understanding that makes this 5NF.
However, the dates associated with a given child_id must be a subset of the dates associated with the corresponding parent_id. The original table enforces this rule.
My decomposition does not enforce the rule, because you can add to Table_1 until the dates get too large.
Is this decomposition 5NF? While it permits insertion anomalies, it also follows the Wikipedia example, which itself follows "A Simple Guide to Five Normal Forms in Relational Database Theory". The latter's phrase (emphasis mine) "we can reconstruct all the **true** facts from a normalized form consisting of three separate record types" gives me special pause, since no matter how much garbage I pump into Table_1, the natural join ignores it.
I don't like this decomposition. The practical solution is to leave the table and code as they are. How can I decompose and/or add constraints such that I get away from the original table **and** preserve my business rule?
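One workaround (which deliberately gives up the pure three-way split) carries parent_id redundantly in Table_1 and pins it with two composite foreign keys, so a child date must also be one of its parent's dates. A SQLite sketch with invented data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE table_3 (child_id  INTEGER PRIMARY KEY,
                      parent_id INTEGER NOT NULL,
                      UNIQUE (child_id, parent_id));
CREATE TABLE table_2 (parent_id INTEGER, date TEXT,
                      PRIMARY KEY (parent_id, date));
-- parent_id is carried redundantly, but two composite FKs pin it down:
CREATE TABLE table_1 (child_id INTEGER, parent_id INTEGER, date TEXT,
                      PRIMARY KEY (child_id, date),
                      FOREIGN KEY (child_id, parent_id)
                          REFERENCES table_3 (child_id, parent_id),
                      FOREIGN KEY (parent_id, date)
                          REFERENCES table_2 (parent_id, date));
INSERT INTO table_3 VALUES (1, 7);
INSERT INTO table_2 VALUES (7, '2024-01-01');
INSERT INTO table_1 VALUES (1, 7, '2024-01-01');  -- date exists for parent: accepted
""")
try:
    conn.execute("INSERT INTO table_1 VALUES (1, 7, '2024-06-01')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # this date is not among the parent's dates
print(rejected)  # True
```

The trade-off is exactly the redundancy the decomposition tried to remove: parent_id appears in Table_1 again, but now the DBMS, rather than application code, keeps it consistent.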
trevor
(99 rep)
Jun 28, 2011, 09:13 PM
• Last activity: Oct 16, 2023, 09:38 PM
Showing page 1 of 20 total questions