Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
2
answers
4056
views
How can I design an address table?
Hello, I need to design an addresses table. I designed one but I didn't like it very much.
My needs:
- I have customers and users (admins). I have addresses for both customers and users. We need to design them in different tables, like CustomerAddresses and UserAddresses, but we must have an Addresses table and connect it with CustomerAddresses and UserAddresses.
- I have different address types, like individual and corporate addresses. These types are used only for customers' addresses.
- A customer can have more than one address.
- A user can have more than one address.
- Customers are not individual or corporate; customer addresses are individual or corporate.
My design (but I think this design is not good; I think it breaks normalization rules):
CustomerAddresses is a junction table in this design, and it belongs to CustomerAddressTypes. We can create a new design or update my design.
https://dbdiagram.io/d/5fc2218b3a78976d7b7dbaf6
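For illustration only, a minimal sketch of one common shape for this (not the asker's diagram; table and column names are assumptions): one shared Addresses table, separate junction tables for customers and users, and the address type kept only on the customer junction, since only customer addresses are individual/corporate.
```sql
CREATE TABLE Customers (
    CustomerId INT PRIMARY KEY,
    Name       VARCHAR(100) NOT NULL
);

CREATE TABLE Users (
    UserId INT PRIMARY KEY,
    Name   VARCHAR(100) NOT NULL
);

CREATE TABLE Addresses (
    AddressId  INT PRIMARY KEY,
    Street     VARCHAR(200) NOT NULL,
    City       VARCHAR(100) NOT NULL,
    PostalCode VARCHAR(20)
);

CREATE TABLE CustomerAddressTypes (
    AddressTypeId INT PRIMARY KEY,
    Name          VARCHAR(50) NOT NULL  -- e.g. 'individual', 'corporate'
);

-- A customer can have many addresses; the address type lives only here,
-- because only customer addresses are typed.
CREATE TABLE CustomerAddresses (
    CustomerId    INT NOT NULL,
    AddressId     INT NOT NULL,
    AddressTypeId INT NOT NULL,
    PRIMARY KEY (CustomerId, AddressId),
    FOREIGN KEY (CustomerId)    REFERENCES Customers(CustomerId),
    FOREIGN KEY (AddressId)     REFERENCES Addresses(AddressId),
    FOREIGN KEY (AddressTypeId) REFERENCES CustomerAddressTypes(AddressTypeId)
);

-- A user (admin) can also have many addresses, but with no type column.
CREATE TABLE UserAddresses (
    UserId    INT NOT NULL,
    AddressId INT NOT NULL,
    PRIMARY KEY (UserId, AddressId),
    FOREIGN KEY (UserId)    REFERENCES Users(UserId),
    FOREIGN KEY (AddressId) REFERENCES Addresses(AddressId)
);
```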

Fırat Kaya
(41 rep)
Nov 28, 2020, 10:14 AM
• Last activity: Aug 3, 2025, 11:00 AM
-1
votes
2
answers
765
views
Database Model for Various Types of Sensors
I am planning to build a database that will store sensor values for various types of sensors in a factory building. Is there any best practice on how to do this?
- Option 1: Should I integrate all sensors into a single database, like this?
Table Sensor
---
SensorID
SensorType
SensorDesc
Table SensorReadings
---
SensorReadingID
SensorID
ReadingDateTime
ReadingValue
- Option 2: Or should I put each sensor type into a separate table?
Table TemperatureSensor
---
SensorID
SensorDesc
Table TemperatureSensorReadings
---
SensorReadingID
SensorID
ReadingDateTime
ReadingValue (double)
Table MotionSensor
---
SensorID
SensorDesc
Table MotionSensorReadings
---
SensorReadingID
SensorID
ReadingDateTime
ReadingValue (integer)
Considering that the various sensor tables will have similar fields but potentially different reading data types (some of them are integers and some of them are doubles, etc.), is it logical to separate the tables for each sensor type, or to keep using a single table for every type of sensor?
- Which one is the best practice to implement such database if there might be various types of sensors to be added in the future?
- Which one can be better optimized for performance?
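As a point of reference, a minimal sketch of Option 1 with a type column and a single numeric reading column (integer readings fit in a double); names, types, and the index are assumptions, not a recommendation for this exact workload:
```sql
CREATE TABLE Sensor (
    SensorID   INT PRIMARY KEY,
    SensorType VARCHAR(50)  NOT NULL,  -- e.g. 'temperature', 'motion'
    SensorDesc VARCHAR(200)
);

CREATE TABLE SensorReadings (
    SensorReadingID BIGINT PRIMARY KEY,
    SensorID        INT              NOT NULL,
    ReadingDateTime TIMESTAMP        NOT NULL,
    ReadingValue    DOUBLE PRECISION NOT NULL,  -- wide enough for integer or float readings
    FOREIGN KEY (SensorID) REFERENCES Sensor(SensorID)
);

-- Readings are almost always queried per sensor and time range.
CREATE INDEX IX_SensorReadings_Sensor_Time
    ON SensorReadings (SensorID, ReadingDateTime);
```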
Thank you
Naruto
(7 rep)
Oct 22, 2019, 11:21 AM
• Last activity: Jul 26, 2025, 09:04 AM
0
votes
1
answers
155
views
Normalisation with an Underlying Hierarchy
I'm designing a new database schema, and every time I do, I like to check my assumptions on normalisation. Something I haven't been able to find a definitive answer for here or by searching the web is how to handle multi-level hierarchies. Best way to explain is by example.
**Source Data**
ref, state, city, suburb, population
-------------------------
1, ABC, x, qwe, 1234
2, ABC, y, rty, 1456
3, DEF, z, uio, 2000
If I were just normalising that at face value, I'd create four tables - Residents, States, Cities, Suburbs. However, let's add the stipulation that there's a hierarchy of State->City->Suburb.
Now in order to preserve referential integrity, it should look something like:
states
-------
stateid, statename
cities
-------
cityid, stateid, cityname
suburbs
-------
suburbid, cityid, suburbname
The question I haven't been able to answer is what should the Residents table look like? The sensible option is:
ref, stateid, cityid, suburbid, population
and the slightly less sensible option is:
ref, suburbid, population
So when it comes to normal forms, I don't think the second answer is a valid normal form. Even though suburbid seems like a valid super key, it would involve traversing upward through the hierarchy to retrieve city and state names.
But then, I'm not sure the first option is valid either, because there's redundancy throughout the hierarchy - I can get the state name three ways (Residents.stateid, Cities.stateid, Suburbs.cityid->Cities.stateid)
From a functional standpoint, the joins would be something like:
--First Option
SELECT [...]
FROM Residents AS r
JOIN States AS s ON s.stateid = r.stateid
JOIN Cities AS c ON c.cityid = r.cityid
JOIN Suburbs AS b ON b.suburbid = r.suburbid
--Second Option
SELECT [...]
FROM Residents AS r
JOIN Suburbs AS b ON b.suburbid = r.suburbid
JOIN Cities AS c ON c.cityid = b.cityid
JOIN States AS s ON s.stateid = c.stateid
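Purely as an illustration of the second option (T-SQL-flavoured, since MSSQL is mentioned; column types are assumptions): Residents references only the leaf of the hierarchy, so state and city are reachable through joins but never stored redundantly.
```sql
CREATE TABLE States (
    stateid   INT PRIMARY KEY,
    statename VARCHAR(100) NOT NULL
);

CREATE TABLE Cities (
    cityid   INT PRIMARY KEY,
    stateid  INT NOT NULL REFERENCES States(stateid),
    cityname VARCHAR(100) NOT NULL
);

CREATE TABLE Suburbs (
    suburbid   INT PRIMARY KEY,
    cityid     INT NOT NULL REFERENCES Cities(cityid),
    suburbname VARCHAR(100) NOT NULL
);

-- Second option: only the leaf FK is stored; city and state come from joins.
CREATE TABLE Residents (
    ref        INT PRIMARY KEY,
    suburbid   INT NOT NULL REFERENCES Suburbs(suburbid),
    population INT NOT NULL
);
```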
What I'm hoping to understand is many faceted:
- Is there a better approach from a normalisation perspective?
- Is there a better approach from a performance perspective? (Assume MSSQL if relevant)
- Is there a reason to favour normal forms over performance, given that the underlying schema already preserves referential integrity?
- Are there other options I should consider? Have I got it all wrong?
Thanks in advance!
Vocoder
(117 rep)
Jul 3, 2023, 06:39 AM
• Last activity: Jul 23, 2025, 07:03 AM
0
votes
2
answers
156
views
Handling two different types of relationships between same two tables
I'm working on a database where I have two tables where there can be different types of relationships between them, and I'm struggling with the best way to handle it because all the options I seem to come up with have disadvantages. I have a `birds` table and a `nests` table. The first relationship a bird can have to a nest is their "natal" relationship (e.g., the nest they were born in). The second relationship a bird can have with a nest is the nests they visit/build as adults. Data is collected over a long enough period of time that some birds will be seen both as nestlings and as adults. Data collection is also not perfect, so some birds may not be observed as nestlings, meaning their natal nest could be unknown.
I have three different schemas I've come up with so far:
Option 1: Seems the cleanest. It uses a pivot table with an attribute to describe an observed relationship. It prevents null values, which is nice, but it does not have a way to enforce a single natal nest without extra logic somewhere.
Option 2: The pivot table just represents the adult-nest relationship. The natal relationship is represented using a 1-1 relationship with a foreign key in the birds table. The downside is that this leads to possible nulls in the data.
Option 3: Uses different pivot tables to represent the different relationships. It seems to avoid the pitfalls of the other two, but something is nagging me about this being overly complicated. But maybe that is more of a front-end issue than a database issue.
In terms of proper database normalization/theory, do any of these options stand out as particularly bad or good? Did I miss an option that will magically solve my problem without downsides?

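Regarding the single-natal-nest concern in Option 1: if the DBMS supports filtered/partial unique indexes (e.g. SQL Server or PostgreSQL — an assumption, since the platform isn't stated), the pivot table can enforce it declaratively. A rough sketch with assumed names:
```sql
CREATE TABLE birds (
    bird_id INT PRIMARY KEY,
    band    VARCHAR(50)
);

CREATE TABLE nests (
    nest_id  INT PRIMARY KEY,
    location VARCHAR(100)
);

-- Option 1 style pivot: one row per observed bird/nest relationship.
CREATE TABLE bird_nests (
    bird_id           INT NOT NULL REFERENCES birds(bird_id),
    nest_id           INT NOT NULL REFERENCES nests(nest_id),
    relationship_type VARCHAR(10) NOT NULL
        CHECK (relationship_type IN ('natal', 'adult')),
    PRIMARY KEY (bird_id, nest_id, relationship_type)
);

-- At most one natal row per bird (filtered/partial unique index).
CREATE UNIQUE INDEX ux_bird_single_natal
    ON bird_nests (bird_id)
    WHERE relationship_type = 'natal';
```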
anjama
(121 rep)
Jan 8, 2024, 05:13 PM
• Last activity: Jul 15, 2025, 02:07 PM
0
votes
1
answers
194
views
Design database table for property listing
I want to design a database for a listing website. This is the first time I have designed a database, so I am not sure if I am on the right path.
Here are the fields needed in a listing:
1. title - Character
2. pub_date -Timestamp
3. modi_date -Timestamp
4. ex_date - Timestamp
5. desc - Character
6. thumb - Character
7. office_no - Integer
8. mobile_no - Integer
9. email - Character
10. web - Character
11. status - Character
12. state - Character
13. city - Character
14. address - Character
15. postcode - Character
16. lat - float
17. lng - float
The lat (latitude) and lng (longitude) are used for searching, e.g. searching for nearby listings by comparing lat and lng, so these fields will be queried often.
My question is: should I separate *lat* and *lng* into another table (normalization) since these fields will be queried a lot, or should I just leave them in one table?
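For what it's worth, a generic SQL sketch of keeping lat/lng on the listing row with a composite index (moving them to another table would not be driven by normalization, since they depend only on the listing's key); column names follow the question, the surrogate key and index are assumptions:
```sql
CREATE TABLE listings (
    listing_id INT PRIMARY KEY,
    title      VARCHAR(200) NOT NULL,
    pub_date   TIMESTAMP    NOT NULL,
    status     VARCHAR(20),
    lat        DOUBLE PRECISION NOT NULL,
    lng        DOUBLE PRECISION NOT NULL
    -- ... remaining listing columns from the question
);

-- Composite index to support "nearby" bounding-box searches.
CREATE INDEX idx_listings_lat_lng ON listings (lat, lng);

-- Example: find listings inside a small bounding box around a point.
SELECT listing_id, title
FROM listings
WHERE lat BETWEEN 3.10 AND 3.20
  AND lng BETWEEN 101.60 AND 101.70;
```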
dev-jim
(101 rep)
Sep 16, 2015, 06:28 PM
• Last activity: Jun 21, 2025, 11:04 AM
0
votes
1
answers
219
views
Does combining multiple tables into one table with more than 100 columns improve performance and overall design?
I have 10 tables without any foreign key relations between them. All the tables have the same primary key; in this case, "stock_id" is the primary key. Each table's columns describe a metric of the stock_id. For example, product_fees is a column describing the total fees of a stock. Like this, we have various metrics across all the tables. I use all 10 tables through a Django API for the frontend website to show all visualizations (20+ charts). I don't think this design is a good one.
**My solution:**
As all attributes have stock_id, I'm thinking of combining the tables into one table, which will contain about 110 columns. There is one extra table that will have the prediction data with dates (1000+ rows for each stock) for all stocks. This prediction data will have a foreign key (1:M) to the main table with 110 columns. To improve readability, I am thinking of using 20 views to debug the data in the future instead of scanning through all the column names.
The most important thing is that all the stock values in the 20 tables change every day, and I usually delete and re-insert them in each table instead of using updates, especially for the stock prediction table with dates.
Is it a good idea to have more than 100 columns and use views for aggregating smaller sections of the table for debugging purposes?
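If the single wide-table route is taken, the debugging views mentioned can simply project related column groups; a hypothetical sketch (metric column names invented for illustration):
```sql
-- One wide table keyed by stock_id (only a few of the ~110 columns shown).
CREATE TABLE stock_metrics (
    stock_id     INT PRIMARY KEY,
    product_fees DECIMAL(18,4),
    avg_volume   BIGINT,
    pe_ratio     DECIMAL(10,2)
    -- ... remaining metric columns
);

-- A narrow view per topic keeps debugging queries readable.
CREATE VIEW stock_fee_metrics AS
SELECT stock_id, product_fees
FROM stock_metrics;
```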
Kaushik Ganesan
(1 rep)
Jan 14, 2022, 08:20 PM
• Last activity: Jun 15, 2025, 11:05 AM
-2
votes
2
answers
72
views
Should I break a large user table into smaller tables for specific roles and information?
I am designing a database for a system that has a `users` table. Currently, the table has around 50 columns, which include:
- **Personal information** (e.g., `name`, `email`, `phone_number`, `address`, etc.)
- **Work-related information** (e.g., `job_title`, `company_name`, `years_experience`, etc.)
- **Education-related information** (e.g., `degree`, `institution`, `graduation_year`, etc.)
Some of these columns are only applicable to specific roles, like employees or students, while others are common to all users. I'm considering breaking the `users` table into smaller tables, such as:
- A main `users` table for general information.
- A `work_experiences` table for job-related details.
- An `education` table for academic details.
### My questions are:
1. Is splitting the `users` table into smaller tables based on their context (work, education, etc.) a good practice?
2. Will this approach impact performance negatively, or is it better for maintainability?
3. Should I consider alternatives like using JSON columns for role-specific data or polymorphic relationships for related models?
I’m using MySQL, and performance is a concern since the system will scale to thousands of users.
Thank you for your advice!
JayDev95
(97 rep)
Dec 9, 2024, 09:09 PM
• Last activity: May 5, 2025, 07:55 AM
2
votes
2
answers
246
views
Is sql_variant compatible with First Normal Form (1NF)?
1NF rule says that we should not keep data of different types in one single column. Does it mean that sql_variant is not compatible with first normal form hence should not be used?
igelr
(2162 rep)
Jul 27, 2018, 03:20 PM
• Last activity: Apr 19, 2025, 09:17 PM
2
votes
1
answers
1366
views
Finding all possible minimal covers
I have a relation schema R = {A, B, C} and the following functional dependencies:
- A → B
- A → C
- B → A
- B → C
- C → A
- C → B
How many different minimal covers can I derive from this relation schema? I have found the following, but I am not really sure if those are all:
- A → C
- B → C
- C → A
- C → B
Also, I am not sure if there is some rule on how to know if one has found all possible minimal covers.
Thanks a lot for any help!
BlackPearl
(39 rep)
Oct 10, 2019, 06:54 AM
• Last activity: Apr 4, 2025, 04:00 PM
-1
votes
2
answers
128
views
Normalization of Grocery Shop Database: From 1NF to 3NF
In a grocery shop, customers buy items, and the products have categories. The TotalQuantity represents the stock of products.
1NF:
OrderProduct (OrderID, OrderDate, OrderTime, CustomerID, CustomerName,
CustomerAddress, CustomerNumber, CustomerEmail, ProductID, ProductName,
UnitPrice, TotalQuantity, CategoryID, CategoryName, CategoryDescription)
What is correct and incorrect in my 2NF and 3NF?
2NF:
Order (OrderID, OrderDate, OrderTime, CustomerID)
Customer (CustomerID, Name, Address, PhoneNumber, Email)
Product (ProductID, ProductName, UnitPrice, CategoryID, TotalQuantity)
Category (CategoryID, CategoryName, Description)
OrderItem (OrderID, ProductID)
3NF:
Order (OrderID, OrderDate, OrderTime, CustomerID)
Customer (CustomerID, Name, Address, PhoneNumber, Email)
Product (ProductID, ProductName, UnitPrice, CategoryID, TotalQuantity)
Category (CategoryID, CategoryName, Description)
OrderItem (OrderID, ProductID)
ProductCategory (ProductID, CategoryID)
User190230UK
(1 rep)
Dec 27, 2024, 07:26 AM
• Last activity: Dec 29, 2024, 09:14 PM
0
votes
1
answers
35
views
How to properly design a membership history across multiple groups?
I'm looking to design an application (Python with sqlite) which deals with some contributors moving across several groups through time.
My design looks currently like (PK in bold weight, FK in italic, both if a FK is part of the PK):
* Group(**Id**, Name)
* Contributor(**Id**, Name, FirstName)
* Action(**ActionId**, Name)
* ContributorAction (***ContributorId***, ***ActionId***, Date)
* Membership (***GroupId***, ***ContributorId***, **StartDate**, **EndDate**)
My aim is to keep a history of memberships in order to retrieve the right membership group of a contributor at the time of a given contribution.
Doing that involves the use of date columns in a primary key (the Membership table). However, `EndDate` could be `NULL` for the current membership of a contributor, and I read that this might not be a good (or even permitted) practice.
Did I miss something here to get at least a 3NF design?
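One variation to consider, sketched for SQLite as used in the question (assuming SQLite 3.8.0+ for partial indexes): leave `EndDate` out of the key, allow it to be `NULL` for the open membership, and enforce "at most one open membership per contributor and group" with a partial unique index.
```sql
PRAGMA foreign_keys = ON;

CREATE TABLE "Group" (
    Id   INTEGER PRIMARY KEY,
    Name TEXT NOT NULL
);

CREATE TABLE Contributor (
    Id        INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    FirstName TEXT
);

-- EndDate is nullable; NULL means "membership still open".
CREATE TABLE Membership (
    GroupId       INTEGER NOT NULL REFERENCES "Group"(Id),
    ContributorId INTEGER NOT NULL REFERENCES Contributor(Id),
    StartDate     TEXT    NOT NULL,   -- SQLite commonly stores dates as ISO-8601 text
    EndDate       TEXT,
    PRIMARY KEY (GroupId, ContributorId, StartDate)
);

-- At most one open membership per contributor and group.
CREATE UNIQUE INDEX ux_membership_open
    ON Membership (GroupId, ContributorId)
    WHERE EndDate IS NULL;
```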
Amessihel
(103 rep)
Dec 23, 2024, 04:22 PM
• Last activity: Dec 23, 2024, 04:49 PM
-1
votes
1
answers
72
views
Database normalization
When normalizing table Message, which is a weak entity of table Email, into 2NF, can the Message_Name, Message_Description and Message_Type attributes partially depend on Message_ID only? These attributes are in a weak entity, so they are dependent on the composite primary key of the Message table.
Kyo-Dal-Mi
(1 rep)
Oct 16, 2024, 09:15 PM
• Last activity: Nov 6, 2024, 01:22 PM
0
votes
0
answers
109
views
Consistent transitive relationships
I have a database with three tables: `City`, `Region`, and `Country`. These tables have some straightforward FK constraints: a city is in one region and in one country, and a region is in one country.
I would like to look up what country a given city is in. For example, San Francisco. I could do this in two ways:
1. `City` -> `Country` tells me that San Francisco is in the US
2. `City` -> `Region` -> `Country` tells me that San Francisco is in California, which is in the US
But for a disputed territory, it's not unreasonable to worry that these two ways of answering the question disagree. For example, Sevastopol might give the following result:
1. `City` -> `Country` tells me that Sevastopol is in Ukraine
2. `City` -> `Region` -> `Country` tells me that Sevastopol is in Crimea, which is in Russia
Clearly, if I want to guarantee a consistent answer, no matter how I ask the question, I'll need some additional constraint on the transitive relationship. In this example, I could add a mapping table with multiple composite foreign key constraints:
CREATE TABLE city_region_country_mapping (
city_id INTEGER NOT NULL REFERENCES city(city_id),
region_id INTEGER NOT NULL REFERENCES region(region_id),
country_id INTEGER NOT NULL REFERENCES country(country_id),
PRIMARY KEY (city_id, region_id, country_id),
FOREIGN KEY (city_id, region_id) REFERENCES city(city_id, region_id),
FOREIGN KEY (city_id, country_id) REFERENCES city(city_id, country_id),
FOREIGN KEY (region_id, country_id) REFERENCES region(region_id, country_id)
)
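One aside on the mapping-table sketch above (not part of the original schema): for those composite foreign keys to be declarable, the referenced column pairs generally need their own unique constraints, roughly like this (exact requirements vary by engine):
```sql
-- The composite FK targets must be unique (or primary) keys in standard SQL.
ALTER TABLE city   ADD CONSTRAINT uq_city_region    UNIQUE (city_id, region_id);
ALTER TABLE city   ADD CONSTRAINT uq_city_country   UNIQUE (city_id, country_id);
ALTER TABLE region ADD CONSTRAINT uq_region_country UNIQUE (region_id, country_id);
```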
That's fine for this simple example, but what if I make a much bigger database, where I could go `CityDistrict` -> `City` -> `County` -> `Region` -> `Country` -> `Continent`? Now there is a lot more scope for answering a question by traversing one of several paths through the database. Ensuring that all the possible paths in the database are consistent with one another quickly becomes very cumbersome, if you have to rely on constraints like the one above.
Is there a standardised and/or better approach for addressing this kind of problem?
Some other observations, in no particular order:
- This problem appears related to normalisation, in the sense that you can get different answers depending _how_ you ask a question. Up to 3NF this is also a problem with missing normalisation: there are multiple sources of truth, causing insertion/update/deletion anomalies.
- That said, it seems to me that my initial city/region/country example doesn't break any of the classical (up to 5NF) normal forms...I might be wrong though. Or there might be more exotic normal forms that address this issue?
- I'm asking this primarily in the context of relational databases, but I imagine this is exacerbated in graph databases...do such databases have a better way of addressing this problem?
funklute
(109 rep)
Sep 30, 2024, 02:29 PM
0
votes
0
answers
96
views
Database model for multi-channel notification system
We're currently redesigning the way our application sends out notifications to users (so far, it has been mainly calls at various places in the code, directly sending mails to the user) to a system that allows multiple notification channels and should give the users options to select which notification channels they want to receive their notification on. The database is currently MySQL. The requirements are as follows:
- Multiple notification **channels** should be supported (like mail, firebase push notifications, slack)
- Some **channels** are targeted to **individual** users (mail, firebase), while others are **not user specific** (messages to Slack channels)
- Not all **notifications** support all **channels**
- **Notifications** are uniquely identified by a name (string) and belong to exactly one **notification category**
- Each **channel** requires a certain set of **parameters for each notification** supported (e.g. for mail, we require a template_id associated with the notification, for firebase we require a title and body template, and so on)
- **Users** should be able to select which **channels** they want to receive **notifications on, per category**
- Some combinations are mandatory (e.g. mail is required for almost all notifications, firebase is typically optional)
So far, we came up with a design similar to this. In the meantime, I've come up with a second iteration, which may or may not be better; it does address some of the concerns in terms of normalisation, though:
There is a set of questions around this:
1. The `notification` (now `notification_event`) table is what maps the individual `name` to all the fields that are required for the various channels, such as `email_template_id` for mail or `push_*_template` for firebase. A value of `NULL` means that this particular channel is not supported for this notification. It feels wrong in terms of database normalisation to store these values directly in the same table, as the table structure would need to change every time a new notification channel is added. How to handle these per-channel settings correctly? (See the sketch after this list.)
2. There are some combinations that are mandatory (e.g. a category with `mail_mandatory` set to `true`). Should we pre-create one entry for each user in `user_notifications` and simply mark them as enabled there, or is it better to not create these settings in the first place and ensure in code that the default configuration always applies, even if for whatever reason the entry in `user_notification` says otherwise? (This should be prevented by the API in the first place; it's still a design decision though.)
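On question 1, one normalised alternative (a sketch with assumed names, not the existing schema) is a child table keyed by notification and channel, with the channel-specific parameters stored as rows rather than as nullable columns on `notification_event`:
```sql
CREATE TABLE notification_event (
    notification_event_id INT AUTO_INCREMENT PRIMARY KEY,
    name                  VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE channel (
    channel_id INT AUTO_INCREMENT PRIMARY KEY,
    name       VARCHAR(50) NOT NULL UNIQUE   -- 'mail', 'firebase', 'slack'
);

-- A row here means "this notification supports this channel"; the
-- channel-specific settings hang off it instead of being nullable columns.
CREATE TABLE notification_channel (
    notification_event_id INT NOT NULL,
    channel_id            INT NOT NULL,
    PRIMARY KEY (notification_event_id, channel_id),
    FOREIGN KEY (notification_event_id) REFERENCES notification_event(notification_event_id),
    FOREIGN KEY (channel_id)            REFERENCES channel(channel_id)
);

-- Per-channel parameters as name/value pairs (e.g. 'email_template_id',
-- 'push_title_template'), so adding a channel never alters existing tables.
CREATE TABLE notification_channel_param (
    notification_event_id INT NOT NULL,
    channel_id            INT NOT NULL,
    param_name            VARCHAR(100) NOT NULL,
    param_value           VARCHAR(500) NOT NULL,
    PRIMARY KEY (notification_event_id, channel_id, param_name),
    FOREIGN KEY (notification_event_id, channel_id)
        REFERENCES notification_channel(notification_event_id, channel_id)
);
```
The trade-off is the usual one for name/value parameter tables: the schema never changes per channel, but individual parameters are no longer typed columns.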
Trying to answer some of the questions that came up in the comments:
- The backend is written in Python, using SQLAlchemy as the ORM. The API is FastAPI, with `arq` as the worker for async tasks. The notifications are triggered anywhere, but submitted / sent by the worker.
- There are roughly 25 different events / notification types, around 1000 users.
- The amount of notifications is rather low - I'd say typically below 100 each day.
- Today, we only have notifications via mail. Currently we plan an extension and support Firebase push notifications as well as slack channel notifications, which led us to re-think the system and make it more modular to grow in the future (e.g. show notifications on the website, ...)
Daniel
(101 rep)
Aug 28, 2024, 06:00 PM
• Last activity: Aug 29, 2024, 04:29 PM
-1
votes
1
answers
106
views
Customer Address Design Problem
I'm designing a database for e-commerce. A customer can have many addresses, and one address can also be owned by many customers, so I have a CustomerAddress table (many-to-many relation). Here are the details:
1. Customer:
- CustomerID (PK)
- Name
- DateofBirth
2. AddressNote :
- AddressNoteID (PK)
- AddressID (FK) (has a relationship with the city table so I'm not mentioning it here)
3. CustomerAddress :
- CustomerID (PK)
- AddressNoteID (PK)
- PhoneNumber (The phone number here indicates who will be called upon delivery at that address)
My problem is that I want one client to have multiple addresses but only one single default address. So I was thinking of adding one more column, **IsDefault (bool)**, to the CustomerAddress table. Later, I felt this was wrong because it still allows a customer to have two different addresses both set as the default. I believe application-level validation can handle this, but I want database constraints. Is there any way to design this constraint? I'm just a beginner, so thanks for any instructions.
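If the engine supports filtered/partial unique indexes (the question doesn't say which DBMS is used — this sketch is PostgreSQL-flavoured; SQL Server filtered indexes work similarly with a BIT column), the one-default-per-customer rule can be expressed declaratively:
```sql
CREATE TABLE Customer (
    CustomerID  INT PRIMARY KEY,
    Name        VARCHAR(100) NOT NULL,
    DateOfBirth DATE
);

CREATE TABLE AddressNote (
    AddressNoteID INT PRIMARY KEY
    -- AddressID FK to the address/city tables omitted here
);

CREATE TABLE CustomerAddress (
    CustomerID    INT NOT NULL REFERENCES Customer(CustomerID),
    AddressNoteID INT NOT NULL REFERENCES AddressNote(AddressNoteID),
    PhoneNumber   VARCHAR(30),
    IsDefault     BOOLEAN NOT NULL DEFAULT FALSE,
    PRIMARY KEY (CustomerID, AddressNoteID)
);

-- At most one default address per customer.
CREATE UNIQUE INDEX ux_customer_default_address
    ON CustomerAddress (CustomerID)
    WHERE IsDefault = TRUE;
```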
John
(3 rep)
May 6, 2024, 04:39 PM
• Last activity: May 6, 2024, 05:22 PM
1
votes
2
answers
1385
views
How to solve multiple minimal covers for functional dependencies?
I have the schema `a,b,c,d,e,g` and the following functional dependencies: `b->c, dg->ce, bc->dg, e->a, g->bd`, for which I want to find the minimal cover. It seems that more than one scenario is possible.
First I represent the dependencies by splitting the right-hand side: `b->c, dg->c, dg->e, bc->d, bc->g, e->a, g->b, g->d`.
For `dg->c` there's one extraneous attribute, `d`. Same goes for `dg->e`.
For `bc->d` the attribute `c` is extraneous, as well as for `bc->g`.
For `g->bd` the attribute `d` is extraneous because if we have `g->b` then `g+ = {gbcd}`, which contains `d`.
So we have 4 possible minimal covers: `b->c, g->c OR g->e, b->d OR b->g, g->b, e->a`.
The correct answer is `g-->c; g-->e; b-->g; e-->a; g-->b; g-->d` according to this tool.
I cannot get the same answer from simplifying my minimal cover. What am I doing wrong?
Yos
(195 rep)
Aug 11, 2018, 06:37 PM
• Last activity: May 1, 2024, 06:28 PM
-1
votes
1
answers
177
views
Create a category column or a table for each category?
Which is better?
Table: Words
Columns: ID, Word, ParentWordID, CatID
Table: Categories
Columns: CatID, Category
Or:
Table: Words
Columns: ID, Word
Table: Categories
Columns: CatID, Categories
Table: Cat1
Columns: WordID
...
Table: CatN
Columns: WordID
One possible advantage of the 2nd option is that not all words need to have a `CatID`. But the same could be said for `ParentWordID`, which is empty most of the time. Maybe that too could be replaced with a separate table and a table for each?
simpatico
(99 rep)
Apr 5, 2024, 07:07 PM
• Last activity: Apr 5, 2024, 07:54 PM
0
votes
1
answers
111
views
3NF Decomposition and NP-Completeness
I have a question about the computational complexity of the algorithm that allows us to decompose a schema into 3NF.
My book says that the decisional problem of telling if a schema complies with the 3NF is NP-Complete.
An NP problem is a decisional problem that can be solved in a polynomial time by a non-deterministic algorithm and verified in a polynomial time by a deterministic algorithm.
An NP-Complete problem is a problem that is as hard to solve as the SAT problem.
So, if I've understood correctly, if somebody gives me a schema and tells me whether or not it is in 3NF, I can verify it in polynomial time with a deterministic algorithm. But if somebody gives me the actual problem and not the decisional one by asking me, "Find a 3NF decomposition of my schema", I wouldn't be able to find a polynomial solution with a deterministic algorithm.
I would be glad if somebody could tell me if there are any mistakes in my understanding.
Thank you in advance!
lorenzo_moni
(1 rep)
Jun 30, 2023, 09:03 PM
• Last activity: Feb 11, 2024, 09:57 PM
-1
votes
1
answers
60
views
Should we repeat multilevel dependency reference columns to avoid multiple joins breaking the normal forms for better performance
In general I know that repeating information in tables is not considered good database design, but I have a question about the following particular design for a multi-level parent-child relationship.
Table_1_Table_2_Mapping
- Id
- Table_1_Id
- Table_2_Id
Table_2_Table_3_Mapping
- Id
- Table_2_Id
- Table_3_Id
I am just showing the two-level relationship here as an example, but I have 5 levels of such parent-child relations.
So with the current design (which I feel is the correct design, following the database normal forms), if I have to traverse to `Table_1` from `Table_3` I have to go via `Table_2`, like this: `Table_3` joins `Table_2` joins `Table_1`.
This looks okay because they are just three tables, but what if I have to reach `Table_1` from `Table_9`? There will be more than 15 joins, going from each mapping table to each main table.
Is it still worth following the normal forms in that case, if I have say 9 parent-child tables?
Or should I keep adding the repeating information in child mapping tables like this, to keep performance in mind (however, this is not a good design as far as I know)?
Table_2_Table_3_Mapping
- Id
- Table_2_Id
- Table_3_Id
- Table_1_Id //added this id to go directly to Table_1 from Table_3 skipping Table_2 join
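To make the trade-off concrete, a sketch of the two traversals, assuming main tables Table_1 through Table_3 each keyed by an Id column (an assumption; only the mapping tables are shown in the question):
```sql
-- Normalized path: Table_3 -> Table_2 -> Table_1 via the mapping tables.
SELECT t1.Id
FROM Table_3 AS t3
JOIN Table_2_Table_3_Mapping AS m23 ON m23.Table_3_Id = t3.Id
JOIN Table_2 AS t2                  ON t2.Id = m23.Table_2_Id
JOIN Table_1_Table_2_Mapping AS m12 ON m12.Table_2_Id = t2.Id
JOIN Table_1 AS t1                  ON t1.Id = m12.Table_1_Id
WHERE t3.Id = 42;

-- Denormalized shortcut: the extra Table_1_Id column skips the middle hops.
SELECT m23.Table_1_Id
FROM Table_2_Table_3_Mapping AS m23
WHERE m23.Table_3_Id = 42;
```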
Pawan Nogariya
(119 rep)
Feb 8, 2024, 07:04 AM
• Last activity: Feb 8, 2024, 11:01 AM
0
votes
2
answers
165
views
3NF normalized table design
I am struggling with how to normalize a database where there are multiple addresses per user.
The situation is that a user can have both a regular address and a delivery address. When a user makes an order, one of the addresses should be connected to the order.
I am trying to achieve 3NF normalization, so I will store the addresses in a junction table. Should I add a primary key to the user_addresses table and use that in the fk_address_id column in the orders table? Or would that be breaking 3NF?
What I am basically trying to do is to make it possible to place an order with a different address.

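For illustration, one common shape for this (names are assumptions): give the junction row its own key and let orders reference that row, so the address used for an order is recorded without copying address columns into orders.
```sql
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    name    VARCHAR(100) NOT NULL
);

CREATE TABLE addresses (
    address_id INT PRIMARY KEY,
    street     VARCHAR(200) NOT NULL,
    city       VARCHAR(100) NOT NULL
);

-- Junction row with its own key; address_type distinguishes regular vs delivery.
CREATE TABLE user_addresses (
    user_address_id INT PRIMARY KEY,
    user_id         INT NOT NULL REFERENCES users(user_id),
    address_id      INT NOT NULL REFERENCES addresses(address_id),
    address_type    VARCHAR(20) NOT NULL
        CHECK (address_type IN ('regular', 'delivery')),
    UNIQUE (user_id, address_id, address_type)
);

-- An order points at one of the user's address rows.
CREATE TABLE orders (
    order_id      INT PRIMARY KEY,
    fk_address_id INT NOT NULL REFERENCES user_addresses(user_address_id),
    order_date    DATE NOT NULL
);
```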
Soma Juice
(111 rep)
Jan 31, 2024, 09:12 PM
• Last activity: Jan 31, 2024, 11:41 PM
Showing page 1 of 20 total questions