Sample Header Ad - 728x90

Consistent transitive relationships

0 votes
0 answers
109 views
I have a database with three tables: City, Region, and Country. These tables have some straightforward FK constraints: a city is in one region and in one country, and a region is in one country. I would like to look up what country a given city is in. For example, San Francisco. I could do this in two ways: 1. City -> Country tells me that San Francisco is in the US 1. City -> Region -> Country tells me that San Francisco is in California, which is in the US But for a disputed territory, it's not unreasonable to worry that these two ways of answering the question disagree. For example, Sevastopol might give the following result: 1. City -> Country tells me that Sevastopol is in Ukraine 1. City -> Region -> Country tells me that Sevastopol is in Crimea, which is in Russia Clearly, if I want to guarantee a consistent answer, no matter how I ask the question, I'll need some additional constraint on the transitive relationship. In this example, I could add a mapping table with multiple composite foreign key constraints:
CREATE TABLE city_region_country_mapping (
    city_id INTEGER NOT NULL REFERENCES city(city_id),
    region_id INTEGER NOT NULL REFERENCES region(region_id),
    country_id INTEGER NOT NULL REFERENCES country(country_id),

    PRIMARY KEY (city_id, region_id, country_id),

    FOREIGN KEY (city_id, region_id) REFERENCES city(city_id, region_id),
    FOREIGN KEY (city_id, country_id) REFERENCES city(city_id, country_id),
    FOREIGN KEY (region_id, country_id) REFERENCES region(region_id, country_id)
)
That's fine for this simple example, but what if I make a much bigger database, where I could go CityDistrict -> City -> County -> Region -> Country -> Continent. Now there is a lot more scope for answering a question by traversing one of several paths through the database. Ensuring that all the possible paths in the database are consistent with one another quickly becomes very cumbersome, if you have to rely on constraints like the one above. Is there a standardised and/or better approach for addressing this kind of problem? Some other observations, in no particular order: - This problem appears related to normalisation, in the sense that you can get different answers depending _how_ you ask a question. Up to 3NF this is also a problem with missing normalisation: there are multiple sources of truth, causing insertion/update/deletion anomalies. - That said, it seems to me that my initial city/region/country example doesn't break any of the classical (up to 5NF) normal forms...I might be wrong though. Or there might be more exotic normal forms that address this issue? - I'm asking this primarily in the context of relational databases, but I imagine this is exacerbated in graph databases...do such databases have a better way of addressing this problem?
Asked by funklute (109 rep)
Sep 30, 2024, 02:29 PM