Normalized Data Store - Confused with prefixes to use
4
votes
1
answer
800
views
I'm designing a Staging+NDS+DDS Data Warehouse system, where an ETL is going to normalize data from
[Staging]
and load it into [NDS]
, which will hold all history.
I've pretty much finished the T-SQL script that will create the tables and constraints in the [NDS]
database, which contains *Master* and *Transactional* tables, that will respectively feed [DDS]
*Dimension* and *Fact* tables in what I'm intending to be a star schema.
I'm given myself the following rules to follow:
- Tables sourcing [DDS]
dimensions are prefixed with DWD_
- Tables sourcing [DDS]
facts are prefixed with DWF_
- Foreign key columns are prefixed with DWK_
- Surrogate key column is prefixed with the same prefix as the table. Which means the surrogate key is always either:
- DWD_Key
for a DWD_
table, or
- DWF_Key
for a DWF_
table.
- Control columns are prefixed with the same prefix as the table. For example...
- The DWD_Customers
table has control columns:
- DWD_IsLastImage
- DWD_EffectiveFrom
- DWD_EffectiveTo
- DWD_DateInserted
- DWD_DateUpdated
- DWD_DateDeleted
- The DWF_InvoiceHeaders
table has control columns:
- DWF_DateInserted
- DWF_DateUpdated
- DWF_DateDeleted
- Primary keys (/surrogate keys) are always prefixed with PK_
followed by the table name (including the table prefix) - e.g. PK_DWD_Customers
and PK_DWF_InvoiceHeaders
.
- I also added a unique
constraint on *natural keys*, and those are always prefixed with NK_
followed by the table name (including the table prefix) - e.g. NK_DWD_Customers
and NK_DWF_InvoiceHeaders
.
- Foreign key columns are always prefixed with DWK_
followed by the name of the referenced table (without its prefix) and the word "Key" - e.g. DWK_CustomerKey
.
- Foreign key constraints are always named FK_[ParentTableNameWithPrefix]_[ChildTableNameWithPrefix]
.
- When a table has multiple FK's to the same table, the name of the FK column is appended to the constraint's name, e.g. FK_DWD_FiscalCalendar_DWF_OrderDetails_DeliveryDate
.
All prefixed columns have no business meaning and should never appear in views; this leaves me with, I find, a pretty clean and consistent design, and create table
scripts looking like this:
create table DWD_SubCategories (
DWD_Key int not null identity(1,1)
,DWD_DateInserted datetime not null
,DWD_DateUpdated datetime null
,DWK_CategoryKey int not null
,Code nvarchar(5) not null
,Name nvarchar(50) not null
,constraint PK_DWD_SubCategories primary key clustered (DWD_Key asc)
,constraint NK_DWD_SubCategories unique (Code)
);
---
So, my question is, is there anything I should know (or *unlearn*) before I continue and implement the ETL to load data into this database? Would anyone inheriting this database want to chase me down and rip my head off in the future? What should I change to avoid this? The reason I'm asking about prefixes, is because I'm using DWD
and DWF
, but the tables are technically not "dimension" and "fact" tables. Is that confusing?
Also, I'm unsure about the concept of *natural key* - am I correct to presume it should be a *unique* combination of columns that the source system might consider its "key" columns, that I can use in the ETL process to locate, say, a specific record to update?
Asked by Mathieu Guindon
(914 rep)
Nov 19, 2014, 04:19 PM
Last activity: Nov 19, 2014, 05:38 PM
Last activity: Nov 19, 2014, 05:38 PM