Dim/Fact Modelling: Dimension table for customer devices

2 votes
0 answers
180 views
The incoming event stream contains information about customer phone devices:

- Platform: Android or iOS?
- Operating System: (version)
- UDID
- Phone Model

In the event stream, each telemetry record carries a pre-defined event (e.g. the user swiped their screen, or the user clicked a button in the app).

I'd like to model this information as a dimension table, but I'm pretty new to Kimball, and reading his books for an afternoon didn't help much. Am I doing the right thing here? Details follow.

I'm thinking about assigning each unique combination of (Platform, OS, UDID, Model) a surrogate key. During ETL, I'll check whether that combination is already in the dimension table. If it's not there, the key auto-increments and the combination is recorded. If it is, no change is made to the dimension table.

The dimension table consists of 5 fields: a surrogate PK (joined with the fact table), plus Platform, OS, UDID and Model.

Say we have the following data coming in:

UserID | EventName  |...|Platform|OS         |UDID       |Model
-------|------------|---|--------|-----------|-----------|------------
111    | Swipe_Up   |...|Android |Android 8.0|123-345-678|Google Pixel
111    | Click_UI   |...|iOS     |iOS 11.0   |abcdefg    |iPhone 11
200    | Swipe_Up   |...|Android |Android 8.0|123-345-678|Google Pixel
201    | Swipe_Down |...|iOS     |iOS 13.0   |hijklmn    |iPhone 12
201    | Swipe_Down |...|iOS     |iOS 13.0   |NULL       |NULL
230    | Swipe_Up   |...|iOS     |NULL       |NULL       |NULL
300    | Swipe_Up   |...|Android |Android 8.0|NULL       |Google Pixel

So you can see user 111 used two phones, so that should be two devices. User 200 has the same phone as user 111 (even the same UDID), so that's not a new record for my dimension table. User 201's second row seems to be the same as their first row, but because of the NULLs it is technically a new device. Of course, the business may want to set rules along the lines of "as long as a few fields are the same, we consider them the same device".

With every field compared exactly, I should end up with 6 unique rows in my dimension table (the seven incoming rows minus the one repeat for user 200). Does this make sense to you?
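The lookup-or-insert step described in the question can be sketched in Python as follows. This is only an in-memory illustration under stated assumptions, not a recommended ETL implementation: the `DeviceDimension` class and its `lookup_or_insert` method are hypothetical names, a dict stands in for the dimension table, and `None` stands in for NULL. Note that Python treats `None == None` as true, so two all-NULL rows would share a key here, whereas in SQL `NULL = NULL` does not evaluate to true and a lookup join would need explicit NULL handling (or a sentinel such as "Unknown").

```python
from typing import Dict, Optional, Tuple

# Natural key of the dimension: (Platform, OS, UDID, Model).
# None stands in for NULL in this sketch.
DeviceKey = Tuple[Optional[str], Optional[str], Optional[str], Optional[str]]


class DeviceDimension:
    """Hypothetical in-memory stand-in for the device dimension table."""

    def __init__(self) -> None:
        self._keys: Dict[DeviceKey, int] = {}
        self._next_key = 1  # surrogate keys start at 1 and only grow

    def lookup_or_insert(
        self,
        platform: Optional[str],
        os_version: Optional[str],
        udid: Optional[str],
        model: Optional[str],
    ) -> int:
        """Return the surrogate key for this combination, creating it if new."""
        combo: DeviceKey = (platform, os_version, udid, model)
        key = self._keys.get(combo)
        if key is None:
            key = self._next_key
            self._keys[combo] = key
            self._next_key += 1
        return key

    def __len__(self) -> int:
        return len(self._keys)


# Sample rows from the question: (UserID, EventName, Platform, OS, UDID, Model).
events = [
    (111, "Swipe_Up", "Android", "Android 8.0", "123-345-678", "Google Pixel"),
    (111, "Click_UI", "iOS", "iOS 11.0", "abcdefg", "iPhone 11"),
    (200, "Swipe_Up", "Android", "Android 8.0", "123-345-678", "Google Pixel"),
    (201, "Swipe_Down", "iOS", "iOS 13.0", "hijklmn", "iPhone 12"),
    (201, "Swipe_Down", "iOS", "iOS 13.0", None, None),
    (230, "Swipe_Up", "iOS", None, None, None),
    (300, "Swipe_Up", "Android", "Android 8.0", None, "Google Pixel"),
]

dim = DeviceDimension()

# Each fact row keeps the event attributes plus the device surrogate key.
fact_rows = [
    (user_id, event_name, dim.lookup_or_insert(platform, os_version, udid, model))
    for user_id, event_name, platform, os_version, udid, model in events
]

print(len(dim))                      # number of rows in the dimension
print([row[2] for row in fact_rows])  # surrogate key assigned to each event
```

With strict matching on all four fields, the sample rows yield six distinct combinations, because only the third event repeats an earlier one. Any business rule that treats partially NULL rows as the same device would have to be applied before the lookup, for example by normalising the key tuple.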
Asked by Nicholas Humphrey (121 rep)
Feb 1, 2021, 02:55 AM