Database schema design of company time series data

1 vote
0 answers
338 views
**Purpose**: To monitor the changes in total capital and paid-in capital of various companies, and to identify trends among the different company categories. By determining whether the sum of total capital or paid-in capital is increasing or decreasing for a given category, we can understand which categories of companies are growing and which are not.

**Data structure**: I have data for several companies with the following structure:

* company_name: string
* company_id: string
* company_category: string
* date: timestamp
* total_capital: numeric (allows null values)
* paidin_capital: numeric (allows null values)

The "date" field represents the date on which the total capital and paid-in capital were recorded for this company. Different companies may not have the same recording dates.

**Data challenge**: If I want to compare the trends of categories by month, I will need to create many records that share the same month so that the total capital for that month can be summed. For example:

Raw record 1:

* company_name: coop a
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* date: 2022-03-28

Raw record 2:

* company_name: coop a
* company_category: category A
* total_capital: 2000
* paidin_capital: null
* date: 2022-05-12

After (one record per company per month):

New record 1 (edited from Raw record 1):

* company_name: coop a
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-03

New record 2 (edited from Raw record 1):

* company_name: coop a
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-04

New record 3 (edited from Raw record 2):

* company_name: coop a
* company_category: category A
* total_capital: 2000
* paidin_capital: null
* month: 2022-05

Aggregated table:

Aggregated record 1 (aggregated from New record 1):

* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-03

Aggregated record 2 (aggregated from New record 2):

* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-04

Aggregated record 3 (aggregated from New record 3):

* company_category: category A
* total_capital: 2000
* paidin_capital: null
* month: 2022-05

The above arrangement of data seems very inefficient and will also result in a large number of duplicated records. Is there a better schema structure that I should use if I continue to use a relational database management system (RDBMS), or if I switch to a time-series database engine?
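To make the expansion and aggregation steps concrete, here is a minimal Python sketch of the transformation described above, applied to the two raw records for total_capital only. It is an illustration, not a proposed schema. Several things in it are assumptions: the `company_id` value `"coop-a"`, the helper names (`month_index`, `month_label`, `monthly_category_totals`), the rule that a null value does not overwrite the last known value, and the rule that the latest record within a month wins.

```python
from collections import defaultdict
from datetime import date

# Raw records from the example: (company_id, category, date, total_capital).
# The company_id "coop-a" is hypothetical; the example only gives a company name.
raw = [
    ("coop-a", "category A", date(2022, 3, 28), 1000),
    ("coop-a", "category A", date(2022, 5, 12), 2000),
]

def month_index(d):
    """Map a date to a running month number so months can be iterated."""
    return d.year * 12 + (d.month - 1)

def month_label(idx):
    """Inverse of month_index, formatted as YYYY-MM."""
    return f"{idx // 12:04d}-{idx % 12 + 1:02d}"

def monthly_category_totals(records, last_month):
    """Carry each company's last known value forward by month, then sum per category."""
    by_company = defaultdict(list)
    for company_id, category, recorded_on, total in records:
        by_company[(company_id, category)].append((recorded_on, total))

    end = month_index(last_month)
    totals = defaultdict(int)
    for (_, category), points in by_company.items():
        points.sort(key=lambda p: p[0])  # oldest record first
        latest = {}
        for recorded_on, total in points:
            # Assumption: a null value carries no new information and is skipped.
            if total is not None:
                latest[month_index(recorded_on)] = total  # later record in a month wins
        current = None
        for idx in range(month_index(points[0][0]), end + 1):
            if idx in latest:
                current = latest[idx]
            if current is not None:
                totals[(category, month_label(idx))] += current
    return dict(totals)

result = monthly_category_totals(raw, last_month=date(2022, 5, 1))
for key in sorted(result):
    print(key, result[key])
```

In this sketch only the raw rows are kept, and the per-month rows exist only while the totals are being computed, so the output matches the aggregated table above without the intermediate "New record" rows being stored.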
Asked by Planetoid Hsu (13 rep)
Jan 9, 2023, 03:41 AM
Last activity: Jan 31, 2023, 09:46 AM