Database schema design of company time series data

**Purpose**:

To monitor changes in the total capital and paid-in capital of various companies, and to identify trends across company categories. By checking whether the summed total capital or paid-in capital of a category is increasing or decreasing over time, we can see which categories of companies are growing and which are shrinking.

**Data structure**: 

I have data for several companies with the following structure:
* company_name: string
* company_id: string
* company_category: string
* date: timestamp
* total_capital: numeric (allows null values)
* paidin_capital: numeric (allows null values)

The "date" field represents the date on which the total capital and paid-in capital were recorded for this company. Different companies may not have the same recording dates.
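
To make the structure concrete, here is a minimal sketch of it as a single table, using Python's built-in `sqlite3` module. The table name `company_capital`, the company id `c1`, and the choice to store dates as ISO 8601 text are assumptions made for illustration only; they are not part of my actual setup.

```python
import sqlite3

# In-memory database, used only to illustrate the structure.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE company_capital (      -- hypothetical table name
        company_name     TEXT NOT NULL,
        company_id       TEXT NOT NULL,
        company_category TEXT NOT NULL,
        date             TEXT NOT NULL, -- ISO 8601 text, e.g. '2022-03-28'
        total_capital    NUMERIC,       -- allows NULL
        paidin_capital   NUMERIC,       -- allows NULL
        PRIMARY KEY (company_id, date)
    )
""")

# Two raw observations for one company ('c1' is a made-up id).
conn.executemany(
    "INSERT INTO company_capital VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("coop a", "c1", "category A", "2022-03-28", 1000, None),
        ("coop a", "c1", "category A", "2022-05-12", 2000, None),
    ],
)

rows = conn.execute(
    "SELECT company_name, company_id, company_category, date, "
    "total_capital, paidin_capital FROM company_capital ORDER BY date"
).fetchall()
```

Each row is one recording for one company on one date, so nothing is stored for months in which a company has no recording.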

**Data challenge**: 

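If I want to compare category trends by month, every company needs a record for every month so that capital can be summed per month. Because companies are recorded on different dates, this means generating many filled-in records that carry the last known values forward into months with no recording. For example: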

Raw record 1:
* company_name: coop a
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* date: 2022-03-28

Raw record 2:
* company_name: coop a
* company_category: category A
* total_capital: 2000
* paidin_capital: null
* date: 2022-05-12

After filling in the missing months:

New record 1 (derived from Raw record 1):
* company_name: coop a
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-03

New record 2 (carried forward from Raw record 1, since nothing was recorded in 2022-04):
* company_name: coop a
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-04

New record 3 (derived from Raw record 2):
* company_name: coop a
* company_category: category A
* total_capital: 2000
* paidin_capital: null
* month: 2022-05
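
The fill-in step above can be sketched in plain Python as follows. The helper names `month_range` and `fill_months` are hypothetical, and the sketch makes two simplifying assumptions: the latest recording within a month wins, and a later recording replaces both capital fields at once, even if one of them is null.

```python
from datetime import date

def month_range(start, end):
    """Yield (year, month) tuples from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month == 13:
            year, month = year + 1, 1

def fill_months(observations):
    """Expand one company's raw records into one record per month.

    observations: list of (date, total_capital, paidin_capital).
    Months without a recording reuse the most recent earlier values.
    """
    if not observations:
        return []
    latest = {}
    for d, total, paidin in sorted(observations, key=lambda o: o[0]):
        # Later dates in the same month overwrite earlier ones.
        latest[(d.year, d.month)] = (total, paidin)
    months = sorted(latest)
    filled, current = [], None
    for ym in month_range(months[0], months[-1]):
        current = latest.get(ym, current)  # carry forward if month is missing
        filled.append((f"{ym[0]:04d}-{ym[1]:02d}", *current))
    return filled

raw = [
    (date(2022, 3, 28), 1000, None),  # Raw record 1
    (date(2022, 5, 12), 2000, None),  # Raw record 2
]
filled = fill_months(raw)
# filled == [("2022-03", 1000, None), ("2022-04", 1000, None), ("2022-05", 2000, None)]
```

Written this way, the monthly records are computed from the raw records on demand, which is the same expansion I am currently doing by storing extra rows.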

Aggregated table: 
Aggregated record 1 (aggregated from New record 1):
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-03

Aggregated record 2 (aggregated from New record 2):
* company_category: category A
* total_capital: 1000
* paidin_capital: null
* month: 2022-04

Aggregated record 3 (aggregated from New record 3):
* company_category: category A
* total_capital: 2000
* paidin_capital: null
* month: 2022-05
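
For reference, the aggregation itself is simple once the monthly records exist. Below is a minimal sketch in plain Python; the function name `aggregate_by_category` is hypothetical, and it assumes that null values are skipped when summing and that a sum stays null when every contributing value is null (which matches how SQL `SUM` treats nulls).

```python
from collections import defaultdict

def aggregate_by_category(records):
    """Sum capital per (company_category, month).

    records: iterable of
        (company_category, month, total_capital, paidin_capital).
    None values are ignored; a sum is None if all its inputs are None.
    """
    sums = defaultdict(lambda: [None, None])
    for category, month, total, paidin in records:
        acc = sums[(category, month)]
        for i, value in enumerate((total, paidin)):
            if value is not None:
                acc[i] = value if acc[i] is None else acc[i] + value
    return {key: tuple(acc) for key, acc in sorted(sums.items())}

# New records 1 to 3 from the example above.
records = [
    ("category A", "2022-03", 1000, None),
    ("category A", "2022-04", 1000, None),
    ("category A", "2022-05", 2000, None),
]
totals = aggregate_by_category(records)
```

With only one company in the category, `totals` simply repeats the three monthly records, which is why the aggregated table above looks like a copy of the filled-in one.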

This arrangement seems very inefficient and produces a large number of duplicated records. Is there a better schema I should use, whether I stay with a relational database management system (RDBMS) or switch to a time-series database engine?
                        
Asked by Planetoid Hsu (13 rep)
Jan 9, 2023, 03:41 AM
Last activity: Jan 31, 2023, 09:46 AM