Transition from MongoDB Time Series Collections to InfluxDB

database-design mongodb migration time-series-database influx-db

With version 5.0, MongoDB introduced specialized Time Series Collections to deal with time series data. As I already stored some sensor metadata (configuration, specification, ...) in MongoDB, I decided to use these special collections to store sensor readings next to the sensor metadata.

Following the docs, I used a single document for each sensor reading, like this (pseudo code):

    {
        "timestamp": timestamp,
        "value": value,
        "metadata": {
            "sensorId": sensor_uid,
            "unit": sensor_unit,
            "type": sensor_type,
            "fromFile": reading_imported_from_file,
        },
    }

Around 50 different sensors are read at the same time, which results in 50 documents with equal timestamps but varying values and metadata.
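To make the document layout concrete, here is a minimal Python sketch that builds one document per sensor for a single read cycle. The helper name `build_reading_documents`, the sensor ids, and the sample values are hypothetical and only serve as an illustration; the document keys follow the pseudo code above.

```python
from datetime import datetime, timezone


def build_reading_documents(timestamp, readings, from_file=False):
    """Return one document per sensor reading, all sharing `timestamp`.

    `readings` maps a sensor id to a (value, unit, type) tuple.
    """
    return [
        {
            "timestamp": timestamp,
            "value": value,
            "metadata": {
                "sensorId": sensor_id,
                "unit": unit,
                "type": sensor_type,
                "fromFile": from_file,
            },
        }
        for sensor_id, (value, unit, sensor_type) in readings.items()
    ]


# Hypothetical sample data: 50 sensors read at the same instant.
ts = datetime(2022, 2, 18, 22, 24, tzinfo=timezone.utc)
sample_readings = {
    f"sensor_{i}": (20.0 + i, "degC", "temperature") for i in range(1, 51)
}
documents = build_reading_documents(ts, sample_readings)
```

With pymongo, a list like `documents` could then be passed to `insert_many` on the time series collection.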

I am currently working on migrating our time series data storage from MongoDB to InfluxDB, as it seems to provide a sleeker API and has some basic data visualization already included. As described above, in MongoDB I used a single document per sensor reading, which might be considered bad practice when using InfluxDB:

> A measurement per sensor seems unnecessary and can add a significant
> amount of system overhead depending on the number of sensors you have.
> I’d suggest storing all sensor data in a single measurement with
> multiple fields, [...]

Based on this, I came up with the following data structure to pass to InfluxDB (Python dictionary pseudo code for influxdb-client):

    {
        "time": 1,
        "measurement": measurement_name,
        "tags": {
            "location": location,
            "from_file": reading_imported_from_file,
        },
        "fields": {
            "sensor_1": reading_from_sensor_1,
            "sensor_2": reading_from_sensor_2,
            "sensor_3": reading_from_sensor_3,
        },
    }
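As a rough sketch of the migration step this implies, the per-sensor documents could be pivoted into one point per timestamp, with one field per sensor. The function name `pivot_to_points`, the measurement name, the location, and the sample documents are assumptions made for illustration; only the dictionary keys come from the structures above. The `from_file` flag is converted to a string because InfluxDB stores tag values as strings.

```python
from collections import defaultdict


def pivot_to_points(documents, measurement_name, location):
    """Group per-sensor documents by (timestamp, fromFile) into one point each."""
    grouped = defaultdict(dict)
    for doc in documents:
        key = (doc["timestamp"], doc["metadata"]["fromFile"])
        # Each sensor becomes one field, keyed by its sensor id.
        grouped[key][doc["metadata"]["sensorId"]] = doc["value"]
    return [
        {
            "time": timestamp,
            "measurement": measurement_name,
            # Tag values are strings in InfluxDB, so stringify the flag.
            "tags": {"location": location, "from_file": str(from_file)},
            "fields": fields,
        }
        for (timestamp, from_file), fields in grouped.items()
    ]


def make_doc(timestamp, sensor_id, value, unit, sensor_type):
    """Build a hypothetical sample document in the MongoDB layout."""
    return {
        "timestamp": timestamp,
        "value": value,
        "metadata": {
            "sensorId": sensor_id,
            "unit": unit,
            "type": sensor_type,
            "fromFile": False,
        },
    }


docs = [
    make_doc(1, "sensor_1", 20.5, "degC", "temperature"),
    make_doc(1, "sensor_2", 1013.2, "hPa", "pressure"),
    make_doc(2, "sensor_1", 20.7, "degC", "temperature"),
]
points = pivot_to_points(docs, "experiment_1", "lab_a")
```

Note that `unit` and `type` are silently dropped by this pivot: the single-measurement layout has no obvious per-sensor slot for them.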

However, I have not figured out how to store the other metadata like sensorId, unit, or type. On the one hand, I could easily solve this by going against the aforementioned suggestion and using a single measurement per sensor. On the other hand, from a relational perspective this metadata should be tied to the sensorId and therefore be accessible from a sensor configuration/specification database using the sensorId as a key. Unfortunately, these values can change during a single measurement or experiment due to changing device configurations on-site, and those changes are not reflected in the configuration database.
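For comparison, the first option (one measurement per sensor) might look roughly like the sketch below. This is only an illustration of that variant, not a recommendation; the function name `document_to_point` and the sample values are hypothetical, and the choice to carry `unit` and `type` as tags is an assumption.

```python
def document_to_point(doc):
    """Map one MongoDB reading document to a point in a per-sensor measurement."""
    meta = doc["metadata"]
    return {
        "time": doc["timestamp"],
        # Assumption: the sensor id doubles as the measurement name.
        "measurement": meta["sensorId"],
        "tags": {
            "unit": meta["unit"],
            "type": meta["type"],
            # Tag values are strings in InfluxDB, so stringify the flag.
            "from_file": str(meta["fromFile"]),
        },
        "fields": {"value": doc["value"]},
    }


# Hypothetical sample document in the MongoDB layout shown earlier.
sample_doc = {
    "timestamp": 1,
    "value": 20.5,
    "metadata": {
        "sensorId": "sensor_1",
        "unit": "degC",
        "type": "temperature",
        "fromFile": False,
    },
}
point = document_to_point(sample_doc)
```

In this layout the metadata travels with every reading, so a unit or type that changes mid-experiment would simply appear as a different tag value on later points, at the cost of the overhead the quoted suggestion warns about.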

How could I solve this issue? Am I missing something or do I simply have to deal with this design/performance vs. ease-of-use tradeoff?





                        

Asked by albert (113 rep)

Feb 18, 2022, 10:24 PM
Last activity: Jan 6, 2025, 10:04 AM
