Posted to notifications@iotdb.apache.org by "Tian Jiang (Jira)" <ji...@apache.org> on 2021/10/08 03:08:00 UTC

[jira] [Created] (IOTDB-1809) Squeeze MeasurementSchema

Tian Jiang created IOTDB-1809:
---------------------------------

             Summary: Squeeze MeasurementSchema
                 Key: IOTDB-1809
                 URL: https://issues.apache.org/jira/browse/IOTDB-1809
             Project: Apache IoTDB
          Issue Type: Improvement
          Components: Core/Engine
            Reporter: Tian Jiang


Each timeseries is associated with a MeasurementSchema, which contains the `measurementId`, data type, encoding, compression type, and properties. However, because there are only a few data types, encodings, and compression types, and the properties are mostly null, MeasurementSchemas are highly redundant.

To be more specific, we currently have 7 data types, 9 encodings, and 8 compression types, so, ignoring `measurementId` and properties, there are at most 7*9*8=504 distinct MeasurementSchemas. However, each timeseries creates its own MeasurementSchema, so when there are 1M timeseries, only about 0.05% of the MeasurementSchemas are distinct.
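
The arithmetic above can be checked with a few lines of Java. This is only an illustration; the class and method names (`SchemaRedundancy`, `maxDistinctSchemas`, `distinctPercent`) are hypothetical and do not exist in IoTDB.

```java
public class SchemaRedundancy {

  // Upper bound on distinct schemas when measurementId and properties are ignored.
  static long maxDistinctSchemas(int dataTypes, int encodings, int compressions) {
    return (long) dataTypes * encodings * compressions;
  }

  // Share of schema instances that can be distinct, as a percentage.
  static double distinctPercent(long distinct, long timeseries) {
    return 100.0 * distinct / timeseries;
  }

  public static void main(String[] args) {
    long distinct = maxDistinctSchemas(7, 9, 8);
    System.out.println("max distinct schemas: " + distinct);
    System.out.println("distinct share (%) at 1M timeseries: "
        + distinctPercent(distinct, 1_000_000L));
  }
}
```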

If we squeeze `measurementId` out of MeasurementSchema, we can share one MeasurementSchema among different timeseries and greatly reduce the number of MeasurementSchema instances. In the example above, about 1M MeasurementSchema instances will be eliminated; assuming 20 bytes per instance, the memory footprint will be reduced by about 20MB, and the saving grows almost linearly with the number of timeseries.
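
One possible way to share instances is an interning pool, sketched below. This is not the actual IoTDB implementation: `SharedSchema`, `intern`, and the cut-down enums are hypothetical stand-ins, and properties are left out on the assumption that they are null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class SharedSchemaSketch {

  // Hypothetical, shortened stand-ins for the real enums.
  enum DataType { INT32, DOUBLE }
  enum Encoding { PLAIN, RLE }
  enum Compression { UNCOMPRESSED, SNAPPY }

  // A schema without measurementId; immutable, so it is safe to share.
  static final class SharedSchema {
    final DataType type;
    final Encoding encoding;
    final Compression compression;

    SharedSchema(DataType type, Encoding encoding, Compression compression) {
      this.type = type;
      this.encoding = encoding;
      this.compression = compression;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof SharedSchema)) return false;
      SharedSchema s = (SharedSchema) o;
      return type == s.type && encoding == s.encoding && compression == s.compression;
    }

    @Override
    public int hashCode() {
      return Objects.hash(type, encoding, compression);
    }
  }

  // Canonical instances, keyed by value.
  private static final Map<SharedSchema, SharedSchema> POOL = new ConcurrentHashMap<>();

  // Returns the single shared instance for this combination of settings.
  static SharedSchema intern(DataType type, Encoding encoding, Compression compression) {
    SharedSchema candidate = new SharedSchema(type, encoding, compression);
    SharedSchema existing = POOL.putIfAbsent(candidate, candidate);
    return existing != null ? existing : candidate;
  }

  public static void main(String[] args) {
    // The measurementId now lives only as the map key, not inside the schema.
    Map<String, SharedSchema> timeseries = new HashMap<>();
    for (int i = 0; i < 1000; i++) {
      timeseries.put("s" + i, intern(DataType.INT32, Encoding.RLE, Compression.SNAPPY));
    }
    boolean shared = timeseries.get("s0") == timeseries.get("s999");
    System.out.println("all timeseries share one schema instance: " + shared);
  }
}
```

With this layout, the `measurementId` is kept by whatever structure owns the timeseries (here a plain map key), and each timeseries holds only a reference to a pooled schema.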



--
This message was sent by Atlassian Jira
(v8.3.4#803005)