Posted to notifications@iotdb.apache.org by "Tian Jiang (Jira)" <ji...@apache.org> on 2020/08/26 12:30:00 UTC

[jira] [Created] (IOTDB-853) Log compaction by omitting the same log fields

Tian Jiang created IOTDB-853:
--------------------------------

             Summary: Log compaction by omitting the same log fields
                 Key: IOTDB-853
                 URL: https://issues.apache.org/jira/browse/IOTDB-853
             Project: Apache IoTDB
          Issue Type: Improvement
          Components: Core/WAL
            Reporter: Tian Jiang


[1] describes an interesting log compaction technique: it remembers the page ID and transaction ID of the previous log and omits those fields from the next log if they are the same.

I think such a technique could well be applied to IoTDB's WAL. While persisting logs, we may keep a window of the previous N logs. When we are about to persist a log, we search the window for the nearest log of the same type and check whether it has the same field value as the current one; for example, neighboring insertions are very likely to share the same deviceIds and measurementIds. If it does, we can fill the field with a back-reference instead of the value (e.g., "3" means that this field has the same value as in the log whose index is smaller than the current one by 3). This way, a very long path can be replaced by a single byte (0~255), which may save considerable disk space and I/O.
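To make the window lookup concrete, here is a minimal sketch in Java. It is only an illustration under stated assumptions: the class LogWindow, its methods, and the choice of 0 as the "no reference" marker are hypothetical and are not part of IoTDB; a log is reduced to a type and a single deviceId field. The encoder compares only against the nearest log of the same type, as described above, and the decoder rebuilds the same window so it can resolve the reference.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical sketch: none of these names exist in IoTDB.
class LogWindow {
    private static final class Entry {
        final String type;
        final String deviceId;
        Entry(String type, String deviceId) {
            this.type = type;
            this.deviceId = deviceId;
        }
    }

    private final int capacity;                              // window length N
    private final Deque<Entry> window = new ArrayDeque<>();  // newest entry first

    LogWindow(int capacity) {
        // Assumption: 0 means "no reference", so distances 1..255 fit in one byte.
        if (capacity < 1 || capacity > 255) {
            throw new IllegalArgumentException("capacity must be in 1..255");
        }
        this.capacity = capacity;
    }

    /** Returns the distance (1..N) to the log to reference, or 0 if the field must be written in full. */
    int encode(String type, String deviceId) {
        int result = 0;
        int distance = 1;
        for (Entry e : window) {              // iterates from newest to oldest
            if (e.type.equals(type)) {
                if (e.deviceId.equals(deviceId)) {
                    result = distance;
                }
                break;                        // only the nearest log of the same type is compared
            }
            distance++;
        }
        push(type, deviceId);
        return result;
    }

    /** Resolves a distance produced by encode, or takes the literal when distance is 0. */
    String decode(String type, int distance, String literal) {
        String deviceId = literal;
        if (distance > 0) {
            Iterator<Entry> it = window.iterator();
            Entry e = null;
            for (int i = 0; i < distance; i++) {
                e = it.next();                // throws NoSuchElementException on a bad distance
            }
            deviceId = e.deviceId;
        }
        push(type, deviceId);
        return deviceId;
    }

    private void push(String type, String deviceId) {
        window.addFirst(new Entry(type, deviceId));
        if (window.size() > capacity) {
            window.removeLast();              // evict the oldest log
        }
    }
}
```

The writer and the reader each keep their own LogWindow with the same capacity. This sketch does a linear scan of at most N entries per log, so the cost grows with the window length.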

The idea itself is easy to implement; the challenge lies in choosing a proper window length and comparing logs efficiently, so that the additional computation does not become another bottleneck.

[1] Michael Haubenschild, Caetano Sauer, Thomas Neumann, and Viktor Leis. 2020. Rethinking Logging, Checkpoints, and Recovery for High-Performance Storage Engines. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD '20). Association for Computing Machinery, New York, NY, USA, 877–892. DOI:https://doi.org/10.1145/3318464.3389716



--
This message was sent by Atlassian Jira
(v8.3.4#803005)