Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2022/03/15 12:26:37 UTC

[GitHub] [iotdb] THUMarkLau opened a new pull request #5248: [IOTDB-2723] Fix sequence inner space compaction lose data

THUMarkLau opened a new pull request #5248:
URL: https://github.com/apache/iotdb/pull/5248


   When executing inner space compaction for sequence files, the chunks of a series are read into memory one by one, and each chunk is processed in one of three ways:
   1. If the chunk is small, or part of the data in the chunk has been deleted, the chunk is deserialized into points and rewritten into the chunk writer. The following chunks are written into the chunk writer until the chunk writer is large enough to flush.
   2. If the chunk is too large, it is flushed directly to disk.
   3. If the chunk is neither too small nor too large, it is cached in memory and merged with the following chunk. The cached chunk is not flushed until its size is large enough.
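
The three cases above can be sketched as a small decision function. This is only an illustration of the rough description, not the actual compaction code: the class name, the enum, and both size thresholds are hypothetical and chosen purely for the example.

```java
// Illustrative sketch only: names and thresholds are assumptions,
// not taken from the IoTDB code base.
class ChunkDispatchSketch {

    enum Action {
        DESERIALIZE,      // case 1: rewrite the points into the chunk writer
        FLUSH_DIRECTLY,   // case 2: write the chunk to disk as-is
        CACHE_AND_MERGE   // case 3: keep in memory and merge with the next chunk
    }

    // Hypothetical thresholds in bytes.
    static final long SMALL_CHUNK_BYTES = 1024L;
    static final long LARGE_CHUNK_BYTES = 1024L * 1024L;

    static Action decide(long chunkSizeBytes, boolean hasDeletedData) {
        // A chunk with deleted data must be deserialized regardless of its size,
        // so this check comes first.
        if (hasDeletedData || chunkSizeBytes < SMALL_CHUNK_BYTES) {
            return Action.DESERIALIZE;
        }
        if (chunkSizeBytes >= LARGE_CHUNK_BYTES) {
            return Action.FLUSH_DIRECTLY;
        }
        return Action.CACHE_AND_MERGE;
    }
}
```

Under these assumed thresholds, a 10-byte chunk is deserialized, a 2 MiB chunk without deletions is flushed directly, and a 4 KiB chunk without deletions is cached for merging.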
   
   Of course, these are rough descriptions. When we read a chunk that satisfies the condition for deserialization and there is a cached chunk in memory, the program first deserializes the cached chunk into the chunk writer and then deserializes the freshly read chunk. Before deserializing the cached chunk, the program calls the `flip` function of the cached chunk to make sure the chunk reader can read it correctly. However, in some cases the cached chunk is the first cached chunk, which means it was read directly from the TsFile using the `readMemChunk` function in `TsFileSequenceReader` and has not been merged with any other chunk yet. A chunk read by `readMemChunk` has already had `flip` called on it, while a chunk generated by `mergeChunk` has not. The program only needs to call `flip` for the latter. If it also calls `flip` for the former, `flip` is effectively called twice, which corrupts the `position` and `limit` of the chunk's data buffer. Consequently, the chunk reader cannot read the data in the cached chunk correctly and the data is lost.
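
The effect of a second `flip` can be reproduced with a plain `java.nio.ByteBuffer`. The sketch below assumes the chunk's data buffer behaves like a standard `ByteBuffer`; the class and method names are hypothetical and are not part of IoTDB.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: shows how flipping an already-flipped buffer
// leaves nothing to read. Names here are assumptions for the example.
class DoubleFlipSketch {

    // Writes three bytes, flips the buffer the given number of times,
    // and returns how many bytes a reader would still see.
    static int remainingAfterFlips(int flips) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.put(new byte[] {1, 2, 3}); // position = 3, limit = 16
        for (int i = 0; i < flips; i++) {
            // flip() sets limit = position, then position = 0.
            buffer.flip();
        }
        return buffer.remaining(); // limit - position
    }

    public static void main(String[] args) {
        System.out.println("after one flip:  " + remainingAfterFlips(1));
        System.out.println("after two flips: " + remainingAfterFlips(2));
    }
}
```

After the first `flip` the buffer has `position = 0` and `limit = 3`, so three bytes are readable. The second `flip` sets `limit` to the current `position`, which is 0, so the reader sees an empty buffer even though the bytes are still in the backing array.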


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [iotdb] coveralls commented on pull request #5248: [IOTDB-2723] Fix sequence inner space compaction lose data

Posted by GitBox <gi...@apache.org>.
coveralls commented on pull request #5248:
URL: https://github.com/apache/iotdb/pull/5248#issuecomment-1067999507


   
   [![Coverage Status](https://coveralls.io/builds/47376730/badge)](https://coveralls.io/builds/47376730)
   
   Coverage decreased (-0.001%) to 65.705% when pulling **765c612fb973dffefeb7771a0c5d8b814ca7cfdf on THUMarkLau:IOTDB-2723** into **c3d34b6b0e593b1ad2edf82f244547c4f8f0bf2d on apache:master**.
   





[GitHub] [iotdb] JackieTien97 merged pull request #5248: [IOTDB-2723] Fix sequence inner space compaction lose data

Posted by GitBox <gi...@apache.org>.
JackieTien97 merged pull request #5248:
URL: https://github.com/apache/iotdb/pull/5248


   

