Posted to commits@pinot.apache.org by "PrachiKhobragade (via GitHub)" <gi...@apache.org> on 2023/06/21 00:44:46 UTC

[GitHub] [pinot] PrachiKhobragade opened a new issue, #10951: After segment purge, the segment start and end time in segment zk metadata may not reflect the correct values

PrachiKhobragade opened a new issue, #10951:
URL: https://github.com/apache/pinot/issues/10951

   When segments are purged based on a record matcher, some rows may be eliminated. After the segment is rebuilt, new metadata is generated. In certain scenarios, however, the time metadata from the previous segment is carried over to the new segment. Consequently, even if rows at the beginning of the segment are removed, the segment start time metadata still reflects the start time of the segment before the purge. The same applies to the segment end time metadata.
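
A minimal sketch of the symptom, in plain Java rather than Pinot code. The class `PurgeTimeRangeSketch`, its methods, and the sample timestamps are all hypothetical and only illustrate the difference between a time range carried over from the old segment and one recomputed from the rows that survive the purge.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongPredicate;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from Pinot.
public class PurgeTimeRangeSketch {

    // Drops every timestamp that the (assumed) purge matcher selects.
    public static List<Long> purge(List<Long> timestamps, LongPredicate purgeMatcher) {
        return timestamps.stream()
            .filter(t -> !purgeMatcher.test(t))
            .collect(Collectors.toList());
    }

    // Computes {start, end} from the rows that are actually present.
    public static long[] actualRange(List<Long> timestamps) {
        long min = timestamps.stream().mapToLong(Long::longValue).min().getAsLong();
        long max = timestamps.stream().mapToLong(Long::longValue).max().getAsLong();
        return new long[] {min, max};
    }

    public static void main(String[] args) {
        List<Long> rows = Arrays.asList(1000L, 2000L, 3000L, 4000L);

        // Range recorded for the segment before the purge.
        long[] carriedOver = actualRange(rows);

        // Purge the rows at the beginning of the segment.
        List<Long> remaining = purge(rows, t -> t < 2500L);

        // Range that matches the rebuilt segment's contents.
        long[] recomputed = actualRange(remaining);

        System.out.println("carried over: " + Arrays.toString(carriedOver));
        System.out.println("recomputed:   " + Arrays.toString(recomputed));
    }
}
```

In this sketch the carried-over start time stays at 1000 even though the earliest remaining row is at 3000, which is the mismatch described above.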
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10951: After segment purge, the segment start and end time in segment zk metadata may not reflect the correct values

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10951:
URL: https://github.com/apache/pinot/issues/10951#issuecomment-1601955203

   I checked the code in `SegmentPurger` and found the following note:
   
   ```
         // The time column type info is not stored in the segment metadata.
         // Keep segment start/end time to properly handle time column type other than EPOCH (e.g.SIMPLE_FORMAT).
         if (segmentMetadata.getTimeInterval() != null) {
           config.setTimeColumnName(_tableConfig.getValidationConfig().getTimeColumnName());
           config.setStartTime(Long.toString(segmentMetadata.getStartTime()));
           config.setEndTime(Long.toString(segmentMetadata.getEndTime()));
           config.setSegmentTimeUnit(segmentMetadata.getTimeUnit());
         }
   ```
   
   I think that when we added the purge task (in 2018), the schema might not have been available in the cluster, so we didn't have the time type info (more context in #2846). Now, with #10869, we always use the schema from ZK to generate the new segment, so we can safely remove this special handling and let the metadata reflect the actual time range.

