Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/11/10 23:37:34 UTC

[GitHub] [pinot] mcvsubbu opened a new issue #7741: Realtime consumer loses data when new partitions are detected in stream

mcvsubbu opened a new issue #7741:
URL: https://github.com/apache/pinot/issues/7741


   This issue affects tables that are configured with any offset criteria other than SMALLEST.
   
   Tables are often provisioned with the offset criteria set to LARGEST (that is, ignore earlier offsets and consume only from the latest messages). This is done so that we don't have to consume older data from a stream only to discard it all for being too old. Other possible criteria are CUSTOM or time-period based.
   
   Pinot has a periodic task (RealtimeSegmentValidationManager) that scans the stream for new partitions and starts consumers for any new partitions it detects. It is possible (and most likely the case) that the new partitions were created between two runs of RealtimeSegmentValidationManager and already have some data in them.
   
   In such cases, for the newly detected partitions, Pinot skips the messages that arrived before detection and starts consuming only from the offset obtained by applying the offset criteria specified in the table config.
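
   The scenario above can be sketched with a toy model. This is not Pinot code: `Partition`, `start_offset`, and `skipped_messages` are hypothetical names made up for illustration, and only the SMALLEST and LARGEST criteria are modeled. It assumes the new partition received three messages before the next validation run detected it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Partition:
    """Toy stream partition (hypothetical): a message's list index is its offset."""
    messages: List[str] = field(default_factory=list)

    def smallest_offset(self) -> int:
        return 0

    def largest_offset(self) -> int:
        # Offset at which the next produced message will land.
        return len(self.messages)


def start_offset(partition: Partition, criteria: str) -> int:
    """Pick the first offset to consume for the given offset criteria."""
    if criteria == "SMALLEST":
        return partition.smallest_offset()
    if criteria == "LARGEST":
        return partition.largest_offset()
    raise ValueError(f"criteria not modeled in this sketch: {criteria}")


def skipped_messages(partition: Partition, criteria: str) -> List[str]:
    """Messages already in the partition that the consumer never reads."""
    return partition.messages[:start_offset(partition, criteria)]


# Assumption: the partition was created and received three messages
# between two runs of the periodic validation task.
new_partition = Partition(messages=["m0", "m1", "m2"])

lost_with_largest = skipped_messages(new_partition, "LARGEST")
lost_with_smallest = skipped_messages(new_partition, "SMALLEST")

print(lost_with_largest)   # ['m0', 'm1', 'm2']
print(lost_with_smallest)  # []
```

   In this model, everything written to the new partition before detection is lost under LARGEST, while SMALLEST loses nothing, which matches the observation that only tables with a criteria other than SMALLEST are affected.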
   
   This behavior was introduced in PR #4695.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu closed issue #7741: Realtime consumer loses data when new partitions are detected in stream

Posted by GitBox <gi...@apache.org>.
mcvsubbu closed issue #7741:
URL: https://github.com/apache/pinot/issues/7741


   




[GitHub] [pinot] mcvsubbu commented on issue #7741: Realtime consumer loses data when new partitions are detected in stream

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #7741:
URL: https://github.com/apache/pinot/issues/7741#issuecomment-965841009


   cc: @npawar , @Jackie-Jiang , @sajjad-moradi 

