Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/03 22:31:01 UTC

[GitHub] [incubator-pinot] mcvsubbu commented on issue #6637: Kafka ingestion: Allow users to reset the offset to consume

mcvsubbu commented on issue #6637:
URL: https://github.com/apache/incubator-pinot/issues/6637#issuecomment-790119420


   This has to be done on a per-partition basis. A couple of ways to do this:
   - Provide a method to restart the current consuming segment at a chosen offset. For example, if a segment started consumption at offset 100 but hit a problem at offset 120, the user can set the start offset to (say) 125. In this case, the data from offset 100 up to 125 is lost, including the good rows before 120. Hopefully there are no bad offsets beyond 125.
   - Provide a method to skip certain offsets in a stream partition. The user specifies an array of offsets that the consumer should skip. This makes for a more complicated user interface, but preserves as much data as possible.
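
   To make the first option concrete, here is a minimal sketch of a per-partition start-offset override. `StartOffsetOverrides`, `reset` and `resolveStartOffset` are hypothetical names used only for illustration; they are not existing Pinot APIs, and how the override would reach the consuming segment is left open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Pinot): holds user-requested start offsets,
// keyed by stream partition, for the current consuming segment.
final class StartOffsetOverrides {
    // partition id -> offset at which consumption should restart
    private final Map<Integer, Long> overrides = new HashMap<>();

    // Record that the given partition should restart at newStartOffset.
    void reset(int partition, long newStartOffset) {
        if (newStartOffset < 0) {
            throw new IllegalArgumentException("offset must be non-negative");
        }
        overrides.put(partition, newStartOffset);
    }

    // Use the override if one was set for this partition; otherwise keep
    // the start offset the segment already had.
    long resolveStartOffset(int partition, long currentStartOffset) {
        return overrides.getOrDefault(partition, currentStartOffset);
    }
}
```

   With the numbers from the example above, calling `reset(0, 125L)` makes partition 0 resume from 125 instead of 100, while every other partition keeps its existing start offset.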
   
   I expect the second option to be a bit more complex to implement as well. We will need to consume each row and check whether its offset is one of those to be skipped.
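
   The per-row check for the second option could look roughly like the sketch below. `OffsetSkipList` and its methods are hypothetical names, not existing Pinot classes, and the sketch assumes offsets are plain `long` values, as they are for Kafka.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of Pinot): a per-partition set of offsets
// that the consumer should drop instead of indexing.
final class OffsetSkipList {
    // partition id -> offsets the user asked to skip
    private final Map<Integer, Set<Long>> skipped = new HashMap<>();

    // Register one or more offsets to skip for the given partition.
    void skip(int partition, long... offsets) {
        Set<Long> set = skipped.computeIfAbsent(partition, p -> new HashSet<>());
        for (long offset : offsets) {
            set.add(offset);
        }
    }

    // The check the consumer would run on every row it reads.
    boolean shouldSkip(int partition, long offset) {
        Set<Long> set = skipped.get(partition);
        return set != null && set.contains(offset);
    }

    // Stand-in for the consume loop: returns the offsets in
    // [startOffset, endOffsetExclusive) that would actually be kept.
    List<Long> keptOffsets(int partition, long startOffset, long endOffsetExclusive) {
        List<Long> kept = new ArrayList<>();
        for (long offset = startOffset; offset < endOffsetExclusive; offset++) {
            if (!shouldSkip(partition, offset)) {
                kept.add(offset);
            }
        }
        return kept;
    }
}
```

   Skipping only offset 120 on partition 0 keeps 100 through 119 and everything after 120, which is the data the first option would throw away. The cost is one set lookup per consumed row.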
   
   I prefer the first approach -- big hammer but simple(r).

