Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/08/26 21:00:26 UTC

[GitHub] [incubator-pinot] fpj commented on issue #5928: Add connector for Pravega

fpj commented on issue #5928:
URL: https://github.com/apache/incubator-pinot/issues/5928#issuecomment-681121098


   Thanks for starting this issue, @kishoreg. I'm wondering what the best way to approach the implementation of a connector would be. I started by looking at the SPI interfaces, and in particular at the `StreamLevelConsumer`. Here is the reason.
   
   Pravega has a few different APIs, the main one being the event stream API. It also has a batch API that enables unordered reads for parallelism, but let's focus on the event stream API for now. A Pravega stream comprises a set of parallel segments, and that set can change over time according to a scaling policy. We do not expose the complexity of handling changes to the set of segments; instead, the readers in a group coordinate the assignment of segments internally while preserving order.
   
   Given that segments aren't clearly exposed in the event stream API, the `StreamLevelConsumer` interface seems to provide the right level of abstraction, except for the commit call. We don't really provide the ability to commit per reader, like Kafka provides the ability to commit per consumer. Pravega reader groups instead produce checkpoints, which are consistent collections of offsets for segments currently being read. The application sees checkpoints.
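
   To make the difference from a per-consumer commit concrete, here is a minimal sketch in plain Java. It is a simplified model and not Pravega's API: `GroupCheckpoint`, `ReaderGroupState`, and the segment names are hypothetical, and the coordination among readers that makes a real checkpoint consistent is left out. It only shows the shape of the data: one snapshot covering every segment the group is reading, rather than one position per reader.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model for illustration only; none of these types are Pravega's API.
class CheckpointSketch {

    /** A named, immutable snapshot of read offsets for every segment the group is reading. */
    static final class GroupCheckpoint {
        private final String name;
        private final Map<String, Long> segmentOffsets;

        GroupCheckpoint(String name, Map<String, Long> segmentOffsets) {
            this.name = name;
            // Copy so that later reads do not alter a checkpoint already taken.
            this.segmentOffsets = Collections.unmodifiableMap(new TreeMap<>(segmentOffsets));
        }

        String name() { return name; }
        Map<String, Long> segmentOffsets() { return segmentOffsets; }
    }

    /** Tracks the current offset of each segment, regardless of which reader owns it. */
    static final class ReaderGroupState {
        private final Map<String, Long> offsets = new TreeMap<>();

        void advance(String segment, long offset) { offsets.put(segment, offset); }

        /** Captures all segment offsets at once, rather than one reader's position. */
        GroupCheckpoint checkpoint(String name) { return new GroupCheckpoint(name, offsets); }
    }

    public static void main(String[] args) {
        ReaderGroupState group = new ReaderGroupState();
        group.advance("segment-0", 120L);
        group.advance("segment-1", 75L);
        GroupCheckpoint cp = group.checkpoint("cp-1");
        group.advance("segment-0", 200L); // reading continues after the checkpoint
        System.out.println(cp.name() + " " + cp.segmentOffsets());
    }
}
```

   The snapshot keeps the offsets as they were when it was taken, even though reading of `segment-0` continues afterwards.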
   
   Our approach to recording positions of one or more streams is consequently more coarse-grained and coordinated across the group. I wanted to understand how I can introduce checkpoints given that the interface expects a commit implementation.
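
   One possible direction, sketched here purely as an assumption and not as a proposed design, is to treat each `commit` call as a request for a group-wide checkpoint. The two interfaces below are hypothetical stand-ins: `CommitStyleConsumer` is not Pinot's `StreamLevelConsumer`, and `CheckpointCoordinator` is not a Pravega type.

```java
import java.util.Optional;

// Hypothetical sketch: neither interface below is Pinot's or Pravega's real API.
class CommitAdapterSketch {

    /** Stand-in for a consumer interface that expects a per-consumer commit. */
    interface CommitStyleConsumer {
        void commit();
    }

    /** Stand-in for whatever coordinates checkpoints across the reader group. */
    interface CheckpointCoordinator {
        /** Requests a group-wide checkpoint and returns its name once it completes. */
        String requestCheckpoint();
    }

    /** Maps each commit call onto a checkpoint of the whole group. */
    static final class CheckpointingConsumer implements CommitStyleConsumer {
        private final CheckpointCoordinator coordinator;
        private String lastCheckpoint; // null until the first commit

        CheckpointingConsumer(CheckpointCoordinator coordinator) {
            this.coordinator = coordinator;
        }

        @Override
        public void commit() {
            // The recorded position is the group's checkpoint, not this reader's own offset.
            lastCheckpoint = coordinator.requestCheckpoint();
        }

        Optional<String> lastCheckpoint() {
            return Optional.ofNullable(lastCheckpoint);
        }
    }

    public static void main(String[] args) {
        int[] counter = {0};
        CheckpointingConsumer consumer =
                new CheckpointingConsumer(() -> "checkpoint-" + (++counter[0]));
        consumer.commit();
        System.out.println(consumer.lastCheckpoint().orElse("none"));
    }
}
```

   The trade-off in this sketch is that a commit from one consumer affects the recorded position of the whole group, which is coarser than what a per-consumer commit implies.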
   
   I'd love to get some input and hopefully some ideas. 

