Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/06/04 18:35:20 UTC

[GitHub] [incubator-pinot] mcvsubbu commented on issue #7004: Preserver Kafka Message Metadata in Pinot Tables

mcvsubbu commented on issue #7004:
URL: https://github.com/apache/incubator-pinot/issues/7004#issuecomment-854927322


   +1 to keeping it a virtual column.
   +1 to making it configurable, since it can add quite a bit of overhead.
   +1 to keeping it a string; Pinot is agnostic to the underlying stream.
   We should not build indices on it. Please use a raw index, which is now supported for consuming segments.
   
   In the case of Kinesis, we should (or at least may want to) also keep track of other metadata, such as the partition IDs in the group while the segment was being consumed. I think these do not change (if they do, we close the segment), but @npawar  or @KKcorps  can comment on that.
   
   Since this is stream-dependent, I would make it a string that contains, at a minimum, the serialized StreamMsgOffset and the partition group ID. Beyond that, each stream may add its own fields.
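
A minimal sketch of what such a string could look like, assuming a JSON-style encoding. The class name `StreamMessageMetadata`, the field names `offset` and `partitionGroupId`, and the `extras` map are all hypothetical and are not part of Pinot; the offset is assumed to have already been serialized to a string by the stream plugin.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not Pinot code): builds the per-message metadata string.
final class StreamMessageMetadata {
  private StreamMessageMetadata() {}

  /**
   * @param offset           stream offset, already serialized by the stream plugin
   * @param partitionGroupId partition group the message was consumed from
   * @param extras           optional stream-specific fields (for example a topic name)
   */
  static String serialize(String offset, int partitionGroupId, Map<String, String> extras) {
    Map<String, String> fields = new LinkedHashMap<>();
    // The two fields every stream is expected to provide come first.
    fields.put("offset", offset);
    fields.put("partitionGroupId", Integer.toString(partitionGroupId));
    // Stream-specific extras may add fields but never override the mandatory ones.
    for (Map.Entry<String, String> e : extras.entrySet()) {
      fields.putIfAbsent(e.getKey(), e.getValue());
    }

    StringBuilder sb = new StringBuilder("{");
    for (Map.Entry<String, String> e : fields.entrySet()) {
      if (sb.length() > 1) {
        sb.append(',');
      }
      sb.append('"').append(escape(e.getKey())).append("\":\"")
        .append(escape(e.getValue())).append('"');
    }
    return sb.append('}').toString();
  }

  // Minimal escaping: only backslashes and double quotes are handled here.
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

Keeping every value a string means the column stays opaque to Pinot, and each stream can append fields without a schema change.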
   
   Also, consider a less verbose version of this: data that is common to the entire segment could go in the segment metadata (some of it is already in the ZK metadata). For Kafka, this could mean the start/end offsets, the partition group ID, etc.
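
A rough sketch of the segment-level variant, assuming Kafka-style offsets that fit in a `long` and arrive in increasing order. The class `SegmentStreamSummary` and the metadata keys are hypothetical; this is not how Pinot stores segment metadata today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not Pinot code): one summary per consuming segment
// instead of one metadata string per row.
final class SegmentStreamSummary {
  private final int partitionGroupId;
  private long startOffset = -1; // -1 means no message has been recorded yet
  private long endOffset = -1;
  private long messageCount = 0;

  SegmentStreamSummary(int partitionGroupId) {
    this.partitionGroupId = partitionGroupId;
  }

  // Called once per consumed message, in consumption order.
  void record(long offset) {
    if (messageCount == 0) {
      startOffset = offset;
    }
    endOffset = offset;
    messageCount++;
  }

  // Key/value pairs that could be written once when the segment is completed.
  Map<String, String> toSegmentMetadata() {
    Map<String, String> meta = new LinkedHashMap<>();
    meta.put("partitionGroupId", Integer.toString(partitionGroupId));
    meta.put("startOffset", Long.toString(startOffset));
    meta.put("endOffset", Long.toString(endOffset));
    meta.put("messageCount", Long.toString(messageCount));
    return meta;
  }
}
```

This trades per-row detail for a fixed, small cost per segment, which may be enough when only the consumed offset range matters.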


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


