Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/20 14:20:01 UTC

[GitHub] [pinot] jadami10 opened a new issue, #8929: Allowing empty segments with kafka consumption

jadami10 opened a new issue, #8929:
URL: https://github.com/apache/pinot/issues/8929

   We have a fairly unusual consumption pattern that causes ingestion problems in Pinot.
   
   - Our Pinot table is set to consume topic T across P partitions.
   - Topic T carries many different shapes of events, but at ingestion time we filter out any event that does not meet criteria C.
   - Topic T has a high number of events per second.
   - For some of the P partitions, there may be 0 events matching our criteria for days at a time.
   
   What ends up happening is that Pinot never seals the segments for those partitions and instead keeps extending their lease. When we restart Pinot servers, they resume consuming from offsets that are days old. As a result, a huge amount of data is consumed only to be filtered out again, we get throttled on the Kafka side, and we wait hours for the servers to become healthy again.
   
   I believe Kinesis already has a way to seal "empty segments"; we would need the same here to get Pinot to keep advancing offsets correctly.
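
To make the cost of the behavior described above concrete, here is a rough Python model of it. It is not Pinot code: the function name, the idea of a fixed-size "flush window", and the rule that a window with zero retained rows is never sealed are all simplifying assumptions made for illustration.

```python
def events_replayed_on_restart(matched_per_window, events_per_window):
    """Return how many events a restarted consumer would have to re-read.

    matched_per_window: for each flush window, the number of events that
        passed the ingestion filter (hypothetical input, not a Pinot metric).
    events_per_window: events read from the partition in each window.

    Assumption: a window that kept zero rows is not sealed, so the
    committed offset does not move even though the consumer read ahead.
    """
    committed_offset = 0  # offset persisted by the last sealed segment
    current_offset = 0    # in-memory position of the consumer
    for matched in matched_per_window:
        current_offset += events_per_window
        if matched > 0:
            # Segment is sealed, so the checkpoint catches up.
            committed_offset = current_offset
    # After a restart, consumption resumes from the committed offset.
    return current_offset - committed_offset


# One window with matches, then three windows where everything is filtered out.
print(events_replayed_on_restart([5, 0, 0, 0], events_per_window=1000))
```

In this model the gap between the two offsets grows with every window in which the filter rejects everything, which is the backlog that has to be re-read and re-filtered after a restart.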


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jadami10 commented on issue #8929: Allowing empty segments with kafka consumption

Posted by GitBox <gi...@apache.org>.
jadami10 commented on issue #8929:
URL: https://github.com/apache/pinot/issues/8929#issuecomment-1175515116

   Accidentally duplicated in https://github.com/apache/pinot/issues/9014. Closing this.


[GitHub] [pinot] Jackie-Jiang commented on issue #8929: Allowing empty segments with kafka consumption

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #8929:
URL: https://github.com/apache/pinot/issues/8929#issuecomment-1160770448

   This is a fair ask. Since we introduced filtering during ingestion, if the offset moves but no record is ingested, we should commit an empty segment and advance the offset. cc @npawar @mcvsubbu
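
The rule suggested in this comment could be sketched roughly as follows. This is a hypothetical illustration in Python, not Pinot's implementation: the enum, the function name, and the parameters are invented here, and the real end-of-segment logic involves more state than three integers.

```python
from enum import Enum


class EndOfWindowAction(Enum):
    KEEP_CONSUMING = "keep_consuming"
    COMMIT_SEGMENT = "commit_segment"
    COMMIT_EMPTY_SEGMENT = "commit_empty_segment"


def decide_at_end_of_window(start_offset, current_offset, rows_indexed):
    """Pick an action when a consuming segment reaches its end criteria.

    All names are hypothetical. The point is that the decision keys on
    whether the offset moved, not only on whether rows were indexed.
    """
    if current_offset < start_offset:
        raise ValueError("current_offset must not be behind start_offset")
    if current_offset == start_offset:
        # Nothing was read from the stream, so there is nothing to checkpoint.
        return EndOfWindowAction.KEEP_CONSUMING
    if rows_indexed == 0:
        # Events were read but all filtered out: seal an empty segment
        # so the committed offset still advances.
        return EndOfWindowAction.COMMIT_EMPTY_SEGMENT
    return EndOfWindowAction.COMMIT_SEGMENT


print(decide_at_end_of_window(start_offset=100, current_offset=5100, rows_indexed=0))
```

Separating "offset moved" from "rows indexed" is what lets a fully filtered partition still checkpoint its progress.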


[GitHub] [pinot] jadami10 closed issue #8929: Allowing empty segments with kafka consumption

Posted by GitBox <gi...@apache.org>.
jadami10 closed issue #8929: Allowing empty segments with kafka consumption
URL: https://github.com/apache/pinot/issues/8929

