Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/11/24 23:35:50 UTC

[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6287: Parallelize segment index init and building

mcvsubbu commented on pull request #6287:
URL: https://github.com/apache/incubator-pinot/pull/6287#issuecomment-733294374


   In the realtime path, garbage collection during segment build has been a bigger problem for us than the segment build itself. In the offline path, where segments are built in Hadoop/Spark, we have not had issues with GC or performance.
   
   Here is a prototype that I did a while ago; using a columnar segment builder reduced GC significantly.
   
   https://github.com/apache/incubator-pinot/issues/4036
   
   Please use a similar technique and post GC results for some use cases where you hit performance problems (yes, unfortunately, the amount of garbage depends on the kind of columns you have, and I suspect performance varies the same way).
   
   In which scenario are you facing performance problems?
   
   Lastly, if you could please add a short paragraph outlining your approach, that will help reviewers. Thanks.




