You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/03/29 19:59:29 UTC

[GitHub] [incubator-pinot] mcvsubbu opened a new issue #4036: Reduce heap usage when building realtime segments

mcvsubbu opened a new issue #4036: Reduce heap usage when building realtime segments
URL: https://github.com/apache/incubator-pinot/issues/4036
 
 
   Reducing heap usage while building completed segments. Currently, the segment builder is designed to read incoming data row by row, and build dictionaries in a hash table before translating them to the on-disk format of a dictionary. We can by-pass these steps since we already have the segment in columnar format (realtime consumers ingest rows but store in a columnar format for serving queries). Initial prototype has shown significant reduction in heap usage during segment builds. If we reduce heap usage (better yet, move completely to off-heap based segment completion) more segments can be packed into a single host, saving hardware cost. If a higher latency can be tolerated, these hosts could use SSDs and map off-heap memory from files (Pinot already provides primitives for doing these)
   
   Prototype: https://github.com/apache/incubator-pinot/tree/columnar-segment-builder

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org