Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/01/05 07:13:38 UTC

[GitHub] [pinot] richardstartin commented on issue #7973: chunk compression type is hardcoded to passthrough for metric columns

richardstartin commented on issue #7973:
URL: https://github.com/apache/pinot/issues/7973#issuecomment-1005437083


   This makes sense the way it is for a couple of reasons:
   * Chunks for metric columns are tiny: 4-8KB depending on the data type, so a column scan would have to decompress many chunks.
   * General-purpose compression algorithms work better on text than on arbitrary numeric data, so the compression ratio for the average user’s column likely wouldn’t be very good.
   
   These two factors combine to make a less than compelling case for general-purpose compression of metric columns.
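
   As a rough illustration of both points, here is a minimal Python sketch. It is not Pinot code: `zlib` stands in for a general-purpose codec, the 8KB chunk size is the upper end of the range quoted above, and `chunks_needed`, `compression_ratio` and the sample data are hypothetical. Uniformly random doubles are a pessimistic stand-in for "arbitrary numeric data"; real metric columns may behave differently.

```python
import random
import struct
import zlib

CHUNK_BYTES = 8 * 1024  # assumed chunk size: upper end of the 4-8KB range


def chunks_needed(num_values, value_bytes, chunk_bytes=CHUNK_BYTES):
    """Number of chunks a full column scan would have to decompress."""
    total_bytes = num_values * value_bytes
    return -(-total_bytes // chunk_bytes)  # ceiling division on integers


def compression_ratio(chunk):
    """Uncompressed size divided by zlib-compressed size."""
    return len(chunk) / len(zlib.compress(chunk))


rng = random.Random(0)  # fixed seed so the sample data is reproducible

# One chunk of 1024 random 8-byte doubles (exactly 8KB).
numeric_chunk = struct.pack("<1024d", *(rng.random() for _ in range(1024)))

# One chunk of text built from a small, made-up vocabulary (exactly 8KB).
words = ["pinot", "segment", "column", "metric", "dictionary", "forward",
         "index", "chunk", "query", "broker", "server", "table"]
text_chunk = " ".join(rng.choice(words) for _ in range(2000)).encode("ascii")
text_chunk = text_chunk[:CHUNK_BYTES]

numeric_ratio = compression_ratio(numeric_chunk)
text_ratio = compression_ratio(text_chunk)

print("chunks for 100M doubles:", chunks_needed(100_000_000, 8))
print("numeric ratio: %.2f, text ratio: %.2f" % (numeric_ratio, text_ratio))
```

   Under these assumptions a scan over 100 million doubles touches tens of thousands of chunks, and each numeric chunk shrinks far less than a text chunk of the same size.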
   
   There are numerous encoding techniques that could be explored for metric columns in the future; these tend to produce better space reductions than general-purpose compression and are faster to decode.
   
   If you have a metric column which you expect to be compressible because it has lots of duplicates, it would be worth experimenting with a dictionary-encoded column instead.
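
   To see why duplicates favour a dictionary, here is a simplified Python model of dictionary encoding. It is a sketch under stated assumptions, not Pinot's on-disk format: `dictionary_encode`, `encoded_size_bytes` and the sample column are hypothetical, values are assumed to be 8-byte longs, and dictionary ids are assumed to be bit-packed at the minimum fixed width.

```python
import random


def dictionary_encode(values):
    """Return (sorted distinct values, list of dictionary ids)."""
    dictionary = sorted(set(values))
    id_of = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [id_of[value] for value in values]


def dictionary_decode(dictionary, ids):
    """Rebuild the original column from the dictionary and ids."""
    return [dictionary[i] for i in ids]


def encoded_size_bytes(dictionary, ids, value_bytes=8):
    """Dictionary size plus ids packed at the minimum fixed bit width."""
    bits_per_id = max(1, (len(dictionary) - 1).bit_length())
    packed_id_bytes = -(-len(ids) * bits_per_id // 8)  # ceiling division
    return len(dictionary) * value_bytes + packed_id_bytes


# Made-up column: 10,000 values drawn from only 16 distinct longs.
rng = random.Random(0)
distinct = [1000 * i for i in range(16)]
column = distinct * 625
rng.shuffle(column)

dictionary, ids = dictionary_encode(column)
raw_bytes = len(column) * 8
dict_bytes = encoded_size_bytes(dictionary, ids)

print("raw:", raw_bytes, "bytes; dictionary-encoded:", dict_bytes, "bytes")
```

   With 16 distinct values each id needs 4 bits, so the modelled column takes 5,128 bytes instead of 80,000. The saving shrinks as cardinality grows, which is why this is worth measuring on real data.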


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
