Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/12/05 05:02:25 UTC

[GitHub] [pinot] siddharthteotia opened a new issue #7870: Possible storage optimization for MV forward index

siddharthteotia opened a new issue #7870:
URL: https://github.com/apache/pinot/issues/7870


   We recently observed a huge increase in table size after adding an MV (multi-value) column: the table grew from 2TB to 12TB.
   
   The column was dictionary encoded (at that time, raw MV column support either had not been added or was still in progress). The column is needed in the WHERE clause, so it is dictionary encoded and has an inverted index. Both the forward index and the inverted index have contributed to the size increase. Some stats from a sample segment:
   
   mvCol.cardinality = 131483
   mvCol.totalDocs = 287714
   mvCol.dictionary.size = 525940
   mvCol.forward_index.size = 800860697
   mvCol.inverted_index.size = 553252156
   mvCol.maxNumberOfMultiValues = 16649 
   mvCol.totalNumberOfEntries = 336962215
   
   The number of docs in the segment is fairly low at 287k. Given that totalNumberOfEntries is around 336 million, the cardinality of around 131k is very low.
   
   Most of the 800MB fwd index size comes from the rawDataSize section, computed as:
   
   `rawDataSize = ((long) totalNumValues * numBitsPerValue + 7) / 8;`
   
   numBitsPerValue is 18 given that the cardinality is 131k.
   
   So the dictId for each of the 336 million values is encoded with 18 bits in the rawData section.
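
   As a rough check of the arithmetic above, the following Python sketch plugs the sample segment stats into the same formula. The helper names are illustrative, and deriving numBitsPerValue from the bit length of the largest dictId is an assumption about how the width is chosen, not something stated in this issue.

```python
def num_bits_per_value(cardinality: int) -> int:
    """Bits needed for dictIds 0 .. cardinality - 1 (assumed rule, at least 1 bit)."""
    return max(1, (cardinality - 1).bit_length())


def raw_data_size(total_num_values: int, bits_per_value: int) -> int:
    """Bytes for the bit-packed values, rounded up to a whole byte."""
    return (total_num_values * bits_per_value + 7) // 8


# Stats copied from the sample segment in this issue.
cardinality = 131483
total_entries = 336962215
forward_index_size = 800860697

bits = num_bits_per_value(cardinality)
size = raw_data_size(total_entries, bits)
print(bits, size, forward_index_size - size)
```

   Under these assumptions the bit-packed dictIds account for roughly 95% of the forward index, and the rest is whatever else the index file stores.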
   
   I haven't thought much about a solution yet, but given that there is so much duplicate data in the above sample (and I have checked that there are repetitive runs), one potential approach is to use RLE along with bit packing, where the run length itself is bit-packed, and/or a hybrid of RLE and bit packing that switches between the two depending on the data (something like what Parquet does).
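
   A minimal sketch of the RLE idea, assuming each run is stored as one bit-packed dictId plus one bit-packed run length. The layout, the function names, and the tiny sample are hypothetical; this is not Pinot's or Parquet's actual format.

```python
from itertools import groupby


def run_length_encode(dict_ids):
    """Collapse consecutive equal dictIds into (dict_id, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(dict_ids)]


def run_length_decode(runs):
    """Expand (dict_id, run_length) pairs back into the original sequence."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


def packed_bits(dict_ids, bits_per_value):
    """Size of plain fixed-width bit packing, in bits."""
    return len(dict_ids) * bits_per_value


def rle_bits(runs, bits_per_value):
    """Size if every run costs one dictId plus one fixed-width run length."""
    max_run = max(length for _, length in runs)
    bits_per_length = max(1, max_run.bit_length())
    return len(runs) * (bits_per_value + bits_per_length)


sample = [7] * 6 + [42] * 3 + [7] * 5 + [9]
runs = run_length_encode(sample)
print(runs, packed_bits(sample, 18), rle_bits(runs, 18))
```

   When the data has no runs, this layout costs more than plain bit packing (every value pays for a run length as well), which is the case a hybrid scheme that switches per block is meant to handle.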
   
   Another solution would be variable-length bit encoding: instead of the current approach of using a fixed number of bits (essentially the maximum number of bits) for each dictId, we use only as many bits as the value needs. But in this case, a fixed 5 bits per dictId will be needed to indicate how many bits are used to encode that dictId.
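
   To make the trade-off concrete, here is a small sketch of such a length-prefixed encoding. Storing `width - 1` in the 5-bit prefix and using a string of '0'/'1' characters instead of a real bit buffer are simplifications chosen for readability, not a proposed on-disk format.

```python
LENGTH_PREFIX_BITS = 5  # holds (payload width - 1), so widths 1..32 fit


def encode(dict_ids):
    """Encode each dictId as a 5-bit width prefix followed by its minimal bits."""
    out = []
    for dict_id in dict_ids:
        width = max(1, dict_id.bit_length())
        out.append(format(width - 1, "05b"))
        out.append(format(dict_id, "0{}b".format(width)))
    return "".join(out)


def decode(bits):
    """Inverse of encode: read a prefix, then that many payload bits."""
    dict_ids, pos = [], 0
    while pos < len(bits):
        width = int(bits[pos:pos + LENGTH_PREFIX_BITS], 2) + 1
        pos += LENGTH_PREFIX_BITS
        dict_ids.append(int(bits[pos:pos + width], 2))
        pos += width
    return dict_ids


sample = [0, 1, 5, 300, 131482]
encoded = encode(sample)
print(len(encoded), "bits vs", len(sample) * 18, "bits fixed-width")
```

   Note that a dictId needing the full 18 bits costs 23 bits here, so this only pays off when most of the stored dictIds are small. It also gives up constant-time random access by position unless some offset structure is kept alongside.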
   
   Another way could be to have a sort of dictionary on top of the dictionary. This can work if entire arrays are duplicated. For example, if the dictId array [1, 2, 3, 4, 5, 10] appears multiple times across rows/docs, we create a dictId for this dictId array and use that new dictId to encode the row instead of the full array of dictIds.
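
   The dictionary-on-dictionary idea can be sketched as follows, assuming whole-array duplication. `build_array_dictionary` and the sample rows are hypothetical names and data used only for illustration.

```python
def build_array_dictionary(rows):
    """Map each distinct dictId array to an array id; each row becomes one id."""
    array_to_id = {}
    array_dictionary = []
    encoded_rows = []
    for row in rows:
        key = tuple(row)
        if key not in array_to_id:
            array_to_id[key] = len(array_dictionary)
            array_dictionary.append(key)
        encoded_rows.append(array_to_id[key])
    return array_dictionary, encoded_rows


rows = [[1, 2, 3, 4, 5, 10], [7, 8], [1, 2, 3, 4, 5, 10], [1, 2, 3, 4, 5, 10], [7, 8]]
array_dictionary, encoded_rows = build_array_dictionary(rows)

# Each distinct array is stored once; rows only hold the array id.
stored_ids = sum(len(a) for a in array_dictionary)
original_ids = sum(len(r) for r in rows)
print(encoded_rows, stored_ids, original_ids)
```

   The saving depends entirely on how often complete arrays repeat. Rows that share only part of their values get no benefit from this scheme.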
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org

[GitHub] [pinot] siddharthteotia commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

siddharthteotia commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-991484703


   Working on adding support for compressing raw index with dictionary



[GitHub] [pinot] ashishkf commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

ashishkf commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-986309125


   BTW - compressing fwd index for dictionary encoded columns can/should be done for SV columns also.



[GitHub] [pinot] richardstartin commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

richardstartin commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-1025004368


   In addition to the above, given that there are runs, LZ4 will do a better job if the values are byte aligned. Runs of 18-bit dictionary codes look like a random sequence of bytes to LZ4 because the 2-bit offsets scramble the bytes (I have a demo of this effect with base64 encoding). It’s worth experimenting with rounding up to 24 bits prior to LZ4 compression, iff LZ4 compression is enabled.
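
   The alignment effect can be seen without LZ4 itself. The sketch below packs a run of one hypothetical dictId at 18 and at 24 bits and reports the period of the resulting byte pattern. `pack` and `period` are illustrative helpers, the big-endian bit order is an assumption, and the result says nothing about actual LZ4 compression ratios.

```python
def pack(values, width):
    """Bit-pack values at a fixed width, most significant bit first."""
    acc = 0
    for v in values:
        acc = (acc << width) | v
    total_bits = len(values) * width
    pad = (-total_bits) % 8  # zero-pad the tail to a whole byte
    acc <<= pad
    return acc.to_bytes((total_bits + pad) // 8, "big")


def period(data):
    """Smallest p such that data[i] == data[i + p] for every valid i."""
    for p in range(1, len(data) + 1):
        if all(data[i] == data[i + p] for i in range(len(data) - p)):
            return p
    return len(data)


dict_id = 0x1ABCD  # arbitrary value that fits in 18 bits
run = [dict_id] * 8

packed18 = pack(run, 18)
packed24 = pack(run, 24)
print(packed18.hex(), period(packed18))
print(packed24.hex(), period(packed24))
```

   With 18-bit codes the same dictId lands on four different bit offsets, so its byte image only repeats every 9 bytes. With 24-bit codes the same 3 bytes repeat for every value, which is easier for a byte-oriented compressor to match.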



[GitHub] [pinot] ashishkf commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

ashishkf commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-986306333


   +1 for using LZ4 compression for the dictionary encoded fwd index.




[GitHub] [pinot] richardstartin edited a comment on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

richardstartin edited a comment on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-1025002881


   @siddharthteotia how far have you got with this? Do you have the frequency distribution of the column? I have some ideas for variable length dictionary encoding which will pay off if the frequency distribution has skew and we sort the dictionary by descending frequency.
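
   The comment does not spell out the scheme, so the following is only one possible reading: reassign dictIds in descending frequency order so that frequent values get small ids, then measure how many bits the ids need at their minimal widths. The function names and the skewed sample are hypothetical.

```python
from collections import Counter


def remap_by_frequency(dict_ids):
    """Assign new ids so the most frequent old dictId becomes 0, and so on."""
    counts = Counter(dict_ids)
    # Ties are broken by the old dictId to keep the mapping deterministic.
    ordered = sorted(counts, key=lambda d: (-counts[d], d))
    old_to_new = {old: new for new, old in enumerate(ordered)}
    return old_to_new, [old_to_new[d] for d in dict_ids]


def payload_bits(dict_ids):
    """Total bits if each id used only its own minimal width (no length overhead)."""
    return sum(max(1, d.bit_length()) for d in dict_ids)


sample = [100000] * 8 + [90000] * 4 + [3] * 2 + [1]
old_to_new, remapped = remap_by_frequency(sample)
print(payload_bits(sample), payload_bits(remapped))
```

   The gain comes from the skew: with a flat frequency distribution the remapping changes little, and any real encoding still has to pay for delimiting the variable-length codes.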



[GitHub] [pinot] kishoreg commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

kishoreg commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-986264969


   Daniel Lemire has done a lot of work on integer compression techniques. In fact, we did use PFOR and PFORDelta in the early days of Pinot and removed them because of the latency requirements of site-facing use cases. It might be worth revisiting given the broader use cases for Pinot.
   
   https://github.com/lemire/FastPFor - there are a lot of references 
   
   




[GitHub] [pinot] Jackie-Jiang commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

Jackie-Jiang commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-993010573


   > Working on adding support for compressing raw index with dictionary
   
   @siddharthteotia For clarification, are you suggesting compressing bit-compressed forward index when a column is dictionary encoded?



[GitHub] [pinot] kishoreg commented on issue #7870: Possible storage optimization for MV forward index

Posted by GitBox <gi...@apache.org>.

kishoreg commented on issue #7870:
URL: https://github.com/apache/pinot/issues/7870#issuecomment-986262501


   Even though the number of docs is low, the total number of entries is quite high - 336 million. There is not much we can do to reduce the size of the forward index without sacrificing access speed. One option would be to eliminate the forward index if it's not needed post-filtering and just keep the inverted index, similar to what we do for some text index columns.
   
   One thing that stands out is that the inverted index size (500 MB) seems quite high for 336 million entries. Can you verify whether we do the runCompress on the bitmap? cc @richardstartin, who added something here recently.
   
   I like the 4th idea, will be great to apply compression on any column with or without dictionary encoding

