Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2019/06/14 20:21:37 UTC

[GitHub] [incubator-pinot] buchireddy opened a new issue #4317: Support variable length Offline Dictionary Indexes for bytes, strings and maps to save on storage

buchireddy opened a new issue #4317: Support variable length Offline Dictionary Indexes for bytes, strings and maps to save on storage
URL: https://github.com/apache/incubator-pinot/issues/4317

**What?**
Currently, the dictionary index for offline segments stores bytes and string values in fixed-size slots: each slot is as large as the longest value, and shorter values are padded with "0". See org.apache.pinot.core.io.util.FixedByteValueReaderWriter.
The idea is to avoid the padding and store byte arrays, strings and maps of varying lengths without slowing down lookups much.

**Why?**
Fixed-size storage is good for fast lookups but very inefficient in space. For example, if the largest value in a String column is 100 bytes but the average is only 10 bytes, about 90% of the dictionary is padding. The same applies to byte[], maps, etc.
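
To make the arithmetic concrete, here is a rough sketch that compares the two layouts for a given set of value lengths. It assumes the variable-length layout costs one 4-byte int offset per value plus one trailing end offset; that cost model and the class and method names are illustrative assumptions, not existing Pinot code.

```java
public final class PaddingEstimate {
    /** Bytes used when every value is padded to the length of the longest one. */
    static long fixedBytes(int[] lengths) {
        int max = 0;
        for (int len : lengths) {
            max = Math.max(max, len);
        }
        return (long) max * lengths.length;
    }

    /**
     * Bytes used when values are stored unpadded, assuming (hypothetically)
     * one 4-byte offset per value plus one trailing end offset.
     */
    static long varLengthBytes(int[] lengths) {
        long payload = 0;
        for (int len : lengths) {
            payload += len;
        }
        return payload + (long) Integer.BYTES * (lengths.length + 1);
    }

    public static void main(String[] args) {
        // One 100-byte value and eighteen 5-byte values: the average is 10 bytes.
        int[] lengths = new int[19];
        java.util.Arrays.fill(lengths, 5);
        lengths[0] = 100;
        System.out.println("fixed: " + fixedBytes(lengths) + " bytes");
        System.out.println("variable: " + varLengthBytes(lengths) + " bytes");
    }
}
```

Note that when all values have the same length, `varLengthBytes` is larger than `fixedBytes` by the size of the offset table, which is the overhead discussed under "How?".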

**How?**
Currently, `FixedByteValueReaderWriter` writes the sorted values directly into the buffer, starting at offset "0" and at fixed lengths. For an Int dictionary, for example, the first value is at offset "0", the second at offset "4", and so on. No additional metadata is needed in the buffer.
The idea is to store the offset of each element at the beginning of the buffer so that element sizes needn't be fixed. To look up an element, we first read its offset and then read the actual element. This means two reads from the buffer (first the int offset, then the element itself), but the offset read should be fast enough that it doesn't slow down the overall operation much.
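
A minimal sketch of one possible layout, to illustrate the two-step lookup. The layout assumed here is `[int count][int offset_0 .. offset_count][payload]`, with absolute offsets and one extra end offset so that each length is the difference between adjacent offsets. The class name, method names and the layout details are assumptions for illustration, not a settled format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public final class VarLengthBufferSketch {
    /** Writes sorted values as: count, (count + 1) absolute offsets, then unpadded payload. */
    static ByteBuffer write(byte[][] sortedValues) {
        int n = sortedValues.length;
        int headerSize = Integer.BYTES * (n + 2); // count + (n + 1) offsets
        int payloadSize = 0;
        for (byte[] value : sortedValues) {
            payloadSize += value.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(headerSize + payloadSize);
        buffer.putInt(0, n);
        int offset = headerSize;
        for (int i = 0; i < n; i++) {
            buffer.putInt(Integer.BYTES * (i + 1), offset); // start of value i
            byte[] value = sortedValues[i];
            for (int j = 0; j < value.length; j++) {
                buffer.put(offset + j, value[j]);
            }
            offset += value.length;
        }
        buffer.putInt(Integer.BYTES * (n + 1), offset); // end of the last value
        return buffer;
    }

    /** Two reads of the offset table, then one copy of the element itself. */
    static byte[] read(ByteBuffer buffer, int index) {
        int start = buffer.getInt(Integer.BYTES * (index + 1));
        int end = buffer.getInt(Integer.BYTES * (index + 2));
        byte[] value = new byte[end - start];
        for (int j = 0; j < value.length; j++) {
            value[j] = buffer.get(start + j);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[][] values = {
            "apple".getBytes(StandardCharsets.UTF_8),
            "kiwi".getBytes(StandardCharsets.UTF_8),
            "watermelon".getBytes(StandardCharsets.UTF_8)
        };
        ByteBuffer buffer = write(values);
        System.out.println(new String(read(buffer, 1), StandardCharsets.UTF_8));
    }
}
```

Because the values stay sorted and each offset lookup is constant-time, a binary search over the dictionary should still work as before, just with the extra offset reads per probe.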

A few things to note:
* If all values of a byte[], string or map column have the same length, this approach only adds storage overhead and one additional lookup, so it might not be preferable. Hence, we can have a flag/property at the column level to decide whether to use the VarLengthByteValueReaderWriter or not.
* Backward compatibility must not be broken, which means we need to introduce some kind of header into the buffer to distinguish the on-disk storage formats.
* We need to run benchmarks to measure the lookup overhead added by this approach.
* If possible, we should also benchmark the storage savings of the new approach so that we can make data-driven decisions.
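
As a rough illustration of the header idea from the list above, a reader could check for a marker at the start of the buffer before choosing a format. The magic value, the version field and all names below are hypothetical. Since the existing fixed-size format has no header, its first bytes are value data, so a real implementation would have to choose a marker that cannot be confused with legitimate data (or record the format in segment metadata instead).

```java
import java.nio.ByteBuffer;

public final class FormatHeaderSketch {
    // Hypothetical marker and version; not values defined by Pinot.
    static final int MAGIC = 0x564C4449;
    static final int VERSION = 1;
    static final int HEADER_SIZE = 2 * Integer.BYTES;

    /** Stamps the start of a buffer as the (hypothetical) variable-length format. */
    static void writeHeader(ByteBuffer buffer) {
        buffer.putInt(0, MAGIC);
        buffer.putInt(Integer.BYTES, VERSION);
    }

    /** Returns true only if the buffer starts with the expected marker and version. */
    static boolean isVarLength(ByteBuffer buffer) {
        return buffer.capacity() >= HEADER_SIZE
            && buffer.getInt(0) == MAGIC
            && buffer.getInt(Integer.BYTES) == VERSION;
    }

    public static void main(String[] args) {
        ByteBuffer stamped = ByteBuffer.allocate(16);
        writeHeader(stamped);
        ByteBuffer legacy = ByteBuffer.allocate(16); // stands in for an old fixed-size buffer
        System.out.println(isVarLength(stamped) + " " + isVarLength(legacy));
    }
}
```

Buffers that fail the check would keep going through the existing fixed-size reader, so old segments remain readable.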

Thanks @kishoreg for pointing out this problem and for the brainstorming.

P.S: This was originally tracked in https://github.com/winedepot/pinot/issues/24
