Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/10 14:42:49 UTC

[GitHub] [arrow-rs] tustvold opened a new issue #1416: DeltaBitPackEncoder Pads Miniblock BitWidths With Arbitrary Values

tustvold opened a new issue #1416:
URL: https://github.com/apache/arrow-rs/issues/1416


   **Describe the bug**
   
   The encoder at https://github.com/apache/arrow-rs/blob/master/parquet/src/encodings/encoding.rs#L577 skips over the bytes reserved for the miniblock bit widths, then goes back and writes a bit width only for the miniblocks that contain at least one value. The bit widths of empty miniblocks are left as whatever bytes happen to be in the encoder's buffer.
   
   **To Reproduce**
   
   This is one of the underlying bugs behind https://github.com/apache/arrow-datafusion/issues/1976
   
   **Expected behavior**
   
   Whilst the specification technically allows arbitrary padding, it seems like a good idea to avoid non-deterministic output where possible.
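
One way to get deterministic padding is to zero the reserved bit-width bytes before filling in the real widths. The following is a minimal sketch of that idea, not the arrow-rs implementation: the function `write_miniblock_bit_widths`, its parameters, and the one-byte-per-miniblock buffer layout are all hypothetical and chosen only for illustration.

```rust
/// Hypothetical helper, not the arrow-rs API: writes one bit-width byte per
/// miniblock into `buf`, starting at `offset`.
///
/// `widths` holds the bit widths of the miniblocks that actually contain
/// values; the remaining `num_miniblocks - widths.len()` slots are padding.
fn write_miniblock_bit_widths(
    buf: &mut [u8],
    offset: usize,
    num_miniblocks: usize,
    widths: &[u8],
) {
    assert!(widths.len() <= num_miniblocks);
    let header = &mut buf[offset..offset + num_miniblocks];

    // Zero the whole header first so unused slots never expose stale bytes.
    header.fill(0);

    // Then record the real bit width for each populated miniblock.
    header[..widths.len()].copy_from_slice(widths);
}

fn main() {
    // Simulate a reused buffer that still holds bytes from earlier writes.
    let mut buf = vec![0xAB_u8; 8];

    // Four miniblocks per block, but only two of them contain values.
    write_miniblock_bit_widths(&mut buf, 2, 4, &[3, 5]);

    // The two padding slots are 0 rather than the stale 0xAB.
    assert_eq!(&buf[2..6], &[3u8, 5, 0, 0]);
    println!("{:?}", buf);
}
```

Without the `fill(0)` call, the two padding slots would keep the stale `0xAB` bytes, so the same input could encode to different output depending on the buffer's previous contents.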
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb closed issue #1416: DeltaBitPackEncoder Pads Miniblock BitWidths With Arbitrary Values

Posted by GitBox <gi...@apache.org>.
alamb closed issue #1416:
URL: https://github.com/apache/arrow-rs/issues/1416


   

