Posted to dev@parquet.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2016/08/01 17:34:20 UTC
[jira] [Updated] (PARQUET-671) Improve performance of RLE/bit-packed decoding in parquet-cpp
[ https://issues.apache.org/jira/browse/PARQUET-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated PARQUET-671:
---------------------------------
Assignee: Eric Daniel
> Improve performance of RLE/bit-packed decoding in parquet-cpp
> -------------------------------------------------------------
>
> Key: PARQUET-671
> URL: https://issues.apache.org/jira/browse/PARQUET-671
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-cpp
> Reporter: Eric Daniel
> Assignee: Eric Daniel
>
> Three changes can dramatically improve decoding performance:
> - when decoding repeated values in the RLE/dictionary encoding, do the dictionary lookup only once per run
> - when decoding bit-packed sequences, decode in batches so the bit unpacker's state can be kept in registers (instead of updating member variables for every decoded value)
> - use Daniel Lemire's fast unpacking routines whenever possible (https://github.com/lemire/FrameOfReference/)
> I have a PR ready to implement these changes.
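
A minimal sketch of the first two ideas in the description, for illustration only; it is not the code from the PR and not the parquet-cpp API. The names `UnpackBatch` and `DecodeRleRun` are hypothetical. It assumes bit widths of 1 to 32 and the LSB-first bit order used by Parquet's RLE/bit-packed hybrid encoding. The third idea (Daniel Lemire's unpacking routines) is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: unpack `count` values of `bit_width` bits each from
// `in` into `out`. The unpacker state (bit buffer, bit count, read position)
// lives in local variables for the whole batch, so the compiler can keep it
// in registers instead of writing back to object members per value.
inline void UnpackBatch(const std::uint8_t* in, std::size_t in_len,
                        int bit_width, std::size_t count,
                        std::uint32_t* out) {
  assert(bit_width >= 1 && bit_width <= 32);
  assert(in_len * 8 >= count * static_cast<std::size_t>(bit_width));

  std::uint64_t buffer = 0;  // bits read from `in` but not yet consumed
  int bits_in_buffer = 0;    // number of valid bits in `buffer`
  std::size_t pos = 0;       // next byte of `in` to read
  const std::uint64_t mask = (std::uint64_t{1} << bit_width) - 1;

  for (std::size_t i = 0; i < count; ++i) {
    // Refill one byte at a time until a full value is available.
    while (bits_in_buffer < bit_width) {
      buffer |= static_cast<std::uint64_t>(in[pos++]) << bits_in_buffer;
      bits_in_buffer += 8;
    }
    out[i] = static_cast<std::uint32_t>(buffer & mask);
    buffer >>= bit_width;
    bits_in_buffer -= bit_width;
  }
}

// Hypothetical helper: expand one RLE run of a dictionary-encoded column.
// The dictionary is consulted once for the run, not once per output value.
template <typename T>
void DecodeRleRun(const std::vector<T>& dictionary, std::uint32_t index,
                  std::size_t run_length, std::vector<T>* out) {
  assert(index < dictionary.size());
  const T value = dictionary[index];  // single lookup for the whole run
  out->insert(out->end(), run_length, value);
}
```

With a bit width of 3, the bytes 0x88 0xC6 0xFA unpack to the values 0 through 7. A byte-at-a-time refill keeps the sketch short; a tuned implementation would read whole words or use width-specialized routines.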
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)