Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/05/23 18:02:00 UTC

[jira] [Commented] (IMPALA-8253) Implement delta encoding in Parquet

    [ https://issues.apache.org/jira/browse/IMPALA-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725521#comment-17725521 ] 

ASF subversion and git services commented on IMPALA-8253:
---------------------------------------------------------

Commit dc63ae514a445e3f197cab405b01a30c58015695 in impala's branch refs/heads/master from Daniel Becker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=dc63ae514 ]

IMPALA-8253: Parquet delta encoding and decoding.

Implemented an encoder and decoder for the Parquet delta encoding (see
https://github.com/apache/parquet-format/blob/master/Encodings.md).
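
For orientation, the core idea of DELTA_BINARY_PACKED can be sketched as
follows. This is a simplified illustration based on the Parquet format
description linked above, not the code from this commit: it handles a single
block, skips the header, the miniblock split and the bit packing, and the
names DeltaBlock and MakeDeltaBlock are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the quantities one block of the encoding needs.
struct DeltaBlock {
  int64_t first_value = 0;                // stored once in the page header
  int64_t min_delta = 0;                  // stored once per block
  int bit_width = 0;                      // bits needed per relative delta
  std::vector<uint64_t> relative_deltas;  // delta - min_delta, always >= 0
};

// Computes the deltas between consecutive values, subtracts the minimum
// delta so that all stored numbers are non-negative, and derives the bit
// width required to bit-pack them. Differences are taken in unsigned
// arithmetic so that overflow wraps around instead of being undefined.
DeltaBlock MakeDeltaBlock(const std::vector<int64_t>& values) {
  DeltaBlock block;
  if (values.empty()) return block;
  block.first_value = values[0];
  if (values.size() == 1) return block;

  std::vector<int64_t> deltas;
  for (std::size_t i = 1; i < values.size(); ++i) {
    deltas.push_back(static_cast<int64_t>(
        static_cast<uint64_t>(values[i]) -
        static_cast<uint64_t>(values[i - 1])));
  }
  block.min_delta = *std::min_element(deltas.begin(), deltas.end());

  uint64_t max_relative = 0;
  for (int64_t d : deltas) {
    uint64_t relative =
        static_cast<uint64_t>(d) - static_cast<uint64_t>(block.min_delta);
    block.relative_deltas.push_back(relative);
    max_relative = std::max(max_relative, relative);
  }
  // Count the bits needed to represent the largest relative delta.
  while (max_relative != 0) {
    ++block.bit_width;
    max_relative >>= 1;
  }
  return block;
}
```

For the input 7, 5, 3, 1, 2, 3, 4, 5 the deltas are -2, -2, -2, 1, 1, 1, 1,
so the minimum delta is -2, the relative deltas are 0, 0, 0, 3, 3, 3, 3 and
two bits per value are enough.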

The coders are not integrated with Impala yet; they only provide an
interface that Impala could use.

Added new methods to BitWriter and BatchedBitReader for handling 64-bit
ULEB128 and ZigZag integers.
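
As a rough illustration of these two integer encodings, here is a minimal
standalone sketch for 64-bit values. It is not the BitWriter or
BatchedBitReader code from this commit; the function names and the use of
std::vector as a byte buffer are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// ZigZag maps signed integers to unsigned ones so that values of small
// magnitude get small codes: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
inline uint64_t ZigZagEncode(int64_t v) {
  const uint64_t sign_mask = v < 0 ? ~uint64_t{0} : uint64_t{0};
  return (static_cast<uint64_t>(v) << 1) ^ sign_mask;
}

inline int64_t ZigZagDecode(uint64_t u) {
  // (0 - (u & 1)) is all ones when the lowest bit is set, else zero.
  return static_cast<int64_t>((u >> 1) ^ (uint64_t{0} - (u & 1)));
}

// ULEB128 stores an unsigned integer in groups of 7 bits, least significant
// group first. The high bit of each byte is set if more bytes follow.
inline void PutUleb128(uint64_t v, std::vector<uint8_t>* out) {
  while (v >= 0x80) {
    out->push_back(static_cast<uint8_t>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out->push_back(static_cast<uint8_t>(v));
}

// Reads one ULEB128 value starting at *pos and advances *pos past it.
// Returns false if the input ends early or the value is longer than the
// 10 bytes a 64-bit integer can need. Excess high bits in the tenth byte
// are silently dropped in this sketch.
inline bool GetUleb128(const std::vector<uint8_t>& in, std::size_t* pos,
                       uint64_t* v) {
  uint64_t result = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (*pos >= in.size()) return false;
    const uint8_t byte = in[(*pos)++];
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      *v = result;
      return true;
    }
  }
  return false;
}
```

A signed value is typically written as PutUleb128(ZigZagEncode(x), &buf) and
read back by applying ZigZagDecode to the result of GetUleb128. For example,
the unsigned value 300 is stored as the two bytes 0xAC 0x02.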

Also added a benchmark (parquet-delta-benchmark.cc) that compares the
space and CPU performance of plain, dictionary and delta encoding.

Testing:
  - Added new tests for the encoder and decoder.
  - Added tests covering the additions in BitPacking, BitWriter and
    BatchedBitReader.

Change-Id: Ie7378ac1a490a6c89a0a4349aae86cbc0fbc80f8
Reviewed-on: http://gerrit.cloudera.org:8080/12621
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Implement delta encoding in Parquet
> -----------------------------------
>
>                 Key: IMPALA-8253
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8253
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Assignee: Daniel Becker
>            Priority: Major
>              Labels: parquet
>
> For the definition of the protocol see DELTA_BINARY_PACKED in https://github.com/apache/parquet-format/blob/master/Encodings.md


