Posted to dev@parquet.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/24 03:36:00 UTC

[jira] [Commented] (PARQUET-2251) Avoid generating Bloomfilter when all pages of a column are encoded by dictionary

    [ https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692993#comment-17692993 ] 

ASF GitHub Bot commented on PARQUET-2251:
-----------------------------------------

yabola opened a new pull request, #1033:
URL: https://github.com/apache/parquet-mr/pull/1033

   In Parquet pageV1, a BloomFilter is still generated even when all pages of a column are dictionary-encoded. The BloomFilter is unnecessary in that case, and generating it costs time and storage.
   Parquet pageV2 does not generate a BloomFilter if all pages of a column are dictionary-encoded.
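
   The intended behavior can be sketched roughly as below. This is only an illustration, not the code in this PR: the class `BloomFilterDecision`, the enum `PageEncoding`, and both method names are hypothetical and do not exist in parquet-mr. The assumption is that the writer knows the encoding of every data page in the column chunk by the time it decides whether to emit the BloomFilter.

```java
import java.util.List;

// Hypothetical sketch: none of these names are part of the parquet-mr API.
final class BloomFilterDecision {

    // Simplified stand-in for the encoding of a single data page.
    enum PageEncoding { DICTIONARY, PLAIN }

    // True when every data page of the column chunk is dictionary-encoded,
    // i.e. the writer never fell back to another encoding.
    static boolean allPagesDictionaryEncoded(List<PageEncoding> pages) {
        return pages.stream().allMatch(e -> e == PageEncoding.DICTIONARY);
    }

    // Emit the BloomFilter only if it was requested for the column and
    // at least one page is not dictionary-encoded.
    static boolean shouldWriteBloomFilter(boolean bloomFilterEnabled,
                                          List<PageEncoding> pages) {
        return bloomFilterEnabled && !allPagesDictionaryEncoded(pages);
    }
}
```

   With a check like this, a fully dictionary-encoded column skips the BloomFilter on both page versions, while a column that fell back to another encoding on any page still gets one.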




> Avoid generating Bloomfilter when all pages of a column are encoded by dictionary
> ---------------------------------------------------------------------------------
>
>                 Key: PARQUET-2251
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2251
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Mars
>            Priority: Major
>
> In Parquet pageV1, a BloomFilter is still generated even when all pages of a column are dictionary-encoded. The BloomFilter is unnecessary in that case, and generating it costs time and storage.
> Parquet pageV2 does not generate a BloomFilter if all pages of a column are dictionary-encoded.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)