Posted to dev@parquet.apache.org by "Ferdinand Xu (JIRA)" <ji...@apache.org> on 2015/06/17 07:55:00 UTC

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

    [ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589347#comment-14589347 ] 

Ferdinand Xu commented on PARQUET-41:
-------------------------------------

Hi guys,
The pull request for parquet-mr is at https://github.com/apache/parquet-mr/pull/215, and the one for parquet-format is at https://github.com/apache/parquet-format/pull/28. The parquet-format PR defines the data structure. So far I have only added support for integers; the unit tests pass, and so do the tests on the Hive side. Please help me review these two PRs. Thank you!

> Add bloom filters to parquet statistics
> ---------------------------------------
>
>                 Key: PARQUET-41
>                 URL: https://issues.apache.org/jira/browse/PARQUET-41
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-format, parquet-mr
>            Reporter: Alex Levenson
>            Assignee: ferdinand xu
>              Labels: filter2
>
> For row groups with no dictionary, we could still produce a bloom filter. This could be very useful in filtering entire row groups.
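
To illustrate the idea in the issue description, here is a minimal sketch of a Bloom filter over integer values that a reader could consult before scanning a row group. It is not the code from either pull request: the class name IntBloomFilterSketch, the bit count, the number of hashes, and the mixing function are all assumptions chosen for illustration.

```java
import java.util.BitSet;

// Hypothetical sketch only; not the parquet-mr or parquet-format implementation.
public class IntBloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public IntBloomFilterSketch(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // 32-bit avalanche mixer (an assumed choice; any well-distributed hash would do).
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int index(int value, int i) {
        int h1 = mix(value);
        int h2 = mix(h1 ^ 0x9e3779b9);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Called by the writer for every value in the row group's column chunk.
    public void add(int value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i));
        }
    }

    // False means the value is definitely absent; true means it may be present.
    public boolean mightContain(int value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IntBloomFilterSketch filter = new IntBloomFilterSketch(1024, 3);
        for (int v : new int[] {7, 42, 1000}) {
            filter.add(v);
        }
        // For a predicate such as "col = 43", the row group can be skipped
        // only when the filter reports the value as definitely absent.
        boolean canSkipRowGroup = !filter.mightContain(43);
        System.out.println("mightContain(42): " + filter.mightContain(42));
        System.out.println("can skip row group for 43: " + canSkipRowGroup);
    }
}
```

Because a Bloom filter never produces false negatives, skipping a row group when mightContain returns false is always safe; a false positive only costs an unnecessary read of that row group.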



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)