Posted to jira@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/01/25 20:07:01 UTC

[jira] [Created] (ARROW-11384) [C++][Dataset] Support bloom filters in predicate pushdown

Ben Kietzman created ARROW-11384:
------------------------------------

             Summary: [C++][Dataset] Support bloom filters in predicate pushdown
                 Key: ARROW-11384
                 URL: https://issues.apache.org/jira/browse/ARROW-11384
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Ben Kietzman


The Parquet spec includes Bloom filters, which can be useful during filtering. In the context of dataset::, they would be expressed as additional Parquet statistics expressions on each row group, allowing entirely excluded row groups to be skipped more aggressively.


Prerequisite: https://issues.apache.org/jira/browse/PARQUET-1327 (reader/writer support for bloom filters)
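
To illustrate the row-group skipping described above, here is a minimal, self-contained sketch for an equality predicate. It is an assumption-laden toy, not a proposed implementation: ToyBloomFilter, RowGroup, MakeRowGroup and RowGroupsToScan are hypothetical names that exist in neither Arrow nor parquet-cpp, and the filter uses std::hash with double hashing rather than the split-block Bloom filter layout and xxHash that the Parquet format specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Toy Bloom filter for illustration only (hypothetical, not a Parquet or Arrow type).
// It does NOT follow the Parquet split-block layout.
class ToyBloomFilter {
 public:
  explicit ToyBloomFilter(std::size_t num_bits, int num_hashes = 3)
      : bits_(num_bits, false), num_hashes_(num_hashes) {
    assert(num_bits > 0 && num_hashes > 0);
  }

  void Insert(std::int64_t value) {
    for (int i = 0; i < num_hashes_; ++i) bits_[Probe(value, i)] = true;
  }

  // false: the value is definitely absent.
  // true:  the value may be present (false positives are possible).
  bool MightContain(std::int64_t value) const {
    for (int i = 0; i < num_hashes_; ++i) {
      if (!bits_[Probe(value, i)]) return false;
    }
    return true;
  }

 private:
  // Derive the i-th probe position from one base hash (double hashing).
  std::size_t Probe(std::int64_t value, int i) const {
    const std::uint64_t h1 = std::hash<std::int64_t>{}(value);
    const std::uint64_t h2 = (h1 * 0x9E3779B97F4A7C15ULL) | 1ULL;
    return static_cast<std::size_t>(
        (h1 + static_cast<std::uint64_t>(i) * h2) % bits_.size());
  }

  std::vector<bool> bits_;
  int num_hashes_;
};

// Hypothetical stand-in for a Parquet row group: one int64 column plus its filter.
struct RowGroup {
  std::vector<std::int64_t> values;
  ToyBloomFilter filter;
};

// Build a row group and populate its filter from the column values.
RowGroup MakeRowGroup(std::vector<std::int64_t> values,
                      std::size_t num_bits = 1024) {
  ToyBloomFilter filter(num_bits);
  for (std::int64_t v : values) filter.Insert(v);
  return RowGroup{std::move(values), std::move(filter)};
}

// Pushdown for the predicate `column == needle`: return the indices of row
// groups that still have to be read. A group is dropped only when its filter
// proves the value absent, so no matching row is ever lost.
std::vector<std::size_t> RowGroupsToScan(const std::vector<RowGroup>& groups,
                                         std::int64_t needle) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < groups.size(); ++i) {
    if (groups[i].filter.MightContain(needle)) selected.push_back(i);
  }
  return selected;
}
```

Because a Bloom filter answers only "definitely absent" or "possibly present", a sketch like this can prune row groups for equality-style predicates but may still scan some groups that turn out to hold no match; the rows of every selected group must be filtered as usual.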



--
This message was sent by Atlassian Jira
(v8.3.4#803005)