Posted to dev@parquet.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2016/01/27 17:25:40 UTC

[jira] [Created] (PARQUET-473) Develop external predicate pushdown API for column readers

Wes McKinney created PARQUET-473:
------------------------------------

             Summary: Develop external predicate pushdown API for column readers
                 Key: PARQUET-473
                 URL: https://issues.apache.org/jira/browse/PARQUET-473
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-cpp
            Reporter: Wes McKinney


This work is well downstream of where we are right now, but we should plan ahead so that scanning Parquet files with externally defined predicates is supported as a primary use case.

I suggest that the most general (and highest-performance) predicate will be batch-oriented; i.e., the predicate is passed a batch of materialized values from one or more columns and returns an array of booleans, one per row, indicating whether the predicate holds for that row. We can also develop a row-by-row "scalar" predicate API if users need that.
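
To make the two shapes concrete, here is a minimal sketch of how a batch predicate and a row-by-row scalar predicate could relate. Nothing here exists in parquet-cpp: the names Int64Batch, BatchPredicate, ScalarPredicate, LiftScalar, and FilterBatch are hypothetical, and the sketch assumes a single INT64 column with no nulls, whereas a real API would need to cover multiple columns, other physical types, and definition levels.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical: a batch of materialized values from one INT64 column.
struct Int64Batch {
  const int64_t* values;
  std::size_t length;
};

// Hypothetical batch predicate: writes one flag per row into `out`
// (1 if the row passes, 0 otherwise).
using BatchPredicate =
    std::function<void(const Int64Batch&, std::vector<uint8_t>* out)>;

// Hypothetical row-by-row "scalar" predicate.
using ScalarPredicate = std::function<bool(int64_t)>;

// Adapts a scalar predicate to the batch interface, so that only the
// batch form has to be understood by the reader.
BatchPredicate LiftScalar(ScalarPredicate pred) {
  return [pred](const Int64Batch& batch, std::vector<uint8_t>* out) {
    out->resize(batch.length);
    for (std::size_t i = 0; i < batch.length; ++i) {
      (*out)[i] = pred(batch.values[i]) ? 1 : 0;
    }
  };
}

// Evaluates the predicate over a batch and returns the values that passed.
std::vector<int64_t> FilterBatch(const Int64Batch& batch,
                                 const BatchPredicate& pred) {
  std::vector<uint8_t> mask;
  pred(batch, &mask);
  std::vector<int64_t> selected;
  for (std::size_t i = 0; i < batch.length && i < mask.size(); ++i) {
    if (mask[i]) selected.push_back(batch.values[i]);
  }
  return selected;
}
```

In this sketch the scalar form is only a thin adapter over the batch form, which is one way to keep a single code path in the column reader. A caller would wrap a lambda such as [](int64_t v) { return v >= 10; } with LiftScalar and pass the result to FilterBatch.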


