Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/06/19 19:43:00 UTC

[jira] [Resolved] (IMPALA-3900) Add per-split runtime filtering to HdfsParquetScanner::ProcessSplit()

     [ https://issues.apache.org/jira/browse/IMPALA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-3900.
-----------------------------------
    Resolution: Duplicate

> Add per-split runtime filtering to HdfsParquetScanner::ProcessSplit()
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-3900
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3900
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.6.0
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>            Priority: Minor
>              Labels: runtime-filters
>
> If a partition filter arrives after a footer scan range for a Parquet file has been issued, but before {{HdfsParquetScanner::ProcessSplit()}} runs, there's an opportunity to filter out all the scan ranges that would otherwise be issued when reading that footer, by adding a call to {{ScanRangeIsFilteredOut()}} at the top of that method.
> Care must be taken to ensure that all scan ranges are marked as done, since they won't be processed by their own scanner instances. This will avoid a recurrence of IMPALA-3804.
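
The proposal above can be pictured with a minimal, self-contained sketch. It is not Impala's code: ScanRange, ScanNode, PartitionFilter, MarkRangeDone() and the ranges_remaining counter are hypothetical stand-ins invented for illustration, and only the names ProcessSplit() and ScanRangeIsFilteredOut() come from the description. It assumes the footer range is the first entry in the file's list of ranges.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scan range; not Impala's real class.
struct ScanRange {
  std::string file;
  int64_t partition_id;
  bool done;
  ScanRange(std::string f, int64_t p)
      : file(std::move(f)), partition_id(p), done(false) {}
};

// A partition filter modelled as a predicate (assumption): it returns true
// if the partition may still contain matching rows.
using PartitionFilter = std::function<bool(int64_t)>;

// Hypothetical stand-in for the scan node that owns filters and bookkeeping.
struct ScanNode {
  std::vector<PartitionFilter> arrived_filters;
  int ranges_remaining = 0;

  // True if any filter that has already arrived rejects the range's partition.
  bool ScanRangeIsFilteredOut(const ScanRange& range) const {
    for (const PartitionFilter& filter : arrived_filters) {
      if (!filter(range.partition_id)) return true;
    }
    return false;
  }

  // Marks a range as finished exactly once and updates the counter.
  void MarkRangeDone(ScanRange* range) {
    assert(!range->done);
    range->done = true;
    --ranges_remaining;
  }
};

// Sketch of the check at the top of ProcessSplit(). file_ranges holds every
// range of one file, footer first. Returns true if the split was skipped.
bool ProcessSplit(ScanNode* node, std::vector<ScanRange>* file_ranges) {
  assert(!file_ranges->empty());
  const ScanRange& footer = file_ranges->front();
  if (node->ScanRangeIsFilteredOut(footer)) {
    // No scanner will ever process these ranges, so mark all of them done
    // here; otherwise the node would wait on them indefinitely.
    for (ScanRange& range : *file_ranges) node->MarkRangeDone(&range);
    return true;
  }
  // Normal path (elided): parse the footer and issue the remaining ranges.
  return false;
}
```

In this sketch, skipping the footer without marking the remaining ranges done would leave ranges_remaining above zero, which is the kind of bookkeeping gap the second paragraph of the description warns about.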


