Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/06/19 19:43:00 UTC
[jira] [Resolved] (IMPALA-3900) Add per-split runtime filtering to
HdfsParquetScanner::ProcessSplit()
[ https://issues.apache.org/jira/browse/IMPALA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-3900.
-----------------------------------
Resolution: Duplicate
> Add per-split runtime filtering to HdfsParquetScanner::ProcessSplit()
> ---------------------------------------------------------------------
>
> Key: IMPALA-3900
> URL: https://issues.apache.org/jira/browse/IMPALA-3900
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.6.0
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Priority: Minor
> Labels: runtime-filters
>
> If a partition filter arrives after the footer scan range for a Parquet file has been issued, but before {{HdfsParquetScanner::ProcessSplit()}} runs, there's an opportunity to filter out all the scan ranges that would otherwise be issued when reading that footer, by adding a call to {{ScanRangeIsFilteredOut()}} at the top of that method.
> Care must be taken to ensure that all scan ranges are marked as done, since they won't be processed by their own scanner instances. This will avoid a recurrence of IMPALA-3804.
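
The idea described above can be illustrated with a small, self-contained C++ sketch. This is not Impala code: ToyScanNode, ScanRange, MarkRangeDone, ranges_remaining and the two-argument shape of ProcessSplit are hypothetical stand-ins invented for illustration, and only the names ProcessSplit and ScanRangeIsFilteredOut are taken from the issue text. It assumes a scan node that tracks how many ranges are still outstanding, and shows an early filter check that still accounts for every range belonging to the file.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a scan range; not Impala's real class.
struct ScanRange {
  explicit ScanRange(std::string p) : partition(std::move(p)), done(false) {}
  std::string partition;  // partition the range's file belongs to
  bool done;              // true once the range has been accounted for
};

// Hypothetical scan node that counts ranges not yet marked done.
struct ToyScanNode {
  // Empty until a runtime filter arrives; returns true if the partition is kept.
  std::function<bool(const std::string&)> partition_filter;
  int ranges_remaining = 0;

  bool ScanRangeIsFilteredOut(const ScanRange& range) const {
    return partition_filter && !partition_filter(range.partition);
  }

  void MarkRangeDone(ScanRange& range) {
    if (range.done) return;  // never count a range twice
    assert(ranges_remaining > 0);
    range.done = true;
    --ranges_remaining;
  }
};

// Returns true if the file was scanned, false if it was skipped by the filter.
// 'other_ranges' are the ranges that reading the footer would normally issue.
bool ProcessSplit(ToyScanNode& node, ScanRange& footer,
                  const std::vector<ScanRange*>& other_ranges) {
  // Check the filter first: it may have arrived after the footer range
  // was issued.
  const bool filtered = node.ScanRangeIsFilteredOut(footer);

  // (A real scanner would read the footer and the column data here when
  // 'filtered' is false; that work is omitted from this sketch.)

  // Either way, every range of the file must be marked done. On the filtered
  // path no other scanner instance will ever process them, so skipping this
  // step would leave the node waiting forever.
  node.MarkRangeDone(footer);
  for (ScanRange* range : other_ranges) node.MarkRangeDone(*range);
  return !filtered;
}
```

In this sketch, a node with three outstanding ranges and a filter that rejects the file's partition returns false from ProcessSplit and ends with ranges_remaining at zero, which is the bookkeeping the second paragraph of the issue asks for.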
--
This message was sent by Atlassian Jira
(v8.3.4#803005)