Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/02/05 02:02:00 UTC

[jira] [Commented] (IMPALA-11047) Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if PARQUET_READ_STATISTICS=0

    [ https://issues.apache.org/jira/browse/IMPALA-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487398#comment-17487398 ] 

ASF subversion and git services commented on IMPALA-11047:
----------------------------------------------------------

Commit 63bd6a5aec8d712210abf261f2fd98b7a60ce885 in impala's branch refs/heads/master from Qifan Chen
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=63bd6a5 ]

IMPALA-11047 Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if PARQUET_READ_STATISTICS=0

This patch fixes the Preconditions.checkNotNull failure in the frontend
(FE) by checking the query option PARQUET_READ_STATISTICS in the
following situations:

  1) When determining whether to apply the min/max overlap predicate;
  2) When modifying the query options minmax_filter_threshold and
     minmax_filtering_level to apply min/max filters to sort or
     partition columns.

Both steps proceed only when PARQUET_READ_STATISTICS is enabled (true or 1).
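
The guard described above can be illustrated with a minimal, self-contained
sketch. It is not the actual Impala code: the classes QueryOptions and
ScanNode and the methods shouldApplyOverlapPredicate and plan are hypothetical
stand-ins, and Objects.requireNonNull is used in place of Guava's
Preconditions.checkNotNull so the example has no external dependencies.

```java
import java.util.Objects;

public class OverlapPredicateGuardSketch {
  // Hypothetical stand-in for the query options; only one flag matters here.
  static final class QueryOptions {
    final boolean parquetReadStatistics;

    QueryOptions(boolean parquetReadStatistics) {
      this.parquetReadStatistics = parquetReadStatistics;
    }
  }

  // Hypothetical stand-in for the scan node that owns the stats tuple.
  static final class ScanNode {
    private Object statsTuple; // stays null when statistics reading is off

    void init(QueryOptions opts) {
      // The stats tuple is only set up when Parquet statistics are read.
      if (opts.parquetReadStatistics) {
        statsTuple = new Object();
      }
    }

    void initOverlapPredicate() {
      // Plays the role of the not-null precondition named in the issue title.
      Objects.requireNonNull(statsTuple, "statsTuple");
    }
  }

  // The guard: consult the option before touching the overlap predicate.
  static boolean shouldApplyOverlapPredicate(QueryOptions opts) {
    return opts.parquetReadStatistics;
  }

  // Returns true if the overlap predicate was initialized, false if skipped.
  static boolean plan(QueryOptions opts) {
    ScanNode node = new ScanNode();
    node.init(opts);
    if (!shouldApplyOverlapPredicate(opts)) {
      return false;
    }
    node.initOverlapPredicate();
    return true;
  }

  public static void main(String[] args) {
    System.out.println(plan(new QueryOptions(true)));  // prints "true"
    System.out.println(plan(new QueryOptions(false))); // prints "false"
  }
}
```

Without the guard in plan, calling initOverlapPredicate after init with the
option disabled would throw, which mirrors the reported failure.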

Testing:
  1. Added a new test in TestOverlapMinMaxFilters;
  2. Ran the core tests successfully.

Change-Id: I52203e73502a35a275decb602b063982b9cad26e
Reviewed-on: http://gerrit.cloudera.org:8080/18071
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if PARQUET_READ_STATISTICS=0
> ----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-11047
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11047
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.0.0
>            Reporter: Riza Suminto
>            Assignee: Qifan Chen
>            Priority: Major
>
> HdfsScanNode.java and RuntimeFilterGenerator.java conflict when initializing the overlap predicate.
> In HdfsScanNode.java, computeStatsTupleAndConjuncts, which initializes statsTuple_, is not called when PARQUET_READ_STATISTICS=0.
> [https://github.com/apache/impala/blob/9d61bc4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java#L409]
>  
> On the other hand, in RuntimeFilterGenerator.java, disable_overlap_filter is set to false without considering the value of PARQUET_READ_STATISTICS.
> [https://github.com/apache/impala/blob/9d61bc4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java#L915]


