Posted to issues@hive.apache.org by "Hankó Gergely (Jira)" <ji...@apache.org> on 2022/09/21 10:22:00 UTC
[jira] [Updated] (HIVE-26552) PartitionConditionRemover doesn't remove constant filter with structs inside
[ https://issues.apache.org/jira/browse/HIVE-26552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hankó Gergely updated HIVE-26552:
---------------------------------
Description:
Repro:
{code:java}
set hive.fetch.task.conversion=none;
create table test (a string) partitioned by (y string, m string);
insert into test values ('aa', 2022, 9);
explain vectorization expression
select * from test
where (y=year(date_sub('2022-09-11',4)) and m=month(date_sub('2022-09-11',4)))
   or (y=year(date_sub('2022-09-11',10)) and m=month(date_sub('2022-09-11',10)));
{code}
Actual:
{code:java}
(...)
Filter Operator
  Filter Vectorization:
      className: VectorFilterOperator
      native: true
      predicateExpression: SelectColumnIsTrue(col 5:boolean)(children: VectorUDFAdaptor((const struct(2022.0D,9.0D)) IN (const struct(2022.0D,9.0D), const struct(2022.0D,9.0D))) -> 5:boolean)
  predicate: (const struct(2022.0D,9.0D)) IN (const struct(2022.0D,9.0D), const struct(2022.0D,9.0D)) (type: boolean)
  Statistics: Num rows: 1 Data size: 454 Basic stats: COMPLETE Column stats: COMPLETE
(...){code}
Expected:
The filter operator should be optimized out, just as it is for the following query, which filters on y only:
{code:java}
explain vectorization expression select * from test where (y=year(date_sub('2022-09-11',4))) or (y=year(date_sub('2022-09-11',10))); {code}
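The remaining predicate in the plan compares a constant struct against a list of constant structs, so its value is already known at plan time. The sketch below illustrates that idea in Python. It is not Hive's PartitionConditionRemover code; the classes Const, Column, Struct and In and the functions fold and filter_is_redundant are hypothetical names invented for this illustration, and SQL NULL semantics are ignored.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical, minimal expression tree; not Hive's ExprNodeDesc classes.
@dataclass(frozen=True)
class Const:
    value: object  # a literal known at plan time, e.g. 2022.0


@dataclass(frozen=True)
class Column:
    name: str  # a reference whose value is not known at plan time


@dataclass(frozen=True)
class Struct:
    fields: Tuple["Expr", ...]


@dataclass(frozen=True)
class In:
    needle: "Expr"
    candidates: Tuple["Expr", ...]


Expr = Union[Const, Column, Struct, In]


def fold(expr):
    """Return an equivalent expression with constant subtrees evaluated."""
    if isinstance(expr, Struct):
        fields = tuple(fold(f) for f in expr.fields)
        # A struct built only from constants is itself a constant.
        if all(isinstance(f, Const) for f in fields):
            return Const(tuple(f.value for f in fields))
        return Struct(fields)
    if isinstance(expr, In):
        needle = fold(expr.needle)
        candidates = tuple(fold(c) for c in expr.candidates)
        # IN over constants only can be decided without reading any row.
        if isinstance(needle, Const) and all(isinstance(c, Const) for c in candidates):
            return Const(any(needle.value == c.value for c in candidates))
        return In(needle, candidates)
    return expr


def filter_is_redundant(predicate):
    """A filter whose predicate folds to constant true keeps every row."""
    return fold(predicate) == Const(True)


# The predicate from the plan above: struct(2022.0, 9.0) IN (struct(...), struct(...)).
ym = Struct((Const(2022.0), Const(9.0)))
predicate = In(ym, (ym, ym))
```

Under these assumptions, filter_is_redundant(predicate) returns True for the predicate shown in the plan, which is the case where the Filter Operator could be dropped. A predicate that still references a column is left unfolded and the filter is kept.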
was:
Repro:
{code:java}
set hive.fetch.task.conversion=none;
create table test (a string) partitioned by (y string, m string);
insert into test values ('aa', 2022, 9);
explain vectorization expression select * from test where (y=year(date_sub('2022-09-11',4)) and m=month(date_sub('2022-09-11',4))) or (y=year(date_sub('2022-09-11',10)) and m=month(date_sub('2022-09-11',10)) ); {code}
Actual:
{code:java}
Filter Operator
Filter Vectorization:
className: VectorFilterOperator
native: true
predicateExpression: SelectColumnIsTrue(col 5:boolean)(children: VectorUDFAdaptor((const struct(2022.0D,9.0D)) IN (const struct(2022.0D,9.0D), const struct(2022.0D,9.0D))) -> 5:boolean)
predicate: (const struct(2022.0D,9.0D)) IN (const struct(2022.0D,9.0D), const struct(2022.0D,9.0D)) (type: boolean)
Statistics: Num rows: 1 Data size: 454 Basic stats: COMPLETE Column stats: COMPLETE {code}
Expected:
The filter operator should be optimized out, just as it is for the following query, which filters on y only:
{code:java}
explain vectorization expression select * from test where (y=year(date_sub('2022-09-11',4))) or (y=year(date_sub('2022-09-11',10))); {code}
> PartitionConditionRemover doesn't remove constant filter with structs inside
> ----------------------------------------------------------------------------
>
> Key: HIVE-26552
> URL: https://issues.apache.org/jira/browse/HIVE-26552
> Project: Hive
> Issue Type: Improvement
> Reporter: Hankó Gergely
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)