You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Alessandro Solimando (Jira)" <ji...@apache.org> on 2022/01/23 11:54:00 UTC
[jira] [Commented] (HIVE-25758) OOM due to recursive application of CBO rules
[ https://issues.apache.org/jira/browse/HIVE-25758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480604#comment-17480604 ]
Alessandro Solimando commented on HIVE-25758:
---------------------------------------------
Yet another problematic query falling under the same issue:
{noformat}
CREATE TABLE tableA
(
`bd_id` bigint,
`quota_type` string
);
select a.bd_id
from (
select t.bd_id
from tableA t
where (t.bd_id = 8 and t.quota_type in('A','C')) or (t.bd_id = 9 and t.quota_type in ('A','B'))
) a join (
select t.bd_id
from tableA t
where t.bd_id = 9 and t.quota_type in ('A','B')
union all
select t.bd_id
from tableA t
where (t.bd_id = 8 and t.quota_type in('A','C')) or (t.bd_id = 9 and t.quota_type in ('A','B'))
) b on a.bd_id = b.bd_id
where a.bd_id = 8 or a.bd_id <>8;
{noformat}
> OOM due to recursive application of CBO rules
> ---------------------------------------------
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
> Issue Type: Bug
> Components: CBO, Query Planning
> Affects Versions: 4.0.0
> Reporter: Alessandro Solimando
> Assignee: Alessandro Solimando
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
>
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
> SELECT c.month,
> d.con_usd
> FROM
> (SELECT cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
> FROM test1
> UNION ALL
> SELECT month
> FROM test2
> WHERE month = 202110) c
> JOIN test3 d ON c.month = d.mth; {code}
>
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
> HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], cost=[not available])
> HiveProject(month=[$0])
> HiveUnion(all=[true])
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-d
> d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-
> 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
> HiveTableScan(table=[[default, test1]], table:alias=[test1])
> HiveProject(month=[$0])
> HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
> HiveTableScan(table=[[default, test2]], table:alias=[test2])
> HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>
> Then, the HEP planner will keep expanding the filter expression with redundant expressions, such as the following, where the identical CAST expression is present multiple times:
>
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 202110)){noformat}
>
> The problem seems to come from a bad interaction of at least _HiveFilterProjectTransposeRule_ and {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably then UNION part can be removed and the reproducer be simplified even further.
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)