Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2022/10/19 11:54:00 UTC

[jira] [Resolved] (HIVE-26638) Replace in-house CBO reduce expressions rules with Calcite's built-in classes

     [ https://issues.apache.org/jira/browse/HIVE-26638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis resolved HIVE-26638.
----------------------------------------
    Fix Version/s: 4.0.0
                   4.0.0-alpha-2
       Resolution: Fixed

Fixed in https://github.com/apache/hive/commit/5a2b42982adeca506daf5bec435dfc51b4522638. Thanks for the review [~kkasa]!

> Replace in-house CBO reduce expressions rules with Calcite's built-in classes
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-26638
>                 URL: https://issues.apache.org/jira/browse/HIVE-26638
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0, 4.0.0-alpha-2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The goal of this ticket is to remove the Hive-specific code in [HiveReduceExpressionsRule|https://github.com/apache/hive/blob/b48c1bf11c4f75ba2c894e4732a96813ddde1414/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java] and use the respective Calcite classes exclusively (i.e., [ReduceExpressionsRule|https://github.com/apache/calcite/blob/2c30a56158cdd351d35725006bc1f76bb6aac75b/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java]) to reduce maintenance overhead and facilitate code evolution.
> Currently, the only difference between the in-house (HiveReduceExpressionsRule) and built-in (ReduceExpressionsRule) reduce expressions rules lies in how they treat the {{Filter}} operator (i.e., FilterReduceExpressionsRule).
> Comparing the in-house code with the respective part of Calcite 1.25.0 reveals four Hive-specific differences.
> +Match nullability when reducing expressions+
> When we reduce filters we always set {{matchNullability}} (last parameter) to false.
> {code:java}
> if (reduceExpressions(filter, expList, predicates, true, false)) {
> {code}
> This means that the original and reduced expressions can differ slightly in type with respect to nullability: the original is nullable while the reduced one is not. When the value is true, the type is preserved by adding a "nullability" CAST, i.e., a cast to the same type that differs only in whether it is nullable.
> {{matchNullability}} was hardcoded to false as part of the upgrade to Calcite 1.15.0 (HIVE-18068), where the behavior of the rule became configurable (CALCITE-2041).
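The effect of the matchNullability flag described above can be illustrated with a toy model. The sketch below does not use Calcite: ToyType, ToyExpr, and NullabilityCastSketch are hypothetical names invented for illustration, and reduce() only mimics the decision of whether to wrap an already reduced expression in a nullability cast.

```java
final class NullabilityCastSketch {
    /**
     * Returns the expression that replaces {@code original} after reduction.
     * With matchNullability = true, a reduced expression whose type differs
     * from the original only in nullability is wrapped in a cast back to the
     * original type; with false, the reduced expression is returned as-is.
     */
    static ToyExpr reduce(ToyExpr original, ToyExpr reduced, boolean matchNullability) {
        boolean onlyNullabilityDiffers =
            original.type.name.equals(reduced.type.name)
                && original.type.nullable != reduced.type.nullable;
        if (matchNullability && onlyNullabilityDiffers) {
            return new ToyExpr(
                "CAST(" + reduced.text + " AS " + original.type + ")", original.type);
        }
        return reduced;
    }

    public static void main(String[] args) {
        // "(a > 1) OR TRUE" is nullable BOOLEAN when column a is nullable,
        // yet it always evaluates to TRUE, which is BOOLEAN NOT NULL.
        ToyExpr original = new ToyExpr("(a > 1) OR TRUE", new ToyType("BOOLEAN", true));
        ToyExpr reduced = new ToyExpr("TRUE", new ToyType("BOOLEAN", false));

        System.out.println(reduce(original, reduced, true).text);  // CAST(TRUE AS BOOLEAN)
        System.out.println(reduce(original, reduced, false).text); // TRUE
    }
}

/** Toy stand-in for a SQL type: a name plus a nullability flag (not Calcite's RelDataType). */
final class ToyType {
    final String name;
    final boolean nullable;

    ToyType(String name, boolean nullable) {
        this.name = name;
        this.nullable = nullable;
    }

    @Override
    public String toString() {
        return nullable ? name : name + " NOT NULL";
    }
}

/** Toy stand-in for an expression: its text plus its type (not Calcite's RexNode). */
final class ToyExpr {
    final String text;
    final ToyType type;

    ToyExpr(String text, ToyType type) {
        this.text = text;
        this.type = type;
    }
}
```

With the flag set to false, as Hive does for filters, the reduced literal is used directly and its type is BOOLEAN NOT NULL rather than the original nullable BOOLEAN.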
> +Remove nullability cast explicitly+
> When the expression is reduced, we try to remove the nullability cast, if there is one.
> {code:java}
> if (RexUtil.isNullabilityCast(filter.getCluster().getTypeFactory(), newConditionExp)) {
> 	newConditionExp = ((RexCall) newConditionExp).getOperands().get(0);
> }
> {code}
> The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316). However, the code is redundant as of HIVE-18068; setting {{matchNullability}} to {{false}} no longer generates nullability casts during the reduction.
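To make the redundancy argument concrete, here is a minimal sketch of the "strip the nullability cast" step on a toy expression tree. It is not Calcite code: Node, isNullabilityCast, and stripNullabilityCast are hypothetical helpers written for this illustration, and a cast is treated as a nullability cast when its type name equals that of its operand, ignoring nullability.

```java
final class StripNullabilityCastSketch {
    /** Toy expression node (hypothetical, not Calcite's RexNode). */
    static final class Node {
        final String text;
        final String typeName;
        final boolean nullable;
        final Node castOperand; // non-null only when this node is a CAST

        Node(String text, String typeName, boolean nullable, Node castOperand) {
            this.text = text;
            this.typeName = typeName;
            this.nullable = nullable;
            this.castOperand = castOperand;
        }
    }

    static Node literal(String text, String typeName, boolean nullable) {
        return new Node(text, typeName, nullable, null);
    }

    static Node cast(Node operand, String typeName, boolean nullable) {
        String text = "CAST(" + operand.text + " AS " + typeName + ")";
        return new Node(text, typeName, nullable, operand);
    }

    /** True when the node is a CAST whose type equals its operand's type, ignoring nullability. */
    static boolean isNullabilityCast(Node node) {
        return node.castOperand != null && node.typeName.equals(node.castOperand.typeName);
    }

    /** Unwraps a nullability cast; any other node is returned unchanged. */
    static Node stripNullabilityCast(Node node) {
        return isNullabilityCast(node) ? node.castOperand : node;
    }

    public static void main(String[] args) {
        Node literalTrue = literal("TRUE", "BOOLEAN", false);
        Node wrapped = cast(literalTrue, "BOOLEAN", true);

        // A reduction that added a nullability cast needs the strip step.
        System.out.println(stripNullabilityCast(wrapped).text);     // TRUE
        // A reduction that added no cast is unaffected: the step is a no-op.
        System.out.println(stripNullabilityCast(literalTrue).text); // TRUE
    }
}
```

The second call mirrors the situation after HIVE-18068: when no cast is generated in the first place, the stripping step never changes the expression.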
> +Avoid creating filters with condition of type NULL+
> {code:java}
> if(newConditionExp.getType().getSqlTypeName() == SqlTypeName.NULL) {
> 	newConditionExp = call.builder().cast(newConditionExp, SqlTypeName.BOOLEAN);
> }
> {code}
> Hive tries to cast such expressions to BOOLEAN to avoid the weird (and possibly problematic) situation of having a condition with NULL type.
> Calcite has specific code that detects whether the new condition is the NULL literal (with NULL type) and, if so, replaces the relation with an empty one.
> {code:java}
> } else if (newConditionExp instanceof RexLiteral
>     || RexUtil.isNullLiteral(newConditionExp, true)) {
>   call.transformTo(createEmptyRelOrEquivalent(call, filter));
> {code}
> Therefore, the Hive-specific code is redundant when the Calcite rule is used.
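The reason a NULL condition can be turned into an empty relation follows from SQL filter semantics: a row passes only when the condition evaluates to TRUE, so a constant NULL drops every row just like a constant FALSE. The sketch below models this with plain Java collections; ConstantFilterSketch, Outcome, classify, and applyFilter are hypothetical names, and a Java null stands in for the SQL NULL literal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ConstantFilterSketch {
    /** What a planner could do with a filter whose condition is a constant. */
    enum Outcome { KEEP_INPUT, EMPTY }

    /** TRUE makes the filter a pass-through; FALSE and NULL (Java null) reject all rows. */
    static Outcome classify(Boolean constantCondition) {
        return Boolean.TRUE.equals(constantCondition) ? Outcome.KEEP_INPUT : Outcome.EMPTY;
    }

    /** Evaluates the filter directly: a row is kept only when the condition is TRUE. */
    static <T> List<T> applyFilter(List<T> rows, Boolean constantCondition) {
        List<T> kept = new ArrayList<>();
        for (T row : rows) {
            if (Boolean.TRUE.equals(constantCondition)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2");
        for (Boolean condition : new Boolean[] {Boolean.TRUE, Boolean.FALSE, null}) {
            System.out.println(condition + " -> " + classify(condition)
                + ", rows kept: " + applyFilter(rows, condition).size());
        }
    }
}
```

Because the NULL case is already handled by replacing the relation, no cast to BOOLEAN is needed to keep the plan well-typed in this scenario.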
> +Bail out when input to reduceNotNullableFilter is not a RexCall+
> {code:java}
> if (!(rexCall.getOperands().get(0) instanceof RexCall)) {
>       // If child is not a RexCall instance, we can bail out
>       return;
> }
> {code}
> The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316) but it does not add any functional value.
> The instanceof check is redundant since the code in reduceNotNullableFilter [is a noop|https://github.com/apache/hive/blob/6e8fc53fb68898d1a404435859cea5bbc79200a4/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java#L228] when the expression/call is not one of the following: IS_NULL, IS_UNKNOWN, IS_NOT_NULL, which are all rex calls.
> +Summary+
> All of the Hive-specific changes mentioned above can be safely replaced by appropriate uses of the Calcite APIs without affecting the behavior of CBO.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)