Posted to dev@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2022/10/17 10:29:00 UTC

[jira] [Created] (HIVE-26638) Replace in-house CBO reduce expressions rules with Calcite's built-in classes

Stamatis Zampetakis created HIVE-26638:
------------------------------------------

             Summary: Replace in-house CBO reduce expressions rules with Calcite's built-in classes
                 Key: HIVE-26638
                 URL: https://issues.apache.org/jira/browse/HIVE-26638
             Project: Hive
          Issue Type: Improvement
          Components: CBO
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


The goal of this ticket is to remove the Hive-specific code in [HiveReduceExpressionsRule|https://github.com/apache/hive/blob/b48c1bf11c4f75ba2c894e4732a96813ddde1414/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java] and rely exclusively on the respective Calcite classes (i.e., [ReduceExpressionsRule|https://github.com/apache/calcite/blob/2c30a56158cdd351d35725006bc1f76bb6aac75b/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java]) to reduce maintenance overhead and facilitate code evolution.

Currently, the only difference between the in-house ({{HiveReduceExpressionsRule}}) and the built-in ({{ReduceExpressionsRule}}) reduce expressions rules lies in the way we treat the {{Filter}} operator (i.e., {{FilterReduceExpressionsRule}}).

Comparing the in-house code with the respective part of Calcite 1.25.0 reveals four Hive-specific differences.

+Match nullability when reducing expressions+
When we reduce filters, we always set {{matchNullability}} (the last parameter) to {{false}}.
{code:java}
if (reduceExpressions(filter, expList, predicates, true, false)) {
{code}
This means that the original and the reduced expression can have slightly different types in terms of nullability: the original is nullable and the reduced one is not. When the value is {{true}}, the type is preserved by adding a "nullability" CAST, i.e., a cast to a type that differs from the operand's type only in whether it is nullable.

Hardcoding {{matchNullability}} to {{false}} was done as part of the upgrade to Calcite 1.15.0 (HIVE-18068), where the behavior of the rule became configurable (CALCITE-2041).

+Remove nullability cast explicitly+
When the expression is reduced, we try to remove the nullability cast, if there is one.
{code:java}
if (RexUtil.isNullabilityCast(filter.getCluster().getTypeFactory(), newConditionExp)) {
	newConditionExp = ((RexCall) newConditionExp).getOperands().get(0);
}
{code}
The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316). However, it is redundant as of HIVE-18068: with {{matchNullability}} set to {{false}}, the reduction no longer generates nullability casts.
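
To make the interplay between {{matchNullability}} and the explicit cast removal more concrete, here is a minimal, self-contained sketch. It is a toy model and not Calcite or Hive code: the {{Type}} and {{Expr}} classes, the {{reduce}} helper and the sample expression {{a OR TRUE}} are all hypothetical, and {{isNullabilityCast}} only loosely mirrors the idea behind {{RexUtil.isNullabilityCast}}. Under these assumptions, stripping the cast only has an effect when {{matchNullability}} is {{true}}.

```java
/** Toy model (not Calcite code): how matchNullability relates to nullability casts. */
final class NullabilityCastSketch {

  /** A type is just a name plus a nullability flag in this model. */
  static final class Type {
    final String name;
    final boolean nullable;

    Type(String name, boolean nullable) {
      this.name = name;
      this.nullable = nullable;
    }

    boolean sameIgnoringNullability(Type other) {
      return name.equals(other.name);
    }
  }

  /** An expression is either a leaf (operand == null) or a cast of its operand to type. */
  static final class Expr {
    final String text;
    final Type type;
    final Expr operand;

    Expr(String text, Type type, Expr operand) {
      this.text = text;
      this.type = type;
      this.operand = operand;
    }

    boolean isCast() {
      return operand != null;
    }
  }

  static Expr leaf(String text, Type type) {
    return new Expr(text, type, null);
  }

  static Expr cast(Expr operand, Type target) {
    return new Expr("CAST(" + operand.text + ")", target, operand);
  }

  /** Hypothetical reducer: replaces original with the already folded constant. */
  static Expr reduce(Expr original, Expr folded, boolean matchNullability) {
    boolean onlyNullabilityDiffers =
        original.type.sameIgnoringNullability(folded.type)
            && original.type.nullable != folded.type.nullable;
    // Only when asked to match nullability do we wrap the result in a cast.
    return matchNullability && onlyNullabilityDiffers ? cast(folded, original.type) : folded;
  }

  /** A cast whose target type equals the operand type up to nullability. */
  static boolean isNullabilityCast(Expr e) {
    return e.isCast() && e.type.sameIgnoringNullability(e.operand.type);
  }

  /** Same shape as the Hive snippet above: unwrap the cast if there is one. */
  static Expr stripNullabilityCast(Expr e) {
    return isNullabilityCast(e) ? e.operand : e;
  }

  public static void main(String[] args) {
    Expr original = leaf("a OR TRUE", new Type("BOOLEAN", true));
    Expr folded = leaf("TRUE", new Type("BOOLEAN", false));
    for (boolean match : new boolean[] {true, false}) {
      Expr reduced = reduce(original, folded, match);
      System.out.println("matchNullability=" + match + " -> " + reduced.text
          + ", strip changes result: " + (stripNullabilityCast(reduced) != reduced));
    }
  }
}
```

In this model the strip step is a no-op for {{matchNullability = false}}, which is the situation the ticket describes for Hive after HIVE-18068.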

+Avoid creating filters with condition of type NULL+
{code:java}
if(newConditionExp.getType().getSqlTypeName() == SqlTypeName.NULL) {
	newConditionExp = call.builder().cast(newConditionExp, SqlTypeName.BOOLEAN);
}
{code}
Hive tries to cast such expressions to BOOLEAN to avoid the weird (and possibly problematic) situation of having a condition with NULL type.

Calcite has specific code for detecting whether the new condition is the NULL literal (with NULL type); if so, it replaces the relation with an empty one.
{code:java}
} else if (newConditionExp instanceof RexLiteral
    || RexUtil.isNullLiteral(newConditionExp, true)) {
  call.transformTo(createEmptyRelOrEquivalent(call, filter));
{code}
Because of that, the Hive-specific code is redundant when the Calcite rule is used.
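
The decision described above can be sketched as a small, self-contained toy model. This is not the Calcite implementation: the {{Outcome}} enum and the {{afterReduction}} method are hypothetical names, and the branch for a TRUE condition is an assumption about the surrounding rule code that is not shown in this ticket. The sketch relies only on the SQL semantics that a filter keeps a row solely when its condition evaluates to TRUE, so a constant FALSE or NULL condition yields no rows.

```java
/** Toy model (not Calcite code): what a filter becomes once its condition is reduced. */
final class ReducedFilterSketch {

  enum Outcome { INPUT_WITHOUT_FILTER, EMPTY_RELATION, FILTER_KEPT }

  /**
   * @param isConstant whether the reduced condition is a literal
   * @param value      TRUE, FALSE, or null standing for the SQL NULL literal;
   *                   ignored when the condition is not constant
   */
  static Outcome afterReduction(boolean isConstant, Boolean value) {
    if (!isConstant) {
      // Nothing can be decided at planning time, the filter stays.
      return Outcome.FILTER_KEPT;
    }
    if (Boolean.TRUE.equals(value)) {
      // Every row passes, so the filter itself is unnecessary (assumed branch).
      return Outcome.INPUT_WITHOUT_FILTER;
    }
    // FALSE or NULL: no row can pass, so the whole relation is empty.
    return Outcome.EMPTY_RELATION;
  }

  public static void main(String[] args) {
    System.out.println("NULL literal  -> " + afterReduction(true, null));
    System.out.println("FALSE literal -> " + afterReduction(true, Boolean.FALSE));
    System.out.println("TRUE literal  -> " + afterReduction(true, Boolean.TRUE));
    System.out.println("non-constant  -> " + afterReduction(false, null));
  }
}
```

Since a NULL condition never survives as a filter condition in this model, there is nothing left to cast to BOOLEAN afterwards.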

+Bail out when input to reduceNotNullableFilter is not a RexCall+
{code:java}
if (!(rexCall.getOperands().get(0) instanceof RexCall)) {
      // If child is not a RexCall instance, we can bail out
      return;
}
{code}
The code was added as part of the upgrade to Calcite 1.10.0 (HIVE-13316) but it does not add any functional value.
The {{instanceof}} check is redundant since the code in {{reduceNotNullableFilter}} [is a noop|https://github.com/apache/hive/blob/6e8fc53fb68898d1a404435859cea5bbc79200a4/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveReduceExpressionsRule.java#L228] when the expression/call is not one of the following kinds: IS_NULL, IS_UNKNOWN, IS_NOT_NULL, all of which are {{RexCall}} instances.
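
The redundancy argument can be checked exhaustively in a toy model. The sketch below is hypothetical and not the Hive or Calcite code: the {{Kind}} enum is a reduced stand-in for the real expression kinds, {{proceeds}} stands in for the kind switch at the top of {{reduceNotNullableFilter}}, and the model assumes the method can be handed an arbitrary expression rather than only a call.

```java
/** Toy model (not Hive/Calcite code): the instanceof guard never changes the outcome. */
final class NotNullableFilterGuardSketch {

  /** Reduced, hypothetical set of expression kinds. */
  enum Kind { IS_NULL, IS_UNKNOWN, IS_NOT_NULL, OTHER_CALL, INPUT_REF, LITERAL }

  /** In this model, input references and literals are the only non-call expressions. */
  static boolean isCall(Kind kind) {
    return kind != Kind.INPUT_REF && kind != Kind.LITERAL;
  }

  /** Stand-in for the kind switch: true if the method would get past its early return. */
  static boolean proceeds(Kind kind) {
    switch (kind) {
      case IS_NULL:
      case IS_UNKNOWN:
      case IS_NOT_NULL:
        return true;
      default:
        return false;
    }
  }

  /** Behavior with the explicit bail-out for non-call operands. */
  static boolean withGuard(Kind operand) {
    return isCall(operand) && proceeds(operand);
  }

  /** Behavior when the operand is passed on unconditionally. */
  static boolean withoutGuard(Kind operand) {
    return proceeds(operand);
  }

  /** True if both variants agree for every kind in the model. */
  static boolean guardIsRedundant() {
    for (Kind kind : Kind.values()) {
      if (withGuard(kind) != withoutGuard(kind)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println("guard is redundant: " + guardIsRedundant());
  }
}
```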

+Summary+

All of the Hive-specific changes described above can be safely replaced by appropriate uses of the Calcite APIs without affecting the behavior of CBO.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)