You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/02/17 17:00:20 UTC

[GitHub] [hive] jcamachor commented on a change in pull request #1981: HIVE-24775: Incorrect null handling when rebuilding Materialized view incrementally

jcamachor commented on a change in pull request #1981:
URL: https://github.com/apache/hive/pull/1981#discussion_r577759809



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
##########
@@ -117,17 +116,17 @@ public void onMatch(RelOptRuleCall call) {
       projExprs.add(rightRef);
       joinConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.EQUALS,
           ImmutableList.of(leftRef, rightRef)));
-      filterConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL,
-          ImmutableList.of(leftRef)));
     }
     // 3) Add the expressions that correspond to the aggregation
     // functions
-    RexNode caseFilterCond = RexUtil.composeConjunction(rexBuilder, filterConjs);
+    List<RexNode> filterConjs = new ArrayList<>();
     for (int i = 0, leftPos = groupCount, rightPos = totalCount + groupCount;
          leftPos < totalCount; i++, leftPos++, rightPos++) {
       // case when mv2.deptno IS NULL AND mv2.deptname IS NULL then s else source.s + mv2.s end
       RexNode leftRef = rexBuilder.makeInputRef(
           joinLeftInput.getRowType().getFieldList().get(leftPos).getType(), leftPos);
+      filterConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL,

Review comment:
       This `IS_NULL` filter should not be here since we do not have any guarantees the aggregate results may or may not produce nulls?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
##########
@@ -117,17 +116,17 @@ public void onMatch(RelOptRuleCall call) {
       projExprs.add(rightRef);
       joinConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.EQUALS,

Review comment:
       Instead of doing the transformation in the rewrite to AST method, let's do it here. That will be more consistent and decrease rewriting at the AST level.
   
   In particular, this should be `SqlStdOperatorTable .IS_NOT_DISTINCT_FROM`.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
##########
@@ -158,7 +157,8 @@ public void onMatch(RelOptRuleCall call) {
             + " recognized: " + aggCall);
       }
       projExprs.add(rexBuilder.makeCall(SqlStdOperatorTable.CASE,
-          ImmutableList.of(caseFilterCond, rightRef, elseReturn)));
+          ImmutableList.of(rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL,
+                  ImmutableList.of(leftRef)), rightRef, elseReturn)));
     }
     RexNode joinCond = RexUtil.composeConjunction(rexBuilder, joinConjs);
     RexNode filterCond = RexUtil.composeConjunction(rexBuilder, filterConjs);

Review comment:
       The .OR condition below would change too. Each branch of the OR is supposed to filter either insert/update operation. Can we use `NOT` on top of the join condition to create the second disjunct for the OR?
   
   ```
   // (mv2.deptno <=> source.deptno AND mv2.deptname <=> source.deptname)
   //  OR NOT(mv2.deptno <=> source.deptno AND mv2.deptname <=> source.deptname)
   ```

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
##########
@@ -117,17 +116,17 @@ public void onMatch(RelOptRuleCall call) {
       projExprs.add(rightRef);
       joinConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.EQUALS,
           ImmutableList.of(leftRef, rightRef)));
-      filterConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL,

Review comment:
       In this case, this filter condition should be the same one that it is introduced for the join operator (with `SqlStdOperatorTable .IS_NOT_DISTINCT_FROM`)

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
##########
@@ -1118,6 +1118,18 @@ Table materializeCTE(String cteName, CTEClause cte) throws HiveException {
   }
 
   private void fixUpASTAggregateIncrementalRebuild(ASTNode newAST) throws SemanticException {
+    // Replace equality operators with null safe equality operators in join condition

Review comment:
       This will probably change if the handling is done above. I am hoping the method can stay almost as it was, except for the condition in old L1255 to infer the insert vs update branch, which could probably be done based on `NOT`?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
##########
@@ -117,17 +116,17 @@ public void onMatch(RelOptRuleCall call) {
       projExprs.add(rightRef);
       joinConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.EQUALS,
           ImmutableList.of(leftRef, rightRef)));
-      filterConjs.add(rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL,
-          ImmutableList.of(leftRef)));
     }
     // 3) Add the expressions that correspond to the aggregation
     // functions
-    RexNode caseFilterCond = RexUtil.composeConjunction(rexBuilder, filterConjs);
+    List<RexNode> filterConjs = new ArrayList<>();
     for (int i = 0, leftPos = groupCount, rightPos = totalCount + groupCount;
          leftPos < totalCount; i++, leftPos++, rightPos++) {
       // case when mv2.deptno IS NULL AND mv2.deptname IS NULL then s else source.s + mv2.s end

Review comment:
       This block should be reformulated to the following:
   ```
   // case when mv2.deptno <=> source.deptno AND mv2.deptname <=> source.deptname then source.s + mv2.s else s end
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org