You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2020/07/14 07:43:51 UTC

[GitHub] [calcite] danny0405 opened a new pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

danny0405 opened a new pull request #2065:
URL: https://github.com/apache/calcite/pull/2065


   …odes normalization (part1)
   
   * Add a new component named RexNormalize
   * By default, we only normalize the RexCalls during planning phrase, and
   there is no way to normalize it when constructing the calls now
   * Recover the plan diffs
   * Changes the RexCall#equals to be semantic equivalent, which i think is
   not a good design, we should promote to have a digest abstraction just
   like RelNode


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455547776



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();
+  }
+
+  /** Computes the hashcode of a rex call, we ignore the operands sequence when
+   * the call is symmetrical. */
+  public static int hashCode(SqlOperator operator, List<RexNode> operands) {
+    if (!allowsNormalize()) {
+      return Objects.hash(operator, operands);
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return Objects.hash(operator, operands);
+    }
+    if (isSymmetricalCall(operator, operands.get(0), operands.get(1))) {
+      return Objects.hash(operator, unorderedHash(operands, 17));
+    }
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final int x = kind.reverse().compareTo(kind);
+    if (x < 0) {
+      // Note that Objects.hash([obj1, obj2]) is different from
+      // Objects.hash(obj1, obj2)
+      return Objects.hash(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operands.get(1), operands.get(0)));
+    } else {
+      return Objects.hash(operator, operands);
+    }
+  }
+
+  /** Returns whether a call is symmetrical. **/
+  private static boolean isSymmetricalCall(
+      SqlOperator operator,
+      RexNode operand0,
+      RexNode operand1) {
+    return operator.isSymmetrical()
+        || SqlKind.SYMMETRICAL_SAME_ARG_TYPE.contains(operator.getKind())
+            && SqlTypeUtil.equalSansNullability(operand0.getType(), operand1.getType());
+  }
+
+  /**
+   * The digest of {@code RexNode} is normalized by default.
+   *
+   * @return true if the digest needs to be normalized
+   */
+  private static boolean allowsNormalize() {
+    return CalciteSystemProperty.ENABLE_REX_DIGEST_NORMALIZE.value();
+  }
+
+  /** Compute a hash that is symmetric in its arguments - that is a hash
+   *  where the order of appearance of elements does not matter.
+   *  This is useful for hashing symmetric rex calls, for example.
+   */
+  private static int unorderedHash(List<?> xs, int seed) {

Review comment:
       Agree, would cache the normalized value instead.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454321119



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(
+        or(input(tBool(), 1), input(tBool(), 0)),

Review comment:
       Can you please use `vBool(1)` instead of `input(tBool(), 1)`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454935116



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(or(vBool(1), vBool(0)), vBool(0));
+    final RexNode node1 = and(vBool(0), or(vBool(0), vBool(1)));
+    assertNodeEquals(node0, node1, true);

Review comment:
       It looks like the nodes are used only once, so I would suggest to inline `node0`, `node1` and similar variables.
   
   For instance:
   
   ```java
   assertNodeEquals( // or assertNodeNotEqual
       and(or(vBool(1), vBool(0)), vBool(0)),
       and(vBool(0), or(vBool(0), vBool(1)))
   );
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454935116



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(or(vBool(1), vBool(0)), vBool(0));
+    final RexNode node1 = and(vBool(0), or(vBool(0), vBool(1)));
+    assertNodeEquals(node0, node1, true);

Review comment:
       It looks like the nodes are used only once, so I would suggest to inline `node0`, `node1` and similar variables.
   
   For instance:
   
   ```java
   assertNodeEquals(
       and(or(vBool(1), vBool(0)), vBool(0)),
       and(vBool(0), or(vBool(0), vBool(1)))
   );
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456172459



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       We might need to discuss on the mail thread about the name, personally i have no strong objections on the name, i did prefer `symmetrical` than `inputNonSensitive`, the latter is not even a terminology.
   
   The `SqlKind.SYMMETRICAL` was first added in 1.22 by @vlsi , maybe he would give better explanation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454319183



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(
+        or(input(tBool(), 1), input(tBool(), 0)),
+        input(tBool(), 0));
+    final RexNode node1 = and(
+        input(tBool(), 0),
+        or(input(tBool(), 0), input(tBool(), 1)));
+    assertThat(node0.equals(node1), is(true));
+
+    final RexNode node2 = eq(input(tVarchar(), 0), literal("0123456789012345"));
+    final RexNode node3 = eq(literal("0123456789012345"), input(tVarchar(), 0));
+    assertThat(node2.equals(node3), is(true));

Review comment:
       Please refactor this `assertThat(node2.equals(node3), is(true));` into a helper method, so the failure message contains the proper context information (e.g. nodes that are compared with each other)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456172459



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       We might need to discuss on the mail thread about the name, personally i have no strong objections on the name, personally i prefer `symmetrical` than `inputNonSensitive`, the latter is not even a terminology.
   
   The `SqlKind.SYMMETRICAL` was first added in 1.22 by @vlsi , maybe he would give better explanation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455545086



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();
+  }
+
+  /** Computes the hashcode of a rex call, we ignore the operands sequence when
+   * the call is symmetrical. */
+  public static int hashCode(SqlOperator operator, List<RexNode> operands) {
+    if (!allowsNormalize()) {
+      return Objects.hash(operator, operands);
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return Objects.hash(operator, operands);
+    }
+    if (isSymmetricalCall(operator, operands.get(0), operands.get(1))) {
+      return Objects.hash(operator, unorderedHash(operands, 17));
+    }
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final int x = kind.reverse().compareTo(kind);
+    if (x < 0) {
+      // Note that Objects.hash([obj1, obj2]) is different from
+      // Objects.hash(obj1, obj2)
+      return Objects.hash(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operands.get(1), operands.get(0)));
+    } else {
+      return Objects.hash(operator, operands);
+    }
+  }
+
+  /** Returns whether a call is symmetrical. **/
+  private static boolean isSymmetricalCall(
+      SqlOperator operator,
+      RexNode operand0,
+      RexNode operand1) {
+    return operator.isSymmetrical()
+        || SqlKind.SYMMETRICAL_SAME_ARG_TYPE.contains(operator.getKind())
+            && SqlTypeUtil.equalSansNullability(operand0.getType(), operand1.getType());
+  }
+
+  /**
+   * The digest of {@code RexNode} is normalized by default.
+   *
+   * @return true if the digest needs to be normalized
+   */
+  private static boolean allowsNormalize() {
+    return CalciteSystemProperty.ENABLE_REX_DIGEST_NORMALIZE.value();
+  }
+
+  /** Compute a hash that is symmetric in its arguments - that is a hash
+   *  where the order of appearance of elements does not matter.
+   *  This is useful for hashing symmetric rex calls, for example.
+   */
+  private static int unorderedHash(List<?> xs, int seed) {

Review comment:
       Could you please inline the seed into the method?
   I don't think we would need to change it often, so having less unknowns in the API helps.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455712592



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();

Review comment:
       Oh, i missed it, what's the difference.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456172459



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       We might need to discuss on the mail thread about the name, personally i have no strong objections on the name, but i prefer `symmetrical` is better than `inputNonSensitive`, the latter is not even a terminology.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 edited a comment on pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 edited a comment on pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#issuecomment-659110386


   > @danny0405 Can we merge the PR after 1.24 is released? Thanks.
   
   I think we have to, it is still under discussion. But it is best that we can include it in 1.24 because it reduces many plan diffs which are not necessary.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454322020



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(
+        or(input(tBool(), 1), input(tBool(), 0)),

Review comment:
       I can, let me fix it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454941737



##########
File path: core/src/test/resources/org/apache/calcite/test/TopDownOptTest.xml
##########
@@ -606,7 +606,7 @@ on r.job=s.job and r.ename=s.ename]]>
     <Resource name="planBefore">
       <![CDATA[
 LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3], ENAME0=[$4], JOB0=[$5], MGR=[$6], EXPR$3=[$7])
-  LogicalJoin(condition=[AND(=($1, $5), =($0, $4))], joinType=[inner])
+  LogicalJoin(condition=[AND(=($5, $1), =($4, $0))], joinType=[inner])

Review comment:
       This is just the original plan, without any operands reordering, it is changed in CALCITE-3786 and revert back in this PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] hsyuan commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
hsyuan commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456173374



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       Sure, you can send out to the mailing list.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] hsyuan commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
hsyuan commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456170450



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       So would you rather believe that Scala guys are always right, even when the dictionary clearly says otherwise?
   I do believe you have the common sense of what symmetric encryption algorithm is.
   FYI, the original method name of `MurmurHash3#unorderedHash` was `symmetricHash`, it was later [renamed](https://github.com/scala/scala/commit/60fb9ec19b50cd8059122ccffd72014b3eefc51f#diff-1c768a8500fbe02dc18f6333be992081R98) to `unorderedHash`, but the author forgot to update the comment.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455549625



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.hamcrest.CoreMatchers;
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.CoreMatchers.equalTo;
+import static org.hamcrest.MatcherAssert.assertThat;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    assertNodeEquals(
+        and(or(vBool(1), vBool(0)), vBool(0)),
+        and(vBool(0), or(vBool(0), vBool(1))));

Review comment:
       Can you please add a test case like
   
   ```
       assertNodeEquals(
           and(or(vBool(1), vBool(0)), vBool(0)),
           and(or(vBool(0), vBool(1)), vBool(0))
       );
   ```
   
   ?
   
   Does it pass?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454942390



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(or(vBool(1), vBool(0)), vBool(0));
+    final RexNode node1 = and(vBool(0), or(vBool(0), vBool(1)));
+    assertNodeEquals(node0, node1, true);

Review comment:
       Will change to inline, thanks ~




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] julianhyde commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
julianhyde commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456231827



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       I'm siding with @danny0405 on this one. His use conforms to the mathematical concept of a [symmetric function](https://en.wikipedia.org/wiki/Symmetric_function).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] hsyuan commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
hsyuan commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456685038



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       Well, if you think the meaning the method name `isSymmetrical` conveys is crystal clear and without any ambiguity...
   
   It has several meanings:
   - The one you mentioned above, input order insensitive (forgive my ignorance)
   - https://www.thefreedictionary.com/Symetrical or https://en.wikipedia.org/wiki/Symmetry_(geometry)
   - [Symmetric encryption](https://en.wikipedia.org/wiki/Symmetric-key_algorithm) uses it to denote encryption and decryption use the same key
   
   I do believe there is a reason Scala renamed `symmetricHash` to `unorderedHash`, [Cockroachdb](https://github.com/cockroachdb/cockroach/blob/a6a58f4bd1d6b9a7d9f0de2c690d35b51d4f41ca/pkg/sql/opt/optbuilder/groupby.go#L250) call the method `isOrderingSensitive` instead of others.
   
   For those who lacks of mathematical background, I don't think it is a good naming public API, which is not self explanatory.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454931771



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(or(vBool(1), vBool(0)), vBool(0));
+    final RexNode node1 = and(vBool(0), or(vBool(0), vBool(1)));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = eq(vVarchar(0), literal("0123456789012345"));
+    final RexNode node3 = eq(literal("0123456789012345"), vVarchar(0));
+    assertNodeEquals(node2, node3, true);
+
+    final RexNode node4 = eq(vVarchar(0), literal("01"));
+    final RexNode node5 = eq(literal("01"), vVarchar(0));
+    assertNodeEquals(node4, node5, true);
+  }
+
+  @Test void reversibleNormalizedToLess() {
+    // Same type operands.
+    final RexNode node0 = lt(vBool(0), vBool(0));
+    final RexNode node1 = gt(vBool(0), vBool(0));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = le(vBool(0), vBool(0));
+    final RexNode node3 = ge(vBool(0), vBool(0));
+    assertNodeEquals(node2, node3, true);
+
+    // Different type operands.
+    final RexNode node4 = lt(vSmallInt(0), vInt(1));
+    final RexNode node5 = gt(vInt(1), vSmallInt(0));
+    assertNodeEquals(node4, node5, true);
+
+    final RexNode node6 = le(vSmallInt(0), vInt(1));
+    final RexNode node7 = ge(vInt(1), vSmallInt(0));
+    assertNodeEquals(node6, node7, true);
+  }
+
+  @Test void reversibleDifferentArgTypesShouldNotBeShuffled() {
+    final RexNode node0 = plus(vSmallInt(1), vInt(0));
+    final RexNode node1 = plus(vInt(0), vSmallInt(1));
+    assertNodeEquals(node0, node1, false);
+
+    final RexNode node2 = mul(vSmallInt(0), vInt(1));
+    final RexNode node3 = mul(vInt(1), vSmallInt(0));
+    assertNodeEquals(node2, node3, false);
+  }
+
+  @Test void reversibleDifferentNullabilityArgsAreNormalized() {
+    final RexNode node0 = plus(vIntNotNull(0), vInt(1));
+    final RexNode node1 = plus(vInt(1), vIntNotNull(0));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = mul(vIntNotNull(1), vInt(0));
+    final RexNode node3 = mul(vInt(0), vIntNotNull(1));
+    assertNodeEquals(node2, node3, true);
+  }
+
+  @Test void symmetricalDifferentArgOps() {
+    final RexNode node0 = eq(vBool(0), vBool(1));
+    final RexNode node1 = eq(vBool(1), vBool(0));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = ne(vBool(0), vBool(1));
+    final RexNode node3 = ne(vBool(1), vBool(0));
+    assertNodeEquals(node2, node3, true);
+  }
+
+  @Test void reversibleDifferentArgOps() {
+    final RexNode node0 = lt(vBool(0), vBool(1));
+    final RexNode node1 = lt(vBool(1), vBool(0));
+    assertNodeEquals(node0, node1, false);
+
+    final RexNode node2 = le(vBool(0), vBool(1));
+    final RexNode node3 = le(vBool(1), vBool(0));
+    assertNodeEquals(node2, node3, false);
+
+    final RexNode node4 = gt(vBool(0), vBool(1));
+    final RexNode node5 = gt(vBool(1), vBool(0));
+    assertNodeEquals(node4, node5, false);
+
+    final RexNode node6 = ge(vBool(0), vBool(1));
+    final RexNode node7 = ge(vBool(1), vBool(0));
+    assertNodeEquals(node6, node7, false);
+  }
+
+  /** Assert whether two rex nodes are equal. */
+  private static void assertNodeEquals(RexNode node1, RexNode node2, boolean equal) {
+    StringBuilder reason = new StringBuilder("The two rex nodes expect to be ");
+    if (!equal) {
+      reason.append("not ");
+    }
+    reason.append("equal");

Review comment:
       Well, let me fix it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454935728



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(or(vBool(1), vBool(0)), vBool(0));
+    final RexNode node1 = and(vBool(0), or(vBool(0), vBool(1)));
+    assertNodeEquals(node0, node1, true);

Review comment:
       An alternative option is to use `Parameterized` tests.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456172459



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       We might need to discuss on the mail thread about the name, personally i have no strong objections on the name, personally i prefer `symmetrical` than `inputNonSensitive`, the latter is not even a terminology.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454929436



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(or(vBool(1), vBool(0)), vBool(0));
+    final RexNode node1 = and(vBool(0), or(vBool(0), vBool(1)));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = eq(vVarchar(0), literal("0123456789012345"));
+    final RexNode node3 = eq(literal("0123456789012345"), vVarchar(0));
+    assertNodeEquals(node2, node3, true);
+
+    final RexNode node4 = eq(vVarchar(0), literal("01"));
+    final RexNode node5 = eq(literal("01"), vVarchar(0));
+    assertNodeEquals(node4, node5, true);
+  }
+
+  @Test void reversibleNormalizedToLess() {
+    // Same type operands.
+    final RexNode node0 = lt(vBool(0), vBool(0));
+    final RexNode node1 = gt(vBool(0), vBool(0));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = le(vBool(0), vBool(0));
+    final RexNode node3 = ge(vBool(0), vBool(0));
+    assertNodeEquals(node2, node3, true);
+
+    // Different type operands.
+    final RexNode node4 = lt(vSmallInt(0), vInt(1));
+    final RexNode node5 = gt(vInt(1), vSmallInt(0));
+    assertNodeEquals(node4, node5, true);
+
+    final RexNode node6 = le(vSmallInt(0), vInt(1));
+    final RexNode node7 = ge(vInt(1), vSmallInt(0));
+    assertNodeEquals(node6, node7, true);
+  }
+
+  @Test void reversibleDifferentArgTypesShouldNotBeShuffled() {
+    final RexNode node0 = plus(vSmallInt(1), vInt(0));
+    final RexNode node1 = plus(vInt(0), vSmallInt(1));
+    assertNodeEquals(node0, node1, false);
+
+    final RexNode node2 = mul(vSmallInt(0), vInt(1));
+    final RexNode node3 = mul(vInt(1), vSmallInt(0));
+    assertNodeEquals(node2, node3, false);
+  }
+
+  @Test void reversibleDifferentNullabilityArgsAreNormalized() {
+    final RexNode node0 = plus(vIntNotNull(0), vInt(1));
+    final RexNode node1 = plus(vInt(1), vIntNotNull(0));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = mul(vIntNotNull(1), vInt(0));
+    final RexNode node3 = mul(vInt(0), vIntNotNull(1));
+    assertNodeEquals(node2, node3, true);
+  }
+
+  @Test void symmetricalDifferentArgOps() {
+    final RexNode node0 = eq(vBool(0), vBool(1));
+    final RexNode node1 = eq(vBool(1), vBool(0));
+    assertNodeEquals(node0, node1, true);
+
+    final RexNode node2 = ne(vBool(0), vBool(1));
+    final RexNode node3 = ne(vBool(1), vBool(0));
+    assertNodeEquals(node2, node3, true);
+  }
+
+  @Test void reversibleDifferentArgOps() {
+    final RexNode node0 = lt(vBool(0), vBool(1));
+    final RexNode node1 = lt(vBool(1), vBool(0));
+    assertNodeEquals(node0, node1, false);
+
+    final RexNode node2 = le(vBool(0), vBool(1));
+    final RexNode node3 = le(vBool(1), vBool(0));
+    assertNodeEquals(node2, node3, false);
+
+    final RexNode node4 = gt(vBool(0), vBool(1));
+    final RexNode node5 = gt(vBool(1), vBool(0));
+    assertNodeEquals(node4, node5, false);
+
+    final RexNode node6 = ge(vBool(0), vBool(1));
+    final RexNode node7 = ge(vBool(1), vBool(0));
+    assertNodeEquals(node6, node7, false);
+  }
+
+  /** Assert whether two rex nodes are equal. */
+  private static void assertNodeEquals(RexNode node1, RexNode node2, boolean equal) {
+    StringBuilder reason = new StringBuilder("The two rex nodes expect to be ");
+    if (!equal) {
+      reason.append("not ");
+    }
+    reason.append("equal");

Review comment:
       @danny0405 , the reason must include the original node representations.
   
   The error message like `The two rex nodes expect to be equal, expected: true got: false` does not really help to understand the nature of the failure.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] hsyuan commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
hsyuan commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455806392



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();
+  }
+
+  /** Computes the hashcode of a rex call, we ignore the operands sequence when
+   * the call is symmetrical. */
+  public static int hashCode(SqlOperator operator, List<RexNode> operands) {
+    if (!allowsNormalize()) {
+      return Objects.hash(operator, operands);
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return Objects.hash(operator, operands);
+    }
+    if (isSymmetricalCall(operator, operands.get(0), operands.get(1))) {
+      return Objects.hash(operator, unorderedHash(operands, 17));
+    }
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final int x = kind.reverse().compareTo(kind);
+    if (x < 0) {
+      // Note that Objects.hash([obj1, obj2]) is different from
+      // Objects.hash(obj1, obj2)
+      return Objects.hash(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operands.get(1), operands.get(0)));
+    } else {
+      return Objects.hash(operator, operands);
+    }
+  }
+
+  /** Returns whether a call is symmetrical. **/
+  private static boolean isSymmetricalCall(
+      SqlOperator operator,
+      RexNode operand0,
+      RexNode operand1) {
+    return operator.isSymmetrical()
+        || SqlKind.SYMMETRICAL_SAME_ARG_TYPE.contains(operator.getKind())
+            && SqlTypeUtil.equalSansNullability(operand0.getType(), operand1.getType());
+  }
+
+  /**
+   * The digest of {@code RexNode} is normalized by default.
+   *
+   * @return true if the digest needs to be normalized
+   */
+  private static boolean allowsNormalize() {
+    return CalciteSystemProperty.ENABLE_REX_DIGEST_NORMALIZE.value();
+  }
+
+  /** Compute a hash that is symmetric in its arguments - that is a hash
+   *  where the order of appearance of elements does not matter.
+   *  This is useful for hashing symmetric rex calls, for example.
+   */
+  private static int unorderedHash(List<?> xs, int seed) {

Review comment:
       The thing is it is quite possible that you don't even get chance to call `equals` at all, in which case, it is wasteful to normalize it in computing hashcode.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455815612



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();
+  }
+
+  /** Computes the hashcode of a rex call, we ignore the operands sequence when
+   * the call is symmetrical. */
+  public static int hashCode(SqlOperator operator, List<RexNode> operands) {
+    if (!allowsNormalize()) {
+      return Objects.hash(operator, operands);
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return Objects.hash(operator, operands);
+    }
+    if (isSymmetricalCall(operator, operands.get(0), operands.get(1))) {
+      return Objects.hash(operator, unorderedHash(operands, 17));
+    }
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final int x = kind.reverse().compareTo(kind);
+    if (x < 0) {
+      // Note that Objects.hash([obj1, obj2]) is different from
+      // Objects.hash(obj1, obj2)
+      return Objects.hash(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operands.get(1), operands.get(0)));
+    } else {
+      return Objects.hash(operator, operands);
+    }
+  }
+
+  /** Returns whether a call is symmetrical. **/
+  private static boolean isSymmetricalCall(
+      SqlOperator operator,
+      RexNode operand0,
+      RexNode operand1) {
+    return operator.isSymmetrical()
+        || SqlKind.SYMMETRICAL_SAME_ARG_TYPE.contains(operator.getKind())
+            && SqlTypeUtil.equalSansNullability(operand0.getType(), operand1.getType());
+  }
+
+  /**
+   * The digest of {@code RexNode} is normalized by default.
+   *
+   * @return true if the digest needs to be normalized
+   */
+  private static boolean allowsNormalize() {
+    return CalciteSystemProperty.ENABLE_REX_DIGEST_NORMALIZE.value();
+  }
+
+  /** Compute a hash that is symmetric in its arguments - that is a hash
+   *  where the order of appearance of elements does not matter.
+   *  This is useful for hashing symmetric rex calls, for example.
+   */
+  private static int unorderedHash(List<?> xs, int seed) {

Review comment:
       That makes sense!
   
   In other words, use `unorderedHash` in `hashCode`, and compute/store the normalization only for `equals`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454936852



##########
File path: core/src/test/resources/org/apache/calcite/test/TopDownOptTest.xml
##########
@@ -606,7 +606,7 @@ on r.job=s.job and r.ename=s.ename]]>
     <Resource name="planBefore">
       <![CDATA[
 LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3], ENAME0=[$4], JOB0=[$5], MGR=[$6], EXPR$3=[$7])
-  LogicalJoin(condition=[AND(=($1, $5), =($0, $4))], joinType=[inner])
+  LogicalJoin(condition=[AND(=($5, $1), =($4, $0))], joinType=[inner])

Review comment:
       This change looks odd. Is there a reason why `$5` is placed before `$1`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456143492



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       Not that misleading, i found the `MurmurHash3#unorderedHash` also use the word symmetric to describe the similar semantic.
   
   [1] https://github.com/scala/scala/blob/2.11.x/src/library/scala/util/hashing/MurmurHash3.scala#L88




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] hsyuan commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
hsyuan commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455798440



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       The word SYMMETRICAL is very misleading, especially for public method `isSymmetrical`. It doesn't mean order insensitive.
   https://www.thefreedictionary.com/Symetrical




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 merged pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 merged pull request #2065:
URL: https://github.com/apache/calcite/pull/2065


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456143492



##########
File path: core/src/main/java/org/apache/calcite/sql/SqlOperator.java
##########
@@ -929,6 +929,17 @@ public boolean isDeterministic() {
     return true;
   }
 
+  /**
+   * Returns whether a call to this operator is not sensitive to the operands input order.
+   * An operator is symmetrical if the call returns the same result when
+   * the operands are shuffled.
+   *
+   * <p>By default, returns true for {@link SqlKind#SYMMETRICAL}.
+   */
+  public boolean isSymmetrical() {
+    return SqlKind.SYMMETRICAL.contains(kind);

Review comment:
       Yes, we might have a discussion in the mail thread for the name 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#issuecomment-659110386


   > @danny0405 Can we merge the PR after 1.24 is released? Thanks.
   
   I think we have to, it is still under discussion. But i think it is best that we can include it in 1.24 because it reduces many plan diffs which are not necessary.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r454322335



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.core.Is.is;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+  @Test void digestIsNormalized() {
+    final RexNode node0 = and(
+        or(input(tBool(), 1), input(tBool(), 0)),
+        input(tBool(), 0));
+    final RexNode node1 = and(
+        input(tBool(), 0),
+        or(input(tBool(), 0), input(tBool(), 1)));
+    assertThat(node0.equals(node1), is(true));
+
+    final RexNode node2 = eq(input(tVarchar(), 0), literal("0123456789012345"));
+    final RexNode node3 = eq(literal("0123456789012345"), input(tVarchar(), 0));
+    assertThat(node2.equals(node3), is(true));

Review comment:
       Okey.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455546439



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();
+  }
+
+  /** Computes the hashcode of a rex call, we ignore the operands sequence when
+   * the call is symmetrical. */
+  public static int hashCode(SqlOperator operator, List<RexNode> operands) {
+    if (!allowsNormalize()) {
+      return Objects.hash(operator, operands);
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return Objects.hash(operator, operands);
+    }
+    if (isSymmetricalCall(operator, operands.get(0), operands.get(1))) {
+      return Objects.hash(operator, unorderedHash(operands, 17));
+    }
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final int x = kind.reverse().compareTo(kind);
+    if (x < 0) {
+      // Note that Objects.hash([obj1, obj2]) is different from
+      // Objects.hash(obj1, obj2)
+      return Objects.hash(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operands.get(1), operands.get(0)));
+    } else {
+      return Objects.hash(operator, operands);
+    }
+  }
+
+  /** Returns whether a call is symmetrical. **/
+  private static boolean isSymmetricalCall(
+      SqlOperator operator,
+      RexNode operand0,
+      RexNode operand1) {
+    return operator.isSymmetrical()
+        || SqlKind.SYMMETRICAL_SAME_ARG_TYPE.contains(operator.getKind())
+            && SqlTypeUtil.equalSansNullability(operand0.getType(), operand1.getType());
+  }
+
+  /**
+   * The digest of {@code RexNode} is normalized by default.
+   *
+   * @return true if the digest needs to be normalized
+   */
+  private static boolean allowsNormalize() {
+    return CalciteSystemProperty.ENABLE_REX_DIGEST_NORMALIZE.value();
+  }
+
+  /** Compute a hash that is symmetric in its arguments - that is a hash
+   *  where the order of appearance of elements does not matter.
+   *  This is useful for hashing symmetric rex calls, for example.
+   */
+  private static int unorderedHash(List<?> xs, int seed) {

Review comment:
       By the way, I don't think `unorderedHash` helps here.
   We still need to be able to implement `equals`, and I don't think there's an easy way to compute `equals` without normalization.
   
   Then, I think we could normalize and cache the normalized value, then we can use a regular hash functions rather than `unordered` ones.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] vlsi commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455704975



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();

Review comment:
       Please use `Integer#compare`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] chunweilei commented on pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
chunweilei commented on pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#issuecomment-658750345


   @danny0405 Can we merge the PR after 1.24 is released? Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r455557577



##########
File path: core/src/test/java/org/apache/calcite/rex/RexNormalizeTest.java
##########
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.hamcrest.CoreMatchers;
+import org.junit.jupiter.api.Test;
+
+import static org.hamcrest.CoreMatchers.equalTo;
+import static org.hamcrest.MatcherAssert.assertThat;
+
+/** Test cases for {@link RexNormalize}. */
+class RexNormalizeTest extends RexProgramTestBase {
+
+  @Test void digestIsNormalized() {
+    assertNodeEquals(
+        and(or(vBool(1), vBool(0)), vBool(0)),
+        and(vBool(0), or(vBool(0), vBool(1))));

Review comment:
       Added and passed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [calcite] danny0405 commented on a change in pull request #2065: [CALCITE-4073] Add a new component RexNormalize for more effect rex n…

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #2065:
URL: https://github.com/apache/calcite/pull/2065#discussion_r456173974



##########
File path: core/src/main/java/org/apache/calcite/rex/RexNormalize.java
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.rex;
+
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.util.Pair;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apiguardian.api.API;
+
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * Context required to normalize a row-expression.
+ *
+ * <p>Currently, only simple normalization is supported, such as:
+ *
+ * <ul>
+ *   <li>$2 = $1 &rarr; $1 = $2</li>
+ *   <li>$2 &gt; $1 &rarr; $1 &lt; $2</li>
+ *   <li>1.23 = $1 &rarr; $1 = 1.23</li>
+ *   <li>OR(OR(udf($1), $2), $3) &rarr; OR($3, OR($2, udf($1)))</li>
+ * </ul>
+ *
+ * <p>In the future, this component may extend to support more normalization cases
+ * for general promotion. e.g. the strategy to decide which operand is more complex
+ * should be more smart.
+ *
+ * <p>There is no one normalization strategy that works for all cases, and no consensus about what
+ * the desired strategies should be. So by default, the normalization is disabled. We do force
+ * normalization when computing the digest of {@link RexCall}s during planner planning.
+ */
+public class RexNormalize {
+
+  private RexNormalize() {}
+
+  /**
+   * Normalizes the variables of a rex call.
+   *
+   * @param operator The operator
+   * @param operands The operands
+   *
+   * @return normalized variables of the call or the original
+   * if there is no need to normalize
+   */
+  @API(since = "1.24", status = API.Status.EXPERIMENTAL)
+  public static Pair<SqlOperator, List<RexNode>> normalize(
+      SqlOperator operator,
+      List<RexNode> operands) {
+    final Pair<SqlOperator, List<RexNode>> original = Pair.of(operator, operands);
+    if (!allowsNormalize()) {
+      return original;
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return original;
+    }
+
+    final RexNode operand0 = operands.get(0);
+    final RexNode operand1 = operands.get(1);
+
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final SqlKind reversedKind = kind.reverse();
+    final int x = reversedKind.compareTo(kind);
+    if (x < 0) {
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    if (x > 0) {
+      return original;
+    }
+
+    if (!isSymmetricalCall(operator, operand0, operand1)) {
+      return original;
+    }
+
+    if (reorderOperands(operand0, operand1) < 0) {
+      // $0=$1 is the same as $1=$0, so we make sure the digest is the same for them.
+
+      // When $1 > $0 is normalized, the operation needs to be flipped
+      // so we sort arguments first, then flip the sign.
+      return Pair.of(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operand1, operand0));
+    }
+    return original;
+  }
+
+  /**
+   * Compares two operands to see which one should be normalized to be in front of the other.
+   *
+   * <p>We can always use the #hashCode to reorder the operands, do it as a fallback to keep
+   * good readability.
+   *
+   * @param operand0  First operand
+   * @param operand1  Second operand
+   *
+   * @return non-negative (>=0) if {@code operand0} should be in the front,
+   * negative if {@code operand1} should be in the front
+   */
+  private static int reorderOperands(RexNode operand0, RexNode operand1) {
+    // Reorder the operands based on the SqlKind enumeration sequence,
+    // smaller is in the behind, e.g. the literal is behind of input ref and AND, OR.
+    int x = operand0.getKind().compareTo(operand1.getKind());
+    // If the operands are same kind, use the hashcode to reorder.
+    // Note: the RexInputRef's hash code is its index.
+    return x != 0 ? x : operand1.hashCode() - operand0.hashCode();
+  }
+
+  /** Computes the hashcode of a rex call, we ignore the operands sequence when
+   * the call is symmetrical. */
+  public static int hashCode(SqlOperator operator, List<RexNode> operands) {
+    if (!allowsNormalize()) {
+      return Objects.hash(operator, operands);
+    }
+    // Quick check.
+    if (operands.size() != 2) {
+      return Objects.hash(operator, operands);
+    }
+    if (isSymmetricalCall(operator, operands.get(0), operands.get(1))) {
+      return Objects.hash(operator, unorderedHash(operands, 17));
+    }
+    // If arguments are the same, then we normalize < vs >
+    // '<' == 60, '>' == 62, so we prefer <.
+    final SqlKind kind = operator.getKind();
+    final int x = kind.reverse().compareTo(kind);
+    if (x < 0) {
+      // Note that Objects.hash([obj1, obj2]) is different from
+      // Objects.hash(obj1, obj2)
+      return Objects.hash(
+          SqlStdOperatorTable.reverse(operator),
+          ImmutableList.of(operands.get(1), operands.get(0)));
+    } else {
+      return Objects.hash(operator, operands);
+    }
+  }
+
+  /** Returns whether a call is symmetrical. **/
+  private static boolean isSymmetricalCall(
+      SqlOperator operator,
+      RexNode operand0,
+      RexNode operand1) {
+    return operator.isSymmetrical()
+        || SqlKind.SYMMETRICAL_SAME_ARG_TYPE.contains(operator.getKind())
+            && SqlTypeUtil.equalSansNullability(operand0.getType(), operand1.getType());
+  }
+
+  /**
+   * The digest of {@code RexNode} is normalized by default.
+   *
+   * @return true if the digest needs to be normalized
+   */
+  private static boolean allowsNormalize() {
+    return CalciteSystemProperty.ENABLE_REX_DIGEST_NORMALIZE.value();
+  }
+
+  /** Compute a hash that is symmetric in its arguments - that is a hash
+   *  where the order of appearance of elements does not matter.
+   *  This is useful for hashing symmetric rex calls, for example.
+   */
+  private static int unorderedHash(List<?> xs, int seed) {

Review comment:
       I'm fine to keep the `unorderedHash`, one drawback with it is that the logic to decide whether operands need reordering must be strictly same with the `#normalize`, which i don't like because it is error-prone.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org