Posted to reviews@spark.apache.org by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/11/07 13:24:07 UTC

Re: [PR] [SPARK-35564][SQL] Improve subexpression elimination [spark]

cloud-fan commented on code in PR #41677:
URL: https://github.com/apache/spark/pull/41677#discussion_r1384909292


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala:
##########
@@ -30,211 +30,382 @@ import org.apache.spark.util.Utils
  * This class is used to compute equality of (sub)expression trees. Expressions can be added
  * to this class and they subsequently query for expression equality. Expression trees are
  * considered equal if for the same input(s), the same result is produced.
+ *
+ * Please note that `EquivalentExpressions` is mainly used in subexpression elimination where common
+ * non-leaf expression subtrees are calculated, but there is one special use case in
+ * `PhysicalAggregation` where `EquivalentExpressions` is used as a mutable set of non-deterministic
+ * expressions. For that special use case we have the `allowLeafExpressions` config.
  */
 class EquivalentExpressions(
-    skipForShortcutEnable: Boolean = SQLConf.get.subexpressionEliminationSkipForShotcutExpr) {
+    skipForShortcutEnable: Boolean = SQLConf.get.subexpressionEliminationSkipForShotcutExpr,
+    minConditionalCount: Option[Double] =
+      Some(SQLConf.get.subexpressionEliminationMinExpectedConditionalEvaluationCount)
+        .filter(_ >= 0d),
+    allowLeafExpressions: Boolean = false) {
+
+  // The subexpressions are stored by height to speed up certain calculations.

Review Comment:
   Do you mean "sorted by height"?
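
The wording matters because "stored by height" only says that height is the key, while "sorted by height" also promises an iteration order. A minimal sketch of the difference, assuming a simplified expression tree: `Expr`, `Leaf`, `Add`, and `HeightIndex` are hypothetical names for illustration and are not the Catalyst classes or the data structure this PR actually uses.

```scala
import scala.collection.immutable.TreeMap

// Hypothetical, simplified expression tree; NOT Catalyst's Expression.
sealed trait Expr { def children: Seq[Expr] }
case class Leaf(name: String) extends Expr {
  val children: Seq[Expr] = Nil
}
case class Add(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
}

object HeightIndex {
  // A leaf has height 1; a parent is one taller than its tallest child.
  def height(e: Expr): Int =
    if (e.children.isEmpty) 1 else 1 + e.children.map(height).max

  // All subtrees of `e`, including `e` itself.
  def subtrees(e: Expr): Seq[Expr] = e +: e.children.flatMap(subtrees)

  // "Stored by height": height is the key, but a plain Map gives
  // no guarantee about the order in which heights are visited.
  def storedByHeight(e: Expr): Map[Int, Set[Expr]] =
    subtrees(e).groupBy(height).map { case (h, es) => h -> es.toSet }

  // "Sorted by height": a TreeMap iterates in ascending key order,
  // so shorter subtrees are always visited before taller ones.
  def sortedByHeight(e: Expr): TreeMap[Int, Set[Expr]] =
    TreeMap(storedByHeight(e).toSeq: _*)
}
```

For `Add(Add(Leaf("a"), Leaf("b")), Leaf("c"))`, `HeightIndex.sortedByHeight(...)` yields keys in the order 1, 2, 3. If the code relies on that ordering, the comment should say "sorted"; if it only needs lookup by height, "stored" is fine.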



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

