You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "rednaxelafx (via GitHub)" <gi...@apache.org> on 2023/03/21 00:38:10 UTC

[GitHub] [spark] rednaxelafx commented on pull request #40488: [SPARK-42851][SQL] Replace EquivalentExpressions with mutable map in PhysicalAggregation

rednaxelafx commented on PR #40488:
URL: https://github.com/apache/spark/pull/40488#issuecomment-1477131544

   Before the recent rounds of changes to EquivalentExpressions, the old `addExprTree` used to call `addExpr` in its core:
   https://github.com/apache/spark/blob/branch-2.1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala#L90
   That was still the case when `PhysicalAggregation` started using this mechanism to deduplicate expressions. I guess it started becoming "detached" from the main path when the recent refactoring happened that allows updating a separate equivalence map instead of the "main" one.
   
   Your proposed PR here further orphans that function from any actual use. Which is okay for keeping binary compatibility as much as possible.
   The inlined dedup logic in `PhysicalAggregation` looks rather ugly though. I don't have a strong opinion about that as long as other reviewers are okay. I'd prefer still retaining some sort of encapsulated collection for the dedup usage.
   
   BTW I updated my PR's test case because it makes more sense to check the return value from `addExpr` on the second invocation rather than on the first (both "not supported expression" and actual new expression cases would have gotten a `false` return value if we add that guard to the `addExpr()` function).
   https://github.com/apache/spark/pull/40473/commits/28d101ee6765c5453189fa62d6b8ade1568d99d2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org