Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/04 01:44:02 UTC

[GitHub] [spark] JoshRosen opened a new pull request, #36765: [SPARK-39259][FOLLOWUP] Fix source and binary incompatibilities in transformDownWithSubqueries

JoshRosen opened a new pull request, #36765:
URL: https://github.com/apache/spark/pull/36765

   ### What changes were proposed in this pull request?
   
   This is a followup to #36654. That PR modified `QueryPlan.transformDownWithSubqueries` to add tree pattern pruning.
   
   In this PR, I roll back the change to that method's signature and instead add a new `transformDownWithSubqueriesAndPruning` method.
   
   ### Why are the changes needed?
   
   The original change breaks binary and source compatibility in Catalyst. Technically speaking, Catalyst APIs are considered internal to Spark and are subject to change between minor releases (see [source](https://github.com/apache/spark/blob/bb51add5c79558df863d37965603387d40cc4387/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/package.scala#L20-L24)), but I think it's nice to try to avoid API breakage when possible.
   
   While compiling some custom Catalyst code, I ran into issues when calling the `transformDownWithSubqueries` method without supplying pattern bits. If I write `transformDownWithSubqueries() { f }` then I get a compilation error. I think this is because the first parameter group consists entirely of default parameters.
   
   This PR's solution of adding a new `transformDownWithSubqueriesAndPruning` method solves this problem. It's also more consistent with the naming convention of other pruning-enabled tree transformation methods.
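
A minimal sketch of the delegation pattern described above, using a hypothetical toy `Node` tree rather than Spark's `QueryPlan`. The names `Node`, `labels`, `transformDown`, and `transformDownWithPruning` are assumptions made for illustration only. The label set stands in for `TreePatternBits`, and the toy tree has no subqueries. The point is only that the original single-parameter-list method keeps its signature and forwards to a new pruning-aware method:

```scala
object PruningSketch {
  // Hypothetical toy tree; this is NOT Spark's QueryPlan.
  final case class Node(label: String, children: Seq[Node] = Nil) {
    // Labels present anywhere in this subtree (a stand-in for TreePatternBits).
    lazy val labels: Set[String] = children.flatMap(_.labels).toSet + label

    // Source-compatible entry point: one parameter list, no pruning arguments.
    // It delegates to the pruning variant with a condition that always passes.
    def transformDown(f: PartialFunction[Node, Node]): Node =
      transformDownWithPruning(_ => true)(f)

    // Pruning-enabled variant: a subtree whose labels fail `cond` is returned
    // unchanged without being traversed.
    def transformDownWithPruning(cond: Set[String] => Boolean)(
        f: PartialFunction[Node, Node]): Node = {
      if (!cond(labels)) {
        this
      } else {
        // Pre-order: apply the rule to this node first, then recurse into children.
        val afterRule = f.applyOrElse(this, identity[Node])
        afterRule.copy(
          children = afterRule.children.map(_.transformDownWithPruning(cond)(f)))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val tree = Node("project", Seq(Node("filter", Seq(Node("scan")))))
    val rule: PartialFunction[Node, Node] = {
      case n @ Node("scan", _) => n.copy(label = "cachedScan")
    }
    // Existing call sites keep working with the single-argument form.
    println(tree.transformDown(rule))
    // New call sites can opt in to pruning explicitly.
    println(tree.transformDownWithPruning(_.contains("join"))(rule))
  }
}
```

In this sketch, callers that never cared about pruning are unaffected, while callers that want pruning name it explicitly, which is the same split the PR makes between `transformDownWithSubqueries` and `transformDownWithSubqueriesAndPruning`.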
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Existing tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #36765: [SPARK-39259][SQL][FOLLOWUP] Fix source and binary incompatibilities in transformDownWithSubqueries

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on PR #36765:
URL: https://github.com/apache/spark/pull/36765#issuecomment-1146546802

   +1, LGTM. Merging to master/3.3.
   Thank you, @JoshRosen.




[GitHub] [spark] MaxGekk closed pull request #36765: [SPARK-39259][SQL][FOLLOWUP] Fix source and binary incompatibilities in transformDownWithSubqueries

Posted by GitBox <gi...@apache.org>.
MaxGekk closed pull request #36765: [SPARK-39259][SQL][FOLLOWUP] Fix source and binary incompatibilities in transformDownWithSubqueries
URL: https://github.com/apache/spark/pull/36765




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36765: [SPARK-39259][SQL][FOLLOWUP] Fix source and binary incompatibilities in transformDownWithSubqueries

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #36765:
URL: https://github.com/apache/spark/pull/36765#discussion_r889500183


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -479,18 +479,28 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
    * first to this node, then this node's subqueries and finally this node's children.
    * When the partial function does not apply to a given node, it is left unchanged.
    */
-  def transformDownWithSubqueries(
-    cond: TreePatternBits => Boolean = AlwaysProcess.fn, ruleId: RuleId = UnknownRuleId)
-    (f: PartialFunction[PlanType, PlanType])
-: PlanType = {
+  def transformDownWithSubqueries(f: PartialFunction[PlanType, PlanType]): PlanType = {
+    transformDownWithSubqueriesAndPruning(AlwaysProcess.fn, UnknownRuleId)(f)
+  }
+
+  /**
+   * This method is the top-down (pre-order) counterpart of transformUpWithSubqueries.
+   * Returns a copy of this node where the given partial function has been recursively applied
+   * first to this node, then this node's subqueries and finally this node's children.
+   * When the partial function does not apply to a given node, it is left unchanged.
+   */
+  def transformDownWithSubqueriesAndPruning(

Review Comment:
   I was about to say we shouldn't make these changes for binary compatibility of an internal API (e.g., https://github.com/apache/spark/pull/35378), but after reading the code, it looks more like a refactoring. So LGTM from me too.


