You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/01 17:44:28 UTC

[GitHub] [spark] allisonwang-db commented on a diff in pull request #38851: [SPARK-41338][SQL] Resolve outer references and normal columns in the same analyzer batch

allisonwang-db commented on code in PR #38851:
URL: https://github.com/apache/spark/pull/38851#discussion_r1037402070


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -2109,6 +2110,51 @@ class Analyzer(override val catalogManager: CatalogManager)
     }
   }
 
+  /**
+   * Resolves `UnresolvedAttribute` to `OuterReference` if we are resolving subquery plans (when
+   * `AnalysisContext.get.outerPlan` is set).
+   */
+  object ResolveOuterReferences extends Rule[LogicalPlan] {
+    override def apply(plan: LogicalPlan): LogicalPlan = {
+      // Only apply this rule if we are resolving subquery plans.
+      if (AnalysisContext.get.outerPlan.isEmpty) return plan
+
+      // We must run these 3 rules first, as they also resolve `UnresolvedAttribute` and have
+      // higher priority than outer reference resolution.
+      val prepared = ResolveAggregateFunctions(ResolveMissingReferences(ResolveReferences(plan)))
+      prepared.resolveOperatorsDownWithPruning(_.containsPattern(UNRESOLVED_ATTRIBUTE)) {
+        // Handle `Generate` specially here, because `Generate.generatorOutput` starts with
+        // `UnresolvedAttribute` but we should never resolve it to outer references. It's a bit
+        // hacky that `Generate` uses `UnresolvedAttribute` to store the generator column names,
+        // we should clean it up later.
+        case g: Generate if g.childrenResolved && !g.resolved =>

Review Comment:
   Do we have a unit test for this case?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -2109,6 +2110,51 @@ class Analyzer(override val catalogManager: CatalogManager)
     }
   }
 
+  /**
+   * Resolves `UnresolvedAttribute` to `OuterReference` if we are resolving subquery plans (when
+   * `AnalysisContext.get.outerPlan` is set).
+   */
+  object ResolveOuterReferences extends Rule[LogicalPlan] {
+    override def apply(plan: LogicalPlan): LogicalPlan = {
+      // Only apply this rule if we are resolving subquery plans.
+      if (AnalysisContext.get.outerPlan.isEmpty) return plan
+
+      // We must run these 3 rules first, as they also resolve `UnresolvedAttribute` and have
+      // higher priority than outer reference resolution.
+      val prepared = ResolveAggregateFunctions(ResolveMissingReferences(ResolveReferences(plan)))

Review Comment:
   I guess one disadvantage of running these three rules inside ResolveOuterReferences is that they are not visible in the plan change log.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org