Posted to reviews@spark.apache.org by "dtenedor (via GitHub)" <gi...@apache.org> on 2023/08/24 22:36:48 UTC

[GitHub] [spark] dtenedor commented on a diff in pull request #42663: [SPARK-44952][SQL][PYTHON] Support named arguments in aggregate Pandas UDFs

dtenedor commented on code in PR #42663:
URL: https://github.com/apache/spark/pull/42663#discussion_r1304938615


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -3021,6 +3021,13 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
           ne
         case e: Expression if e.foldable =>
           e // No need to create an attribute reference if it will be evaluated as a Literal.
+        case e: NamedArgumentExpression =>
+          // For NamedArgumentExpression, we extract the value and replace it with
+          // an AttributeReference (with an internal column name, e.g. "_w0").
+          NamedArgumentExpression(
+            e.key,
+            extractedExprMap.getOrElseUpdate(e.canonicalized,
+              Alias(e.value, s"_w${extractedExprMap.size}")()).toAttribute)

Review Comment:
   Good question. `NamedArgumentExpression` is marked as `Unevaluable` because it is only meant to exist during analysis: the analyzer matches the (name, value) argument pairs provided in a function call against the expected ordered parameter list (including parameter names and types) of the function signature.
   
   By the end of analysis, these expressions should be gone, because the provided function arguments have been rearranged (if necessary) to match the expected order [1].
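   
   To illustrate the idea, here is a minimal sketch of that rearrangement. It is not the actual Spark code: `Arg`, `Positional`, `Named`, and `RearrangeSketch.rearrange` are hypothetical stand-ins that use plain strings instead of Catalyst expressions, and the error handling is simplified.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
sealed trait Arg
case class Positional(value: String) extends Arg
case class Named(key: String, value: String) extends Arg

object RearrangeSketch {
  // Reorders the provided arguments to follow `paramNames`, the expected
  // ordered parameter list. Leading positional arguments fill the first
  // slots; the remaining slots are filled from named arguments by key.
  def rearrange(paramNames: Seq[String], args: Seq[Arg]): Either[String, Seq[String]] = {
    val positional = args.takeWhile(_.isInstanceOf[Positional]).collect {
      case Positional(v) => v
    }
    val rest = args.drop(positional.size)
    if (rest.exists(_.isInstanceOf[Positional])) {
      Left("positional argument after named argument")
    } else if (positional.size > paramNames.size) {
      Left("too many arguments")
    } else {
      val named = rest.collect { case Named(k, v) => k -> v }
      val keys = named.map(_._1)
      // Parameters not already covered by positional arguments.
      val remaining = paramNames.drop(positional.size)
      if (keys.distinct.size != keys.size) {
        Left("duplicate named argument")
      } else {
        keys.find(k => !remaining.contains(k)) match {
          case Some(k) => Left(s"unknown or already assigned argument: $k")
          case None =>
            val byName = named.toMap
            remaining.find(p => !byName.contains(p)) match {
              case Some(p) => Left(s"missing argument: $p")
              // After this point no name/value pairs remain, only values
              // in signature order.
              case None => Right(positional ++ remaining.map(byName))
            }
        }
      }
    }
  }
}
```

   The point of the sketch is the last case: once every argument has been placed in its slot, the name wrappers are no longer needed, which is why nothing downstream should ever have to evaluate a `NamedArgumentExpression`.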
   
   We might just want to deduplicate the logic from L3034-3035 into one place that we can reuse here, but the general idea looks right.
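   
   As a rough sketch of what such a shared helper could look like (an assumption for discussion, not a proposal for the exact code): `Expr`, `Column`, `Add`, `NamedArg`, `Alias`, and `ExtractSketch` below are hypothetical simplified stand-ins rather than the Catalyst classes, and the map is keyed on the expression itself instead of `e.canonicalized`.

```scala
import scala.collection.mutable

// Simplified stand-ins for illustration only; these are NOT Spark's classes.
sealed trait Expr
case class Column(name: String) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
case class NamedArg(key: String, value: Expr) extends Expr
case class Alias(child: Expr, name: String)

object ExtractSketch {
  // Hypothetical shared helper: looks up `key` in the map and, on first
  // sight, registers `value` under a fresh internal name "_w<N>".
  // Returns a reference to the internal column.
  def extract(
      extracted: mutable.LinkedHashMap[Expr, Alias],
      key: Expr,
      value: Expr): Column =
    Column(extracted.getOrElseUpdate(key, Alias(value, s"_w${extracted.size}")).name)

  // Both branches go through the same helper, so the naming and
  // caching logic lives in one place.
  def rewrite(extracted: mutable.LinkedHashMap[Expr, Alias], e: Expr): Expr = e match {
    // Keep the argument name, but replace the value with a column reference.
    case n @ NamedArg(key, value) => NamedArg(key, extract(extracted, n, value))
    case other => extract(extracted, other, other)
  }
}
```

   With something along these lines, the new `NamedArgumentExpression` case and the existing fallback case would differ only in which expression gets aliased and whether the result is wrapped again.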
   
   [1] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/FunctionBuilderBase.scala#L74-L179



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

