Posted to commits@spark.apache.org by we...@apache.org on 2023/02/01 09:41:23 UTC

[spark] branch branch-3.3 updated: [SPARK-42259][SQL] ResolveGroupingAnalytics should take care of Python UDAF

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 80e8df11d7e [SPARK-42259][SQL] ResolveGroupingAnalytics should take care of Python UDAF
80e8df11d7e is described below

commit 80e8df11d7e2c135ef707c1c1626b976a8dc09a0
Author: Wenchen Fan <we...@databricks.com>
AuthorDate: Wed Feb 1 17:36:14 2023 +0800

    [SPARK-42259][SQL] ResolveGroupingAnalytics should take care of Python UDAF
    
    This is a long-standing correctness issue with Python UDAF and grouping analytics. The rule `ResolveGroupingAnalytics` should take care of Python UDAF when matching aggregate expressions.
    
    Why are the changes needed: bug fix.
    
    User-facing change: yes, the query result was wrong before.
    
    How was this patch tested: existing tests.
    
    Closes #39824 from cloud-fan/python.
    
    Authored-by: Wenchen Fan <we...@databricks.com>
    Signed-off-by: Wenchen Fan <we...@databricks.com>
    (cherry picked from commit 1219c8492376e038894111cd5d922229260482e7)
    Signed-off-by: Wenchen Fan <we...@databricks.com>
---
 .../main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala    | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 84aa06baaff..881f2cc2078 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -617,7 +617,7 @@ class Analyzer(override val catalogManager: CatalogManager)
         // AggregateExpression should be computed on the unmodified value of its argument
         // expressions, so we should not replace any references to grouping expression
         // inside it.
-        case e: AggregateExpression =>
+        case e if AggregateExpression.isAggregate(e) =>
           aggsBuffer += e
           e
         case e if isPartOfAggregation(e) => e
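
To illustrate why the one-line change matters, here is a minimal, self-contained sketch. It does not use Spark's real classes: `Expr`, `Attr`, `PythonUDF`, the `groupedAgg` flag, and `ToyAggregate` are hypothetical stand-ins, and `isAggregate` is only assumed to accept both built-in aggregates and Python UDAFs, which is what the commit message suggests. The point is that a Python UDAF is not an `AggregateExpression` by type, so a type-based match skips it while a predicate-based match does not.

```scala
// Toy model only: these are NOT Catalyst classes, just stand-ins with similar names.
sealed trait Expr { def children: Seq[Expr] }

case class Attr(name: String) extends Expr {
  val children: Seq[Expr] = Nil
}

case class AggregateExpression(func: String, arg: Expr) extends Expr {
  val children: Seq[Expr] = Seq(arg)
}

// Hypothetical stand-in for a Python UDF; `groupedAgg` marks it as a Python UDAF.
case class PythonUDF(name: String, args: Seq[Expr], groupedAgg: Boolean) extends Expr {
  val children: Seq[Expr] = args
}

object ToyAggregate {
  // Assumed behaviour of the helper: built-in aggregates plus Python UDAFs.
  def isAggregate(e: Expr): Boolean = e match {
    case _: AggregateExpression => true
    case p: PythonUDF           => p.groupedAgg
    case _                      => false
  }

  // Before the patch: only expressions of type AggregateExpression are collected.
  def collectByType(e: Expr): Seq[Expr] = e match {
    case a: AggregateExpression => Seq(a)
    case other                  => other.children.flatMap(collectByType)
  }

  // After the patch: anything accepted by isAggregate is collected and not descended into.
  def collectByPredicate(e: Expr): Seq[Expr] = e match {
    case a if isAggregate(a) => Seq(a)
    case other               => other.children.flatMap(collectByPredicate)
  }
}
```

In this sketch, a `PythonUDF` with `groupedAgg = true` is missed by `collectByType` and found by `collectByPredicate`. In the real rule the analogous miss would let the grouping-expression rewrite reach into the UDAF's arguments, which is consistent with the wrong results described in the commit message.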

