
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/08/05 14:51:26 UTC

[GitHub] [spark] skonto commented on issue #25362: [SPARK-27921][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-analytics.sql'

skonto commented on issue #25362: [SPARK-27921][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-analytics.sql'
URL: https://github.com/apache/spark/pull/25362#issuecomment-518266785
 
 
   @HyukjinKwon FYI, one thing I noticed:
   
   ```
   -SELECT a, b, SUM(b) FROM testData GROUP BY a, b WITH ROLLUP
   +SELECT a, b, udf(SUM(b)) FROM testData GROUP BY udf(a), b WITH ROLLUP
    -- !query 4 schema
   -struct<a:int,b:int,sum(b):bigint>
   +struct<>
    -- !query 4 output
   -1      1       1
   -1      2       2
   -1      NULL    3
   -2      1       1
   -2      2       2
   -2      NULL    3
   -3      1       1
   -3      2       2
   -3      NULL    3
   -NULL   NULL    9
   +org.apache.spark.sql.AnalysisException
   +expression 'testdata.`a`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;
    ```
   Shouldn't this be allowed?
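
One possible reading of the error: once the grouping key is `udf(a)` rather than `a`, the bare column `a` in the SELECT list no longer matches any grouping expression, so the analyzer rejects it. Below is a minimal Python sketch of that rule. It is a toy model under stated assumptions, not Spark's actual analyzer: expressions are modeled as strings (columns) or `(function, argument)` tuples, and the names `AGGREGATES`, `is_valid`, and `ungrouped` are hypothetical.

```python
# Toy model of the "neither present in the group by, nor an aggregate" check.
# Assumption: a column is a str, a function call is a (name, argument) tuple.
AGGREGATES = {"sum", "count", "min", "max", "avg", "first"}


def is_valid(expr, group_by):
    """Return True if expr may appear in the SELECT list of a grouped query."""
    # An expression identical to a grouping expression is always allowed.
    if expr in group_by:
        return True
    # A bare column that is not itself a grouping expression is rejected.
    if isinstance(expr, str):
        return False
    name, arg = expr
    # Anything wrapped in an aggregate call is allowed.
    if name in AGGREGATES:
        return True
    # A non-aggregate function is allowed only if its argument is allowed.
    return is_valid(arg, group_by)


def ungrouped(select, group_by):
    """List the SELECT expressions that would trigger the error."""
    return [e for e in select if not is_valid(e, group_by)]


# GROUP BY udf(a), b
group_by = [("udf", "a"), "b"]

# SELECT a, b, udf(SUM(b)): 'a' is not a grouping expression.
print(ungrouped(["a", "b", ("udf", ("sum", "b"))], group_by))           # ['a']

# SELECT udf(a), b, udf(SUM(b)): every expression is grouped or aggregated.
print(ungrouped([("udf", "a"), "b", ("udf", ("sum", "b"))], group_by))  # []
```

Under this model, selecting `udf(a)` instead of `a` (or grouping by `a` itself) would pass the check; whether Spark should additionally accept the query as written is the open question.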

