Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/07 17:57:54 UTC

[GitHub] [spark] arkguil commented on issue #22696: [SPARK-25708][SQL] HAVING without GROUP BY means global aggregate

URL: https://github.com/apache/spark/pull/22696#issuecomment-490184967
 
 
   @cloud-fan / @gatorsmile, I just stumbled on this while investigating a query issue during a migration to 2.4...
   
   It seems the fix oversimplified the original intent. It should be perfectly fine to do something like
   
   `select id from range(10) having id > 5`
   
   HAVING is applied to the result of `select id from range(10)`, and since `id` is in the result set, this should not fail with the error ``grouping expressions sequence is empty, and '`id`' is not an aggregate function``.
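   
   For reference, a minimal reproduction sketch (assuming a Spark 2.4 `spark-shell`, where `spark` is the pre-bound `SparkSession`; the exact exception type is my assumption):
   
   ```scala
   // Minimal reproduction sketch: on Spark 2.4 this is expected to fail with
   // the "grouping expressions sequence is empty" analysis error quoted above;
   // under the old interpretation described below it returned the rows 6..9.
   spark.sql("select id from range(10) having id > 5").show()
   ```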
   
   The previous SQL should be interpreted as 
   
   `select id from range(10) group by id having id > 5`
   
   That is what the previous plan was doing. This is easier to see with a window function:
   
   `select id, max(id) over () as max_id from range(10) where id > 5 having max_id = id`
   
   The window is computed first, and then the filter is applied to the result. You can't filter on `max_id` in a WHERE clause, since it only exists after `select id, max(id) over () as max_id from range(10) where id > 5` has been evaluated.
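   
   As an illustration of that ordering (a workaround sketch of my own, not something from this PR; it assumes a `SparkSession` bound to `spark`), the HAVING condition can be pushed into an outer WHERE over a subquery, so the filter only runs after the window is computed:
   
   ```scala
   // Workaround sketch: compute the window in an inner query, then filter
   // its result in the outer query. This mirrors the HAVING-as-post-filter
   // behavior described above.
   spark.sql("""
     select id, max_id
     from (
       select id, max(id) over () as max_id
       from range(10)
       where id > 5
     ) t
     where max_id = id
   """).show()
   // Over ids 6..9 the window max is 9, so only the row (9, 9) survives.
   ```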
   
   Can you explain what this change fixes exactly?
   
   
