Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/06/01 00:56:55 UTC

[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19193
  
    For 1), yes, let's forbid it.
    
    For 2), my feeling is that the `Dataset` API doesn't need `having`, because it's easy to change the order of operators: users can call `filter` first and then `agg`. In SQL you would need a subquery to get the same effect, so `having` is convenient there.
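    
    To make the ordering point concrete, here is a rough sketch, not taken from the PR, that places a filter on an aggregate as an ordinary operator after `agg` and compares it with the SQL `HAVING` form. The object name `HavingSketch`, the toy data, the view name `t`, and the column alias `total` are all assumptions made up for illustration.
    
    ```scala
    import org.apache.spark.sql.{Row, SparkSession}
    import org.apache.spark.sql.functions.sum
    
    // Hypothetical helper object; nothing here comes from the PR itself.
    object HavingSketch {
      // Returns the rows produced by the Dataset API form and by the SQL HAVING form.
      def compare(spark: SparkSession): (Array[Row], Array[Row]) = {
        import spark.implicits._
    
        // Toy data (assumed): group "x" sums to 5, group "y" sums to 4.
        val df = Seq(("x", 2), ("x", 3), ("y", 4)).toDF("a", "b")
    
        // Dataset API: the filter on the aggregate is just another operator after `agg`.
        val viaApi = df.groupBy("a").agg(sum("b").as("total")).where($"total" === 5)
    
        // SQL: the same condition is written with HAVING instead of a subquery.
        df.createOrReplaceTempView("t")
        val viaSql = spark.sql(
          "SELECT a, sum(b) AS total FROM t GROUP BY a HAVING sum(b) = 5")
    
        (viaApi.collect(), viaSql.collect())
      }
    
      def main(args: Array[String]): Unit = {
        // A local session is assumed here purely so the sketch can run standalone.
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("having-sketch")
          .getOrCreate()
        val (api, sql) = compare(spark)
        api.foreach(println)
        sql.foreach(println)
        spark.stop()
      }
    }
    ```
    
    With this toy data both forms should keep only the group `a = "x"`, whose `b` values sum to 5.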
    
    For `df.groupBy('a).agg(max('b), rank().over(window)).where(sum('b) === 5)`, I think it's valid for it to fail, as Spark is not smart enough to rewrite the query and make it work. If we can find a way to rewrite and fix the query, we can support it.

