Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/06/01 00:56:55 UTC
[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19193
For 1), yes let's forbid it.
For 2), my feeling is that the `Dataset` API doesn't need `having`, because it's easy to change the order of the operators: users can call `filter` first and then `agg`. In SQL you would need a subquery, so `having` is convenient there.
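To make the point about operator order concrete, here is a minimal sketch, not code from this PR. It assumes a local `SparkSession`, a made-up table `t` with integer columns `a` and `b`, and hypothetical helper names (`HavingSketch`, `filterBeforeAgg`, `filterAfterAgg`). It only illustrates that in the `Dataset` API the position of the filter relative to `agg` is chosen by how the calls are chained, while SQL uses `HAVING` to filter on an aggregate without a subquery.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, sum}

// Hypothetical helper object, for illustration only.
object HavingSketch {
  // Filter rows first, then aggregate (the position of WHERE in SQL).
  def filterBeforeAgg(df: DataFrame): DataFrame =
    df.where(col("b") > 2).groupBy(col("a")).agg(sum(col("b")).as("total"))

  // Aggregate first, then filter on the aggregate (the position of HAVING in SQL).
  def filterAfterAgg(df: DataFrame): DataFrame =
    df.groupBy(col("a")).agg(sum(col("b")).as("total")).where(col("total") === 5)

  def main(args: Array[String]): Unit = {
    // Local session and sample data are assumptions made for this sketch.
    val spark = SparkSession.builder().master("local[1]").appName("having-sketch").getOrCreate()
    import spark.implicits._
    val df = Seq((1, 2), (1, 3), (2, 4)).toDF("a", "b")
    df.createOrReplaceTempView("t")

    // SQL form: HAVING filters on the aggregate without a subquery.
    spark.sql("SELECT a, sum(b) AS total FROM t GROUP BY a HAVING sum(b) = 5").show()

    // Dataset API forms: the filter position follows the call order.
    filterAfterAgg(df).show()
    filterBeforeAgg(df).show()

    spark.stop()
  }
}
```

Under these assumptions, `filterAfterAgg` and the SQL query should select the same groups, which is why a separate `having` operator adds little to the `Dataset` API.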
For `df.groupBy('a).agg(max('b), rank().over(window)).where(sum('b) === 5)`, I think it's reasonable for it to fail, since Spark is not smart enough to rewrite the query and make it work. If we can find a way to rewrite and fix the query, we can support it.