Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2018/08/25 14:38:17 UTC
[GitHub] spark pull request #21330: [SPARK-22234] Support distinct window functions
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21330#discussion_r212800158
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1883,7 +1883,19 @@ class Analyzer(
// Second, we group extractedWindowExprBuffer based on their Partition and Order Specs.
val groupedWindowExpressions = extractedWindowExprBuffer.groupBy { expr =>
val distinctWindowSpec = expr.collect {
- case window: WindowExpression => window.windowSpec
+ case window: WindowExpression =>
+ val winExpr = window.windowFunction
+ val distinctOpt = winExpr.find (expr => expr.isInstanceOf[AggregateExpression]
+ && expr.asInstanceOf[AggregateExpression].isDistinct)
+ if (distinctOpt.nonEmpty && window.windowSpec.orderSpec.nonEmpty) {
+ failAnalysis(s"ORDER BY cannot be used with DISTINCT: $window")
--- End diff --
Just out of curiosity, does Hive have the same limitation? If so, the current approach (roughly, ordering the rows and checking the previous row to detect distinct values within the window) makes sense to me.
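
A minimal sketch of the sort-then-compare-with-previous-row idea, for illustration only. It assumes a single window partition held in memory as a plain Scala collection; `DistinctBySort` and `countDistinctSorted` are hypothetical names and are not part of Spark or of this pull request.

```scala
// Hypothetical sketch, not Spark's implementation: count the distinct values
// in one window partition by sorting it and comparing each row with the
// previous row.
object DistinctBySort {
  def countDistinctSorted[A](partition: Seq[A])(implicit ord: Ordering[A]): Int = {
    val sorted = partition.sorted
    if (sorted.isEmpty) {
      0
    } else {
      // The first row always counts; each later row counts only when it
      // differs from the row immediately before it.
      1 + sorted.zip(sorted.tail).count { case (prev, cur) => !ord.equiv(prev, cur) }
    }
  }
}
```

Because the rows are already sorted by the aggregated value, equal values are adjacent, so one comparison per row is enough and no hash set is needed.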