Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2018/08/25 14:38:17 UTC

[GitHub] spark pull request #21330: [SPARK-22234] Support distinct window functions

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21330#discussion_r212800158
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1883,7 +1883,19 @@ class Analyzer(
           // Second, we group extractedWindowExprBuffer based on their Partition and Order Specs.
           val groupedWindowExpressions = extractedWindowExprBuffer.groupBy { expr =>
             val distinctWindowSpec = expr.collect {
    -          case window: WindowExpression => window.windowSpec
    +          case window: WindowExpression =>
    +            val winExpr = window.windowFunction
    +            val distinctOpt = winExpr.find (expr => expr.isInstanceOf[AggregateExpression]
    +                && expr.asInstanceOf[AggregateExpression].isDistinct)
    +            if (distinctOpt.nonEmpty && window.windowSpec.orderSpec.nonEmpty) {
    +              failAnalysis(s"ORDER BY cannot be used with DISTINCT: $window")
    --- End diff --
    
    Just out of curiosity, does Hive have the same limitation? If so, the current approach (roughly, ordering the rows and checking the previous row to handle distinct windows) makes sense to me.
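
The "order the rows, then check the previous row" idea mentioned above can be sketched outside Spark as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `DistinctBySort` and `countDistinctSorted` are hypothetical names, and a plain `Seq` stands in for the rows of one window partition.

```scala
// Hypothetical helper, not Spark code: counts the distinct values in one
// window partition by sorting it and comparing each row with its predecessor.
object DistinctBySort {
  def countDistinctSorted[T](partition: Seq[T])(implicit ord: Ordering[T]): Int = {
    // Sort by the value itself so that equal values become adjacent.
    val sorted = partition.sorted
    if (sorted.isEmpty) 0
    else {
      // The first row always contributes a new value; every later row
      // contributes one only if it differs from the row just before it.
      1 + sorted.zip(sorted.tail).count { case (prev, cur) => !ord.equiv(prev, cur) }
    }
  }
}
```

For example, `DistinctBySort.countDistinctSorted(Seq(3, 1, 3, 2, 1))` sorts the input to `1, 1, 2, 3, 3` and returns 3. Because this approach sorts the partition by the value being deduplicated, it plausibly cannot also honor a separate user-supplied ORDER BY, which would be consistent with the check in the diff.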

