
Posted to reviews@spark.apache.org by maropu <gi...@git.apache.org> on 2018/01/07 03:58:36 UTC

[GitHub] spark issue #20174: [SPARK-22951][SQL] aggregate should not produce empty ro...

Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/20174
  
    (This is unrelated to this PR and fairly trivial, but I'll leave a note anyway.) `PropagateEmptyRelation` does not collapse `spark.emptyDataFrame.dropDuplicates`, because `spark.emptyDataFrame` is backed by an `ExistingRDD` rather than an empty `LocalRelation`:
    
    ```
    scala> spark.emptyDataFrame.dropDuplicates.explain(true)
    == Parsed Logical Plan ==
    Deduplicate
    +- AnalysisBarrier LogicalRDD false
    
    == Analyzed Logical Plan ==
    Deduplicate
    +- LogicalRDD false
    
    == Optimized Logical Plan ==
    Aggregate
    +- LogicalRDD false
    
    == Physical Plan ==
    *HashAggregate(keys=[], functions=[], output=[])
    +- Exchange SinglePartition
       +- *HashAggregate(keys=[], functions=[], output=[])
          +- Scan ExistingRDD[]
    
    scala> Seq.empty[Tuple2[Int, Int]].toDF("a", "b").dropDuplicates.explain(true)
    == Parsed Logical Plan ==
    Deduplicate [a#8, b#9]
    +- AnalysisBarrier Project [_1#5 AS a#8, _2#6 AS b#9]
    
    == Analyzed Logical Plan ==
    a: int, b: int
    Deduplicate [a#8, b#9]
    +- Project [_1#5 AS a#8, _2#6 AS b#9]
       +- LocalRelation <empty>, [_1#5, _2#6]
    
    == Optimized Logical Plan ==
    LocalRelation <empty>, [a#8, b#9]
    
    == Physical Plan ==
    LocalTableScan <empty>, [a#8, b#9]
    ```
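
The difference between the two plans above can be sketched with a toy model. This is not Spark's code: the node classes and `ToyPropagateEmpty` below are hypothetical stand-ins that only mimic the idea that an empty-relation rule can fold a subtree when it sees a `LocalRelation` known to be empty, while an RDD-backed leaf is opaque to the optimizer. Skipping aggregates without grouping keys reflects my understanding of the real rule and should be read as an assumption.

```scala
// Toy logical plan nodes (illustrative stand-ins, not Spark's classes).
sealed trait Plan
// Rows are visible to the optimizer, so emptiness can be checked at plan time.
case class LocalRelation(rows: Seq[Seq[Any]]) extends Plan
// Opaque leaf: contents are unknown until the RDD is actually executed.
case object LogicalRDD extends Plan
case class Project(child: Plan) extends Plan
case class Aggregate(groupingKeys: Seq[String], child: Plan) extends Plan

object ToyPropagateEmpty {
  // True only for a LocalRelation that is known to hold no rows.
  private def isEmptyLocal(p: Plan): Boolean = p match {
    case LocalRelation(rows) => rows.isEmpty
    case _                   => false
  }

  // Bottom-up rewrite: fold operators sitting on top of an empty LocalRelation.
  def optimize(plan: Plan): Plan = plan match {
    case Project(child) =>
      val c = optimize(child)
      if (isEmptyLocal(c)) LocalRelation(Nil) else Project(c)
    case Aggregate(keys, child) =>
      val c = optimize(child)
      // Assumption: an aggregate without grouping keys is left alone,
      // because a global aggregate can still emit a row on empty input.
      if (keys.nonEmpty && isEmptyLocal(c)) LocalRelation(Nil) else Aggregate(keys, c)
    case leaf => leaf // LocalRelation and LogicalRDD are returned unchanged
  }
}
```

In this model, `ToyPropagateEmpty.optimize(Aggregate(Seq("a", "b"), Project(LocalRelation(Nil))))` returns `LocalRelation(Nil)`, which mirrors the second plan, whereas `ToyPropagateEmpty.optimize(Aggregate(Nil, LogicalRDD))` returns its input unchanged, which mirrors the first.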

