Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2018/09/18 22:24:42 UTC

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16677#discussion_r218614872
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
    @@ -44,18 +45,23 @@ private[spark] sealed trait MapStatus {
        * necessary for correctness, since block fetchers are allowed to skip zero-size blocks.
        */
       def getSizeForBlock(reduceId: Int): Long
    +
    +  /**
    +   * The number of outputs for the map task.
    +   */
    +  def numberOfOutput: Long
    --- End diff --
    
    what does this mean? output blocks? output files?
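    
    For illustration only, a minimal sketch (not from the PR) of how the doc
    comment could be disambiguated, assuming the value is meant to count the
    records written by the map task rather than its blocks or files; the trait
    name and wording below are hypothetical:
    
        package org.apache.spark.scheduler
        
        // Hypothetical sketch; not the actual MapStatus trait changed in the PR.
        private[spark] sealed trait MapStatusSketch {
        
          /** Estimated size of the block for the given reduce partition, in bytes. */
          def getSizeForBlock(reduceId: Int): Long
        
          /**
           * Total number of records this map task wrote across all of its output
           * partitions (not the number of shuffle blocks or output files).
           */
          def numberOfOutput: Long
        }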


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org