Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2018/09/18 22:50:18 UTC

[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/22456

    [SPARK-19355][SQL] Fix variable names numberOfOutput

    ## What changes were proposed in this pull request?
    SPARK-19355 introduced a variable/method called numberOfOutput, which is a really bad name because it is unclear whether it counts blocks or rows. This patch renamed it to numRecords, and also changed a couple of other places to make them consistent.
    
    ## How was this patch tested?
    Should be covered by existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-19355

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22456
    
----
commit 793fc19d2519f47f5f3278b79e827f1159d9e440
Author: Reynold Xin <rx...@...>
Date:   2018-09-18T22:47:57Z

    [SPARK-19355][SQL] Fix variable names.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22456#discussion_r218666270
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
    @@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
     
     /**
      * Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
    - * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
    + * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
    --- End diff --
    
    The size was about bytes, so it doesn't really matter whether it's a record, a row, or a block. It's also already pointed out below that it's about bytes.
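
    To illustrate the distinction discussed in this thread, here is a minimal sketch in which per-reducer sizes are measured in bytes while the renamed counter tracks records. `SimpleMapStatus`, `sizesInBytes`, and the demo object are hypothetical names made up for this example; this is not Spark's actual `MapStatus` class, only a simplified stand-in that borrows the `numRecords` name proposed in the patch.

```scala
// Hypothetical, simplified stand-in for a map task result.
// This is NOT Spark's MapStatus; it only illustrates the naming question.
final case class SimpleMapStatus(
    sizesInBytes: Array[Long], // output size for each reducer, in bytes
    numRecords: Long           // total number of records written by the map task
) {
  // Size, in bytes, of the output destined for the given reducer.
  def getSizeForBlock(reduceId: Int): Long = sizesInBytes(reduceId)
}

object SimpleMapStatusDemo {
  def main(args: Array[String]): Unit = {
    val status = SimpleMapStatus(Array(1024L, 2048L), numRecords = 30L)
    println(s"reducer 1 receives ${status.getSizeForBlock(1)} bytes")
    println(s"map task wrote ${status.numRecords} records")
  }
}
```

    With the unit carried in each name, a reader does not have to guess whether a value counts blocks, rows, or bytes.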


---



[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22456#discussion_r218685917
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
    @@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
     
     /**
      * Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
    - * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
    + * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
    --- End diff --
    
    As we are going to revert the sequence of PRs, do we still need this?


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    **[Test build #96206 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96206/testReport)** for PR 22456 at commit [`793fc19`](https://github.com/apache/spark/commit/793fc19d2519f47f5f3278b79e827f1159d9e440).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96206/
    Test PASSed.


---



[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22456#discussion_r218651070
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
    @@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
     
     /**
      * Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
    - * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
    + * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
    --- End diff --
    
    Shall we also change `the sizes of outputs for each reducer` to `the sizes of output records for each reducer`?


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    **[Test build #96206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96206/testReport)** for PR 22456 at commit [`793fc19`](https://github.com/apache/spark/commit/793fc19d2519f47f5f3278b79e827f1159d9e440).


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    cc @hvanhovell @cloud-fan 
    
    also @viirya please don't use such cryptic variable names ... we also need to fix the documentation for the config flag - it's arcane.


---



[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...

Posted by rxin <gi...@git.apache.org>.
Github user rxin closed the pull request at:

    https://github.com/apache/spark/pull/22456


---



[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22456
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3205/
    Test PASSed.


---
