Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2018/09/18 22:50:18 UTC
[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/22456
[SPARK-19355][SQL] Fix variable names numberOfOutput
## What changes were proposed in this pull request?
SPARK-19355 introduced a variable / method called numberOfOutput, which is a really bad name because it is unclear whether it counts blocks or rows. This patch renames it to numRecords and also changes a couple of other places to make them consistent.
## How was this patch tested?
Should be covered by existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark SPARK-19355
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22456.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22456
----
commit 793fc19d2519f47f5f3278b79e827f1159d9e440
Author: Reynold Xin <rx...@...>
Date: 2018-09-18T22:47:57Z
[SPARK-19355][SQL] Fix variable names.
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22456#discussion_r218666270
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
/**
* Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
- * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
+ * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
--- End diff --
Size was about bytes, so it doesn't really matter whether it's a record, a row, or a block. It's also already pointed out below that it's about bytes.
---
[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22456#discussion_r218685917
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
/**
* Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
- * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
+ * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
--- End diff --
As we are going to revert the sequence of PRs, do we still need this?
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22456
**[Test build #96206 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96206/testReport)** for PR 22456 at commit [`793fc19`](https://github.com/apache/spark/commit/793fc19d2519f47f5f3278b79e827f1159d9e440).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22456
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96206/
---
[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22456#discussion_r218651070
--- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala ---
@@ -31,7 +31,7 @@ import org.apache.spark.util.Utils
/**
* Result returned by a ShuffleMapTask to a scheduler. Includes the block manager address that the
- * task ran on, the sizes of outputs for each reducer, and the number of outputs of the map task,
+ * task ran on, the sizes of outputs for each reducer, and the number of records of the map task,
--- End diff --
Shall we also change `the sizes of outputs for each reducer` to `the sizes of output records for each reducer`?
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22456
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22456
**[Test build #96206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96206/testReport)** for PR 22456 at commit [`793fc19`](https://github.com/apache/spark/commit/793fc19d2519f47f5f3278b79e827f1159d9e440).
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22456
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/22456
cc @hvanhovell @cloud-fan
also @viirya please don't use such cryptic variable names ... we also need to fix the documentation for the config flag - it's arcane.
---
[GitHub] spark pull request #22456: [SPARK-19355][SQL] Fix variable names numberOfOut...
Posted by rxin <gi...@git.apache.org>.
Github user rxin closed the pull request at:
https://github.com/apache/spark/pull/22456
---
[GitHub] spark issue #22456: [SPARK-19355][SQL] Fix variable names numberOfOutput -> ...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22456
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3205/
---