Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2015/10/20 10:19:55 UTC

[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/9175

    Minor cleanup of ShuffleMapStage.outputLocs code.

    I was looking at this code and found the documentation to be insufficient. I added more documentation and slightly refactored some relevant code paths to improve encapsulation. There is more that I want to do, but I want to get these changes in before doing more work.
    
    My goal is to avoid exposing internal fields of ShuffleMapStage directly, to improve encapsulation. After this change, DAGScheduler no longer writes outputLocs directly. There are still 3 places that read outputLocs directly, but we can change those later.
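
To illustrate the encapsulation goal described above, here is a minimal sketch, not the actual Spark class. `StageSketch` is a hypothetical, simplified stand-in for ShuffleMapStage, and `String` stands in for MapStatus. The only names taken from the diff are `outputLocs`, `numAvailableOutputs`, `isAvailable` and `addOutputLoc`. Making the fields private is an assumption of this sketch.

```scala
// Hypothetical, simplified stand-in for ShuffleMapStage (assumption, not Spark code).
// `String` replaces MapStatus to keep the sketch self-contained.
final class StageSketch(val numPartitions: Int) {
  // Internal state: one list of output locations per map partition.
  private val outputLocs = Array.fill[List[String]](numPartitions)(Nil)
  private var _numAvailableOutputs = 0

  /** Number of partitions that currently have at least one output. */
  def numAvailableOutputs: Int = _numAvailableOutputs

  /** True once every partition has at least one registered output. */
  def isAvailable: Boolean = _numAvailableOutputs == numPartitions

  /** Single write path: updates the list and the counter together. */
  def addOutputLoc(partition: Int, status: String): Unit = {
    val prev = outputLocs(partition)
    outputLocs(partition) = status :: prev
    // Only the first output for a partition changes the availability count.
    if (prev.isEmpty) _numAvailableOutputs += 1
  }
}
```

Because callers can only go through `addOutputLoc`, the counter cannot drift from the contents of the array, which is the kind of invariant a scheduler writing the array directly would have to maintain by hand.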


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark stage-cleanup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9175
    
----
commit f58dca57a2bb29a7f0439a19bb45bf32f2de9855
Author: Reynold Xin <rx...@databricks.com>
Date:   2015-10-20T08:17:34Z

    Minor cleanup of ShuffleMapStage.outputLocs code.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149475850
  
    Merged build started.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42569321
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    --- End diff --
    
    So this is trying to avoid dereferencing the pointer?


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149474657
  
    Merged build started.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42568601
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    +          // locs(i) will be null if missing
    +          stage.addOutputLoc(i, locs(i))
    --- End diff --
    
    Why not do the locs.foreach { ... here?


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149475784
  
     Merged build triggered.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149642480
  
    **[Test build #1931 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1931/consoleFull)** for PR 9175 at commit [`cf18c92`](https://github.com/apache/spark/commit/cf18c926bd068e32d2da8ad849422b2f57b6776a).


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42635696
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    --- End diff --
    
    unused


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42573991
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    +          // locs(i) will be null if missing
    +          stage.addOutputLoc(i, locs(i))
    --- End diff --
    
    Probably not - but it's trivial to do too ...


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149673775
  
    **[Test build #1933 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1933/consoleFull)** for PR 9175 at commit [`cf18c92`](https://github.com/apache/spark/commit/cf18c926bd068e32d2da8ad849422b2f57b6776a).


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150013838
  
     Merged build triggered.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149477122
  
    cc @andrewor14 and @mateiz 


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42635685
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    +          // locs(i) will be null if missing
    +          stage.addOutputLoc(i, locs(i))
    --- End diff --
    
    +1 for readability.  A balance is to do foreach on a range `(0 until locs.length).foreach`, which is almost as readable as foreach on the collection and almost as fast as the while loop.  https://github.com/scala/scala/commit/4cfc633fc6
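
A minimal sketch of the three loop styles discussed in this thread, for comparison. The `locs` array of `String` is a hypothetical stand-in for the deserialized map statuses, and each method returns the (index, status) pairs it would hand to `addOutputLoc`. No performance claim is made here beyond what the comment above says.

```scala
object LoopStyles {
  // Stand-in data (assumption): null marks a missing map output.
  val locs: Array[String] = Array("hostA", null, "hostC")

  // Style 1: explicit while loop, as in the diff.
  def withWhile(): List[(Int, String)] = {
    val out = List.newBuilder[(Int, String)]
    var i = 0
    while (i < locs.length) {
      if (locs(i) ne null) out += ((i, locs(i)))
      i += 1
    }
    out.result()
  }

  // Style 2: foreach over an index range, the middle ground suggested above.
  def withRangeForeach(): List[(Int, String)] = {
    val out = List.newBuilder[(Int, String)]
    (0 until locs.length).foreach { i =>
      if (locs(i) ne null) out += ((i, locs(i)))
    }
    out.result()
  }

  // Style 3: foreach over the collection itself; zipWithIndex builds an
  // intermediate array of pairs to recover the partition index.
  def withCollectionForeach(): List[(Int, String)] = {
    val out = List.newBuilder[(Int, String)]
    locs.zipWithIndex.foreach { case (loc, i) =>
      if (loc ne null) out += ((i, loc))
    }
    out.result()
  }
}
```

All three variants select the same entries; they differ only in how the index is produced.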


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42467248
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -1202,7 +1195,7 @@ class DAGScheduler(
     
                   clearCacheLocs()
     
    -              if (shuffleStage.outputLocs.contains(Nil)) {
    +              if (!shuffleStage.isAvailable) {
    --- End diff --
    
    Similar to before: I believe the two `if` conditions are equivalent.



[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150043228
  
    LGTM / merged


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149520000
  
    **[Test build #43970 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43970/consoleFull)** for PR 9175 at commit [`cf18c92`](https://github.com/apache/spark/commit/cf18c926bd068e32d2da8ad849422b2f57b6776a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42568232
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    --- End diff --
    
    This does a reference check, unlike `!=`, which invokes the `equals` method?
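
A small sketch of the distinction between `ne` and `!=` in Scala. `Loc` is a hypothetical stand-in for MapStatus, not a Spark type. It is a case class so that it has a structural `equals`, which makes the difference observable on two separate instances.

```scala
object RefCheck {
  // Hypothetical stand-in for MapStatus; case classes get a structural equals.
  final case class Loc(host: String)

  val a = Loc("hostA")
  val b = Loc("hostA")     // a distinct instance that is structurally equal to `a`
  val missing: Loc = null  // how a missing entry appears in `locs`

  // `!=` is defined in terms of equals, so equal values compare as "not different".
  val valueDiffers: Boolean = a != b
  // `ne` compares references only, so two separate instances always differ.
  val refDiffers: Boolean = a ne b

  // `ne null` is a plain reference test and is safe on a null reference.
  val presentIsSet: Boolean = a ne null
  val missingIsSet: Boolean = missing ne null
}
```

Here `valueDiffers` is false while `refDiffers` is true. For the null test in the diff, `presentIsSet` is true and `missingIsSet` is false.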


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42467128
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -894,7 +899,7 @@ class DAGScheduler(
         submitStage(finalStage)
     
         // If the whole stage has already finished, tell the listener and remove it
    -    if (!finalStage.outputLocs.contains(Nil)) {
    +    if (finalStage.isAvailable) {
    --- End diff --
    
    I believe `!finalStage.outputLocs.contains(Nil)` is just `finalStage.isAvailable`
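
A minimal sketch of why the two conditions agree, assuming the counter is kept consistent with the array. The function names are hypothetical and `String` stands in for MapStatus; `byCount` recomputes the count from the array instead of reading a stored `numAvailableOutputs`.

```scala
object AvailabilityCheck {
  // Old condition: no partition has an empty list of outputs.
  def byScan(outputLocs: Array[List[String]]): Boolean =
    !outputLocs.contains(Nil)

  // New condition, restated: the number of partitions with outputs
  // equals the total number of partitions.
  def byCount(outputLocs: Array[List[String]]): Boolean =
    outputLocs.count(_.nonEmpty) == outputLocs.length
}
```

The equivalence only holds as long as every write to the array also updates the counter, which is why routing writes through a single method matters.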


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150041854
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149478053
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43969/


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149523858
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43970/


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150014390
  
    **[Test build #44088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44088/consoleFull)** for PR 9175 at commit [`59319e1`](https://github.com/apache/spark/commit/59319e10be8d8e89ab3872162275bdd94a357799).


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42467091
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    +          // locs(i) will be null if missing
    +          stage.addOutputLoc(i, locs(i))
    --- End diff --
    
    `addOutputLoc` is where we handle this logic, so we should just call it. I also changed the loop to a while loop.



[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42574149
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    --- End diff --
    
    We should ask Martin haha


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42574077
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    --- End diff --
    
    Ok I used the "ask-Shivaram" algorithm to determine whether this was standard.  But fine to leave it.


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42568948
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapStage.scala ---
    @@ -48,12 +48,33 @@ private[spark] class ShuffleMapStage(
       /** Running map-stage jobs that were submitted to execute this stage independently (if any) */
       var mapStageJobs: List[ActiveJob] = Nil
     
    +  /**
    +   * Number of partitions that have shuffle outputs.
    +   * When this reaches [[numPartitions]], this map stage is ready.
    +   * This should be kept consistent as `outputLocs.filter(!_.isEmpty).size`.
    +   */
       var numAvailableOutputs: Int = 0
     
    +  /**
    +   * Returns true if the map stage is ready, i.e. all partitions have shuffle outputs.
    +   * This should be the same as `outputLocs.contains(Nil)`.
    +   */
       def isAvailable: Boolean = numAvailableOutputs == numPartitions
     
    +  /**
    +   * List of [[MapStatus]] for each partition. The index of the array is the map partition id,
    +   * and each value in the array is the list of possible [[MapStatus]] for a partition
    +   * (a single task might run multiple times).
    +   */
       val outputLocs = Array.fill[List[MapStatus]](numPartitions)(Nil)
     
    +  override def findMissingPartitions(): Seq[Int] = {
    +    val missing = (0 until numPartitions).filter(id => outputLocs(id).isEmpty)
    --- End diff --
    
    Why not just do numPartitions - numAvailableOutputs here?
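
As a hedged illustration of the distinction this question touches on: the diff declares `findMissingPartitions()` with return type `Seq[Int]`, so it yields partition ids, whereas `numPartitions - numAvailableOutputs` yields only how many partitions are missing. The helper names below are hypothetical and `String` stands in for MapStatus.

```scala
object MissingPartitions {
  // Ids of partitions with no registered output (needs a scan of the array).
  def missingIds(outputLocs: Array[List[String]]): Seq[Int] =
    outputLocs.indices.filter(id => outputLocs(id).isEmpty)

  // Number of missing partitions, derived from the counters alone.
  def missingCount(numPartitions: Int, numAvailableOutputs: Int): Int =
    numPartitions - numAvailableOutputs
}
```

With a consistent counter, `missingIds(...).size` equals `missingCount(...)`, so the subtraction can replace the scan wherever only the count is needed.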


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42467209
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -931,24 +936,12 @@ class DAGScheduler(
         stage.pendingPartitions.clear()
     
         // First figure out the indexes of partition ids to compute.
    -    val (allPartitions: Seq[Int], partitionsToCompute: Seq[Int]) = {
    -      stage match {
    -        case stage: ShuffleMapStage =>
    -          val allPartitions = 0 until stage.numPartitions
    -          val filteredPartitions = allPartitions.filter { id => stage.outputLocs(id).isEmpty }
    -          (allPartitions, filteredPartitions)
    -        case stage: ResultStage =>
    -          val job = stage.resultOfJob.get
    -          val allPartitions = 0 until job.numPartitions
    -          val filteredPartitions = allPartitions.filter { id => !job.finished(id) }
    -          (allPartitions, filteredPartitions)
    -      }
    -    }
    +    val partitionsToCompute: Seq[Int] = stage.findMissingPartitions()
     
         // Create internal accumulators if the stage has no accumulators initialized.
         // Reset internal accumulators only if this stage is not partially submitted
         // Otherwise, we may override existing accumulator values from some tasks
    -    if (stage.internalAccumulators.isEmpty || allPartitions == partitionsToCompute) {
    +    if (stage.internalAccumulators.isEmpty || stage.numPartitions == partitionsToCompute.size) {
    --- End diff --
    
    This tiny rewrite should also improve scheduler performance slightly, since the `if` check now only compares sizes instead of comparing the two sequences element by element.
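
The two forms of the check can be compared side by side. This is a minimal sketch, assuming a hypothetical `ResetCheck` object that is not part of Spark. The two checks agree only because `partitionsToCompute` is a filtered subset of `0 until numPartitions`, so equal size implies equal contents.

```scala
// Hypothetical helper for comparing the old and new conditions.
object ResetCheck {
  // Old check: element-wise comparison of the two sequences.
  def allMissingByEquality(allPartitions: Seq[Int], toCompute: Seq[Int]): Boolean =
    allPartitions == toCompute

  // New check: compare sizes only.
  def allMissingBySize(numPartitions: Int, toCompute: Seq[Int]): Boolean =
    numPartitions == toCompute.size
}
```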





[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149706844
  
    **[Test build #1933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1933/consoleFull)** for PR 9175 at commit [`cf18c92`](https://github.com/apache/spark/commit/cf18c926bd068e32d2da8ad849422b2f57b6776a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149478051
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42573604
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    +          // locs(i) will be null if missing
    +          stage.addOutputLoc(i, locs(i))
    --- End diff --
    
    Does it really need to be a while loop? Other than that it looks OK, but this isn't performance-critical code (it only happens once, when we create the stage).
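
For comparison, the two loop styles under discussion might look as follows. This is a sketch under assumptions: `LoopStyles` is a hypothetical object, `String` stands in for `MapStatus`, and the results are collected into a list instead of calling `stage.addOutputLoc`.

```scala
// Hypothetical helper; collects (index, status) pairs for non-null entries.
object LoopStyles {
  // Index-based while loop, as in the diff above.
  def withWhile(locs: Array[String]): List[(Int, String)] = {
    val out = List.newBuilder[(Int, String)]
    var i = 0
    while (i < locs.length) {
      if (locs(i) ne null) out += ((i, locs(i)))
      i += 1
    }
    out.result()
  }

  // Equivalent for loop with a guard that skips missing (null) entries.
  def withFor(locs: Array[String]): List[(Int, String)] = {
    val out = List.newBuilder[(Int, String)]
    for (i <- 0 until locs.length if locs(i) != null) {
      out += ((i, locs(i)))
    }
    out.result()
  }
}
```

Both functions visit the same indexes in the same order, so the choice between them here is one of style.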




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149477267
  
    **[Test build #43970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43970/consoleFull)** for PR 9175 at commit [`cf18c92`](https://github.com/apache/spark/commit/cf18c926bd068e32d2da8ad849422b2f57b6776a).




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9175




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149474616
  
    Merged build triggered.




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150013862
  
    Merged build started.




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42567985
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    --- End diff --
    
    Why not use the more common `!=` here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149523855
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by kayousterhout <gi...@git.apache.org>.
Github user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42574057
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    +          // locs(i) will be null if missing
    +          stage.addOutputLoc(i, locs(i))
    --- End diff --
    
    Readability!




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149670646
  
    cc @kayousterhout too




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150041856
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44088/




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-150041672
  
    **[Test build #44088 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44088/consoleFull)** for PR 9175 at commit [`59319e1`](https://github.com/apache/spark/commit/59319e10be8d8e89ab3872162275bdd94a357799).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42573937
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -353,10 +353,15 @@ class DAGScheduler(
         if (mapOutputTracker.containsShuffle(shuffleDep.shuffleId)) {
           val serLocs = mapOutputTracker.getSerializedMapOutputStatuses(shuffleDep.shuffleId)
           val locs = MapOutputTracker.deserializeMapStatuses(serLocs)
    -      for (i <- 0 until locs.length) {
    -        stage.outputLocs(i) = Option(locs(i)).toList // locs(i) will be null if missing
    +      var numAvailableOutputs = 0
    +      var i = 0
    +      while (i < locs.length) {
    +        if (locs(i) ne null) {
    --- End diff --
    
    I think using `ne` is also standard in Scala.
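
The two spellings of the null check can be placed next to each other. This is a minimal sketch, assuming a hypothetical `NullChecks` object with `String` standing in for `MapStatus`. `ne` is reference inequality on `AnyRef`, while `!=` is the negation of `==`. For the inputs used here the two give the same answer.

```scala
// Hypothetical helper; returns the indexes whose entries are present.
object NullChecks {
  // Reference inequality: true when the entry is not the null reference.
  def presentByNe(locs: Array[String]): Seq[Int] =
    locs.indices.filter(i => locs(i) ne null)

  // The more common spelling, as suggested in review.
  def presentByNotEquals(locs: Array[String]): Seq[Int] =
    locs.indices.filter(i => locs(i) != null)
}
```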





[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9175#discussion_r42467224
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -931,24 +936,12 @@ class DAGScheduler(
         stage.pendingPartitions.clear()
     
         // First figure out the indexes of partition ids to compute.
    -    val (allPartitions: Seq[Int], partitionsToCompute: Seq[Int]) = {
    --- End diff --
    
    All of this logic has moved into `Stage.findMissingPartitions`.
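
The shape of that refactoring, a pattern match in the caller replaced by a method each stage type overrides, can be sketched as below. The `Sketch*` classes are simplified, hypothetical stand-ins, not Spark's `Stage`, `ShuffleMapStage`, or `ResultStage`. In particular the result stage here tracks completion in a plain array instead of going through its job.

```scala
// Simplified, hypothetical stand-ins for the scheduler's stage classes.
abstract class SketchStage(val numPartitions: Int) {
  /** Ids of the partitions that still need to be computed. */
  def findMissingPartitions(): Seq[Int]
}

class SketchShuffleMapStage(n: Int, hasOutput: Int => Boolean) extends SketchStage(n) {
  // Missing means no shuffle output has been registered for the partition.
  override def findMissingPartitions(): Seq[Int] =
    (0 until numPartitions).filter(id => !hasOutput(id))
}

class SketchResultStage(n: Int, finished: Array[Boolean]) extends SketchStage(n) {
  // Missing means the partition's result has not been marked finished.
  override def findMissingPartitions(): Seq[Int] =
    (0 until numPartitions).filter(id => !finished(id))
}
```

A caller can then write `stage.findMissingPartitions()` without knowing which kind of stage it holds.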




[GitHub] spark pull request: Minor cleanup of ShuffleMapStage.outputLocs co...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9175#issuecomment-149681220
  
    **[Test build #1931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1931/consoleFull)** for PR 9175 at commit [`cf18c92`](https://github.com/apache/spark/commit/cf18c926bd068e32d2da8ad849422b2f57b6776a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.

