Posted to reviews@spark.apache.org by carsonwang <gi...@git.apache.org> on 2017/12/04 12:35:50 UTC

[GitHub] spark pull request #19877: [SPARK-22681]Accumulator should only be updated once...

GitHub user carsonwang opened a pull request:

    https://github.com/apache/spark/pull/19877

    [SPARK-22681]Accumulator should only be updated once for each task in result stage

    ## What changes were proposed in this pull request?
    As the doc says, "For accumulator updates performed inside actions only, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value."
    However, the code currently does not enforce this: a task resubmitted in a result stage can update the accumulator again.
    
    ## How was this patch tested?
    Added new tests.
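
To make the intended guarantee concrete, here is a minimal sketch in plain Scala, with no Spark dependency. It is a toy model, not the code from this patch: `ResultStageModel`, `TaskCompletion`, and `handleCompletion` are hypothetical names invented for illustration. The assumption is that a result stage tracks which partitions have already finished and ignores accumulator updates from any later attempt at the same partition.

```scala
// Toy model of "apply each result task's accumulator update once".
// All names here are hypothetical; none of them are Spark APIs.
object AccumulatorOnceSketch {
  // One completed task attempt: the partition it computed and the
  // amount it wants to add to the accumulator.
  final case class TaskCompletion(partition: Int, update: Long)

  final class ResultStageModel(numPartitions: Int) {
    // finished(i) becomes true once partition i has reported success.
    private val finished = Array.fill(numPartitions)(false)
    private var accumulator = 0L

    // Apply the update only for the first successful attempt at a
    // partition; completions from resubmitted attempts are ignored.
    def handleCompletion(event: TaskCompletion): Unit = {
      if (!finished(event.partition)) {
        finished(event.partition) = true
        accumulator += event.update
      }
    }

    def value: Long = accumulator
  }

  def main(args: Array[String]): Unit = {
    val stage = new ResultStageModel(numPartitions = 2)
    stage.handleCompletion(TaskCompletion(partition = 0, update = 1L))
    // A resubmitted attempt for partition 0 completes again: no effect.
    stage.handleCompletion(TaskCompletion(partition = 0, update = 1L))
    stage.handleCompletion(TaskCompletion(partition = 1, update = 1L))
    println(stage.value) // prints 2
  }
}
```

Without the `finished` check, the second completion for partition 0 would be counted and the model would report 3, which is the kind of double counting the pull request description refers to.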


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/carsonwang/spark fixAccum

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19877.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19877
    
----
commit 882126c2671e1733d572350af9749e9f8bdca1c2
Author: Carson Wang <ca...@intel.com>
Date:   2017-12-04T12:23:14Z

    Do not update accumulator for resubmitted task in result stage

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    Merging to master.


---



[GitHub] spark pull request #19877: [SPARK-22681]Accumulator should only be updated o...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19877


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    **[Test build #84427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84427/testReport)** for PR 19877 at commit [`882126c`](https://github.com/apache/spark/commit/882126c2671e1733d572350af9749e9f8bdca1c2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by carsonwang <gi...@git.apache.org>.
Github user carsonwang commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    cc @vanzin 


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84452/
    Test PASSed.


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    **[Test build #84452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84452/testReport)** for PR 19877 at commit [`756f02f`](https://github.com/apache/spark/commit/756f02f1586edd14e42e32cd119e43132e9d13ee).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #19877: [SPARK-22681]Accumulator should only be updated o...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19877#discussion_r154729236
  
    --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
    @@ -1832,6 +1832,27 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with TimeLi
         assertDataStructuresEmpty()
       }
     
    +  test("accumulator not calculated for resubmitted task in result stage") {
    +    // just for register
    --- End diff --
    
    nit: unnecessary (and confusing?) comment.


---



[GitHub] spark pull request #19877: [SPARK-22681]Accumulator should only be updated o...

Posted by carsonwang <gi...@git.apache.org>.
Github user carsonwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19877#discussion_r154833757
  
    --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ---
    @@ -1832,6 +1832,27 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with TimeLi
         assertDataStructuresEmpty()
       }
     
    +  test("accumulator not calculated for resubmitted task in result stage") {
    +    // just for register
    --- End diff --
    
    Just removed that. Thanks @vanzin for the review.


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    **[Test build #84452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84452/testReport)** for PR 19877 at commit [`756f02f`](https://github.com/apache/spark/commit/756f02f1586edd14e42e32cd119e43132e9d13ee).


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84427/
    Test PASSed.


---



[GitHub] spark issue #19877: [SPARK-22681]Accumulator should only be updated once for...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19877
  
    **[Test build #84427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84427/testReport)** for PR 19877 at commit [`882126c`](https://github.com/apache/spark/commit/882126c2671e1733d572350af9749e9f8bdca1c2).


---
