Posted to reviews@spark.apache.org by rdblue <gi...@git.apache.org> on 2018/06/13 19:54:13 UTC

[GitHub] spark pull request #21558: [SPARK-24552][SQL] Use task ID instead of attempt...

GitHub user rdblue opened a pull request:

    https://github.com/apache/spark/pull/21558

    [SPARK-24552][SQL] Use task ID instead of attempt number for v2 writes.

    ## What changes were proposed in this pull request?
    
    This passes the unique task attempt id instead of attempt number to v2 data sources because attempt number is reused when stages are retried. When attempt numbers are reused, sources that track data by partition id and attempt number may incorrectly clean up data because **the same attempt number can be both committed and aborted**.
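
    For illustration, a minimal sketch of the change (the surrounding lines and names, `stageId`/`partId`/`attemptId`, come from `DataWritingSparkTask` in WriteToDataSourceV2.scala, shown in the diff later in this thread):
    
    ```
    val stageId = context.stageId()
    val partId = context.partitionId()
    // Before: attempt numbers restart at 0 for each stage attempt, so a task in a
    // retried stage can collide with a still-running task from the failed attempt.
    // val attemptId = context.attemptNumber()
    // After: the task ID (TID) is a globally unique one-up counter; it is
    // truncated to int to fit the existing DataWriterFactory API.
    val attemptId = context.taskAttemptId().toInt
    ```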
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rdblue/spark SPARK-24552-v2-source-work-around

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21558.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21558
    
----
commit e9e776a097f5dca1dccdd6e50b3790e6a91873d8
Author: Ryan Blue <bl...@...>
Date:   2018-06-13T19:50:00Z

    SPARK-24552: Use task ID instead of attempt number for v2 writes.

----


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Actually I didn't mean speculation but something like this:
    
    ```
    import org.apache.spark.TaskContext

    sc.parallelize(1 to 10).foreach { i =>
      if (TaskContext.get().attemptNumber() == 0) throw new Exception("Fail")
      else println(i)
    }
    ```
    
    Anyway I ran that and the behavior is the same (different attempt = different task ID) so it's all good.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    **[Test build #92050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92050/testReport)** for PR 21558 at commit [`6c60d14`](https://github.com/apache/spark/commit/6c60d1462c34f01610ada50c989832775b6fd117).


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    @cloud-fan, this is a work-around for SPARK-24552. I'm not sure of the right way to fix this, short of fixing the scheduler so that it doesn't reuse task attempt numbers, but I think this works.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/295/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    FYI, I'm preparing my own version of this PR with the remaining feedback addressed. Ryan was on paternity leave and I don't know whether he's done yet, so he may not be that responsive.
    
    This will conflict with the output commit coordinator change in any case, so one of them needs to wait (and that one is further along).


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    **[Test build #91794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91794/testReport)** for PR 21558 at commit [`e9e776a`](https://github.com/apache/spark/commit/e9e776a097f5dca1dccdd6e50b3790e6a91873d8).


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    **[Test build #92043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92043/testReport)** for PR 21558 at commit [`6c60d14`](https://github.com/apache/spark/commit/6c60d1462c34f01610ada50c989832775b6fd117).


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Ah, ok, I was looking at my own version as well. There are other things we should update for v2 too: other functions with these variable names, the description in DataWriterFactory.java, etc.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Retest this please.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    **[Test build #92043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92043/testReport)** for PR 21558 at commit [`6c60d14`](https://github.com/apache/spark/commit/6c60d1462c34f01610ada50c989832775b6fd117).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    So I looked through the code, and it certainly appears to be a bug in the existing code (not just the v2 datasource API). If you have one stage running that gets a fetch failure, and it leaves any tasks running with attempt 0, those tasks could conflict with the restarted stage, since its tasks would all start with attempt 0 as well. When I say it could conflict, I mean there would be a race if they go to commit at about the same time. It's probably more of an issue if one task commits, the job commit starts, and then the other task starts to commit its output; you could end up with an incomplete/corrupt file. We should see the warning "Authorizing duplicate request to commit" in the logs if this occurs, though.
    
    @rdblue does this match what you are seeing?



---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21558: [SPARK-24552][SQL] Use task ID instead of attempt...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21558#discussion_r196592175
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
    @@ -110,7 +108,7 @@ object DataWritingSparkTask extends Logging {
           useCommitCoordinator: Boolean): WriterCommitMessage = {
         val stageId = context.stageId()
         val partId = context.partitionId()
    -    val attemptId = context.attemptNumber()
    +    val attemptId = context.taskAttemptId().toInt
    --- End diff --
    
    I was going to suggest removing the cast to int, but well, that's in `DataWriterFactory` and would be an API breakage... hopefully it won't cause issues aside from weird output names when the value overflows the int.
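    
    For example, a small sketch of the truncation concern (the values are illustrative):
    
    ```
    val tid: Long = Int.MaxValue.toLong + 1  // TID 2147483648
    val attemptId: Int = tid.toInt           // wraps to -2147483648
    println(s"attempt-$attemptId")           // a "weird" negative output name
    ```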


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    IMO your change is the right fix, not just a workaround. I don't think it's a scheduler bug (though it's definitely unclear). I'll move that discussion to the jira.
    
    An alternative would be using `context.stageAttemptNumber()` and `context.attemptNumber()` together (see the sketch at the end of this message).
    
    
    > the stage ID can change for a different execution of the same stage, IIRC, and that would reset the attempt id.
    
    hmm, the only place I could imagine that happening is with a shared shuffle dependency between jobs, which gets renumbered and then skipped, but then perhaps re-executed on a fetch-failure.  That isn't relevant here, though, since it would only be shuffle map stages, not result stages.
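    
    For reference, a hypothetical sketch of the alternative mentioned above; the bit-packing scheme is an assumption for illustration, not anything Spark does:
    
    ```
    import org.apache.spark.TaskContext
    
    val ctx = TaskContext.get()
    // Both numbers are small ints, so packing them into one Long gives a value
    // that stays unique across stage retries for a given partition.
    val uniqueAttempt: Long =
      (ctx.stageAttemptNumber().toLong << 32) | (ctx.attemptNumber().toLong & 0xffffffffL)
    ```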


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Sorry, trying to catch up on this thread. Are we saying this is a bug in the existing output committer as well when we have a fetch failure?


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    @vanzin, thanks for working on this. I was out most of this week at a conference and I'm still on just half time, which is why I was delayed. Sorry to leave you all waiting. I'll make comments on your PR.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Actually, scratch that, there is really a problem. The output committer only checks the stage ID, not the stage attempt ID, so it will still allow tasks from the failed attempt to commit...


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    **[Test build #91794 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91794/testReport)** for PR 21558 at commit [`e9e776a`](https://github.com/apache/spark/commit/e9e776a097f5dca1dccdd6e50b3790e6a91873d8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3998/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    @rdblue would you want to update this for the v1 and hadoop committers? It should be very similar to this change, but in createTaskAttemptContext, which should take the actual task attempt id (type Long) instead of the attempt number as an argument.
    
    I would need to look at the v1 writers to make sure there is nothing else, but perhaps you are more familiar with them.
    
    If not, we can split that into a separate jira.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Yeah, that is the code I was still looking at, to verify whether it can actually happen. On a fetch failure the DAG scheduler removes the stage from the output committer, but when the new stage attempt starts it just puts the stage back keyed by the stage id (not the attempt id).


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4182/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    **[Test build #92050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92050/testReport)** for PR 21558 at commit [`6c60d14`](https://github.com/apache/spark/commit/6c60d1462c34f01610ada50c989832775b6fd117).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    I'm fine with this; my PR should fix the underlying issue but this still seems like a good idea to me.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    > IMO your change is the right fix, not just a workaround
    
    @squito, part of the problem is that the output commit coordinator -- which ensures only one attempt of a task commits -- relies on the attempt number to allow or deny commits. If this is the correct fix, should we also pass the TID as the attempt id to the commit coordinator?
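    
    A hypothetical sketch of what that might look like in the write path; the `canCommit` signature below is an assumption based on the current coordinator, and this only compiles inside Spark itself since the coordinator is `private[spark]`:
    
    ```
    val coordinator = org.apache.spark.SparkEnv.get.outputCommitCoordinator
    // Key authorization on the truncated TID rather than the stage-local attempt
    // number, so a retried stage cannot reuse an already-authorized value.
    val commitAuthorized = coordinator.canCommit(
      context.stageId(), context.partitionId(), context.taskAttemptId().toInt)
    ```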


---



[GitHub] spark pull request #21558: [SPARK-24552][SQL] Use task ID instead of attempt...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21558#discussion_r196277478
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
    @@ -110,7 +108,7 @@ object DataWritingSparkTask extends Logging {
           useCommitCoordinator: Boolean): WriterCommitMessage = {
         val stageId = context.stageId()
         val partId = context.partitionId()
    -    val attemptId = context.attemptNumber()
    +    val attemptId = context.taskAttemptId().toInt
    --- End diff --
    
    nit: can you rename this variable to `tid`? These names are pretty confusing, but I think that at least "tid" is used consistently and exclusively for this, while "attempt" means a lot of different things.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    @vanzin, the ID that this uses is the TID, which I thought was always unique. It appears to be a one-up counter. Also, I noted on your PR that both are needed because even if we only commit one of the attempts, the writers may use this ID to track and clean up data.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    See link above for the updated PR.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/288/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    > the ID that this uses is the TID, which I thought was always unique. It appears to be a one-up counter
    
    If that's the case, and a different attempt for the same partition will get a different task ID, that's fine. I was under the impression it would have the same task ID.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    I'm not sure, but I'm starting to think the part of the fix for SPARK-18113 that added the "Authorizing duplicate request..." stuff should be removed.
    
    The rest of that change replaces `askWithRetry` in the RPC calls with `ask`, so now there will only be one request per task. So there's no need to handle duplicate requests that way, because the only time you'd have a duplicate is when this situation happens (same task attempt number running concurrently on two stage attempts).


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    I guess https://issues.apache.org/jira/browse/SPARK-24492 is potentially caused by the output committer issue?


---



[GitHub] spark pull request #21558: [SPARK-24552][SQL] Use task ID instead of attempt...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21558


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    > So the problem here is, when we retry a stage, Spark doesn't kill the tasks of the old stage and just launch tasks for the new stage
    
    I think that's something that should be fixed, but it wouldn't entirely fix the problem unless we were very careful about ordering in the driver: the stage would have to fail, then stop allowing commits, then wait for all of the tasks that were allowed to commit to finish, and account for the coordination messages being in flight. Not an easy problem.
    
    I'd like to see a fix that makes the attempt number unique within a job and partition, i.e., no two tasks should have the same (job id, partition id, attempt number) triple as Wenchen said.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Correct, the TaskId is unique. I agree with @squito; can we rename it to tid?
    
    Hmm, good point @rdblue; we need to make sure the output directories/files and such would be unique. That problem may actually exist in the current committers as well; I will comment on the other PR.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91794/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/108/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    I started with some ideas in #21577; I haven't tested that PR aside from the modified unit test, but I think it's in the right direction. I plan to work more on it Monday, but I'm leaving it there in case people are around over the weekend and want to comment.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    But note that the task ID is also not necessarily unique (since you can have multiple attempts of the same task). So perhaps you should consider Imran's suggestion of mixing stage and task attempt numbers.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    For a write job, the writing only happens at the last stage, so the stage id doesn't matter, and the data source v2 API assumes `(job id, partition id, task attempt id)` can uniquely define a write task, even counting the failure cases.
    
    So the problem here is, when we retry a stage, Spark doesn't kill the tasks of the old stage and just launches tasks for the new stage. We may have running write tasks that have the same `(job id, partition id, task attempt id)`.
    
    One solution is, we can just use `(job id, task id)` to uniquely define a write task. But we may lose information like the index of this task and how many times it has been retried.
    
    Or we can use `(job id, partition id, stage attempt id, task attempt id)`, which is a little verbose, but has all the information.
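    
    To make the two options concrete, a hypothetical sketch (these case classes are illustrative only, not Spark API):
    
    ```
    // Option 1: compact, but loses the partition index and the retry count.
    case class WriteTaskId(jobId: String, taskId: Long)
    
    // Option 2: verbose, but carries all the information.
    case class WriteTaskIdVerbose(
        jobId: String,
        partitionId: Int,
        stageAttemptId: Int,
        taskAttemptId: Int)
    ```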


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21558: [SPARK-24552][SQL] Use task ID instead of attempt...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21558#discussion_r197222759
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
    @@ -110,7 +108,7 @@ object DataWritingSparkTask extends Logging {
           useCommitCoordinator: Boolean): WriterCommitMessage = {
         val stageId = context.stageId()
         val partId = context.partitionId()
    -    val attemptId = context.attemptNumber()
    +    val attemptId = context.taskAttemptId().toInt
    --- End diff --
    
    HadoopWriteConfigUtil has the same issue: it's a public interface and uses an int for the attempt number.
    It seems somewhat unlikely, but it's more plausible to overflow an int with task ids in Spark than in, say, MapReduce. We do have partitionId as an Int, so if partition counts approach the Int range and you have task failures, then task ids could go over Int. Looking at our options.



---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    > If you have one stage running that gets a fetch failure, if it leaves any tasks running
    
    I took a look at the output coordinator code and, depending on how the scheduler behaves, it might be ok.
    
    The coordinator will deny commits for finished stages; so it depends on the order of things. If the failed attempt is marked as "failed" before the next attempt starts, then it's ok, even if tasks for the failed attempt are still running. Looking at the code handling `FetchFailed` failures in `DAGScheduler`, that seems to be the case.



---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    If you are working on this, I'll merge the other one, wait for you, and continue to investigate in parallel.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    @tgravescs, that's exactly what we're seeing. I think it might just be misleading to have a stage-local attempt ID, although it is more friendly for users and matches what MR does.
    
    @jiangxb1987, we see SPARK-24492 occasionally (it has gotten better with later fixes to the coordinator) and haven't tracked down the cause yet. If this were the underlying cause that would be great, but I'm not sure how it could be the cause. If the same attempt number is reused, then two tasks in different stage attempts may both be authorized to commit. That wouldn't cause the retries because it wouldn't cause extra commit denials.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92043/
    Test FAILed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    (Or, alternatively, the output committer could track task IDs instead of attempt numbers. That should result in the same behavior but haven't really looked at that.)


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    ping @rdblue ^. If I don't hear back tomorrow, I will file a separate jira.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Yes, I just checked and speculative attempts do get a different TID. Just turn on speculation, run a large stage, and sort tasks in a stage by TID. There aren't duplicates.
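    
    A sketch of one way to run that check, assuming a spark-shell session (spark.speculation is a real config; the job itself is arbitrary):
    
    ```
    // Start with: spark-shell --conf spark.speculation=true
    // Run a stage with enough slow tasks that some get speculated:
    sc.parallelize(1 to 100000, 1000).map { i => Thread.sleep(10); i }.count()
    // Then sort the stage's tasks by "Task ID" in the web UI: speculative
    // attempts show up with distinct TIDs, so there are no duplicates.
    ```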


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92050/
    Test FAILed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    It's a little weird to use the attempt id here anyway; the stage ID can change for a different execution of the same stage, IIRC, and that would reset the attempt id.
    
    @squito probably remembers a lot more about this off the top of his head.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4189/
    Test PASSed.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    As @squito suggested, we can either use taskAttemptId or combine stageAttemptId and taskAttemptNumber together; either of those can represent a unique task attempt.


---



[GitHub] spark issue #21558: [SPARK-24552][SQL] Use task ID instead of attempt number...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/21558
  
    I think the right thing to do for this commit is to use the task ID instead of the stage-local attempt number. I've updated the PR with the change so I think this is ready to commit. @vanzin, are you okay committing this?
    
    cc @cloud-fan


---
