Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2016/06/02 22:21:03 UTC
[GitHub] spark pull request #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle...
GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/13479
[SPARK-15736][CORE][branch-1.6] Gracefully handle loss of DiskStore files
If an RDD partition is cached on disk and the DiskStore file is lost, then reads of that cached partition will fail and the missing partition is supposed to be recomputed by a new task attempt. In the current BlockManager implementation, however, the missing file does not trigger any metadata updates and does not invalidate the cache, so subsequent task attempts are scheduled on the same executor and the doomed read is retried repeatedly, leading to repeated task failures and eventually to failure of the entire job.
In order to fix this problem, the executor with the missing file needs to properly mark the corresponding block as missing so that it stops advertising itself as a cache location for that block.
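The idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Spark code touched by this patch: `Master`, `Executor`, `BlockNotFoundError`, and `simulate_lost_file` are hypothetical names, and the "disk" is a plain dictionary. It shows an executor that, on discovering that the file backing a block is gone, drops its local metadata and tells the master to stop listing it as a location for that block.

```python
from dataclasses import dataclass, field


class BlockNotFoundError(Exception):
    """Raised when a block is not (or is no longer) readable on this executor."""


@dataclass
class Master:
    """Hypothetical stand-in for the driver-side registry of block locations."""
    locations: dict = field(default_factory=dict)  # block_id -> set of executor ids

    def register(self, block_id, executor_id):
        self.locations.setdefault(block_id, set()).add(executor_id)

    def unregister(self, block_id, executor_id):
        self.locations.get(block_id, set()).discard(executor_id)


@dataclass
class Executor:
    """Hypothetical executor with a simulated disk store and local block metadata."""
    executor_id: str
    master: Master
    disk: dict = field(default_factory=dict)       # simulated files: block_id -> bytes
    block_info: set = field(default_factory=set)   # blocks this executor believes it holds

    def put(self, block_id, data):
        self.disk[block_id] = data
        self.block_info.add(block_id)
        self.master.register(block_id, self.executor_id)

    def get(self, block_id):
        if block_id not in self.block_info:
            raise BlockNotFoundError(block_id)
        data = self.disk.get(block_id)
        if data is None:
            # The backing file is gone: invalidate local metadata and stop
            # advertising this executor as a cache location for the block.
            self.block_info.discard(block_id)
            self.master.unregister(block_id, self.executor_id)
            raise BlockNotFoundError(block_id)
        return data


def simulate_lost_file():
    """Cache a block, lose its file, read it once, and return the remaining locations."""
    master = Master()
    executor = Executor("exec-1", master)
    executor.put("rdd_0_0", b"partition bytes")
    del executor.disk["rdd_0_0"]  # simulate loss of the on-disk file
    try:
        executor.get("rdd_0_0")
    except BlockNotFoundError:
        pass  # the failed read is expected; what matters is the metadata update
    return master.locations["rdd_0_0"]
```

In this model, the first failed read removes `exec-1` from the master's location set for the block, so a scheduler consulting the master would no longer prefer that executor for the retry. Without the two invalidation lines in `get`, every retry would hit the same missing file.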
This patch fixes the bug and adds an end-to-end regression test (in `FailureSuite`) and a set of unit tests (in `BlockManagerSuite`).
This is a branch-1.6 backport of #13473.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark handle-missing-cache-files-branch-1.6
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/13479.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #13479
----
commit fa40a80d4c3dc271943f2a8c88bb200dbca93f6c
Author: Josh Rosen <jo...@databricks.com>
Date: 2016-06-02T20:30:42Z
Add failing regression test.
commit 36536d7af43148dbceae4834be6f9eee71afee40
Author: Josh Rosen <jo...@databricks.com>
Date: 2016-06-02T21:08:43Z
Add failing unit tests in BlockManagerSuite.
commit 8f047202a15a373c92724d99ec2e4ab2d7d30a07
Author: Josh Rosen <jo...@databricks.com>
Date: 2016-06-02T22:15:12Z
Fix bug.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/13479
**[Test build #59889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59889/consoleFull)** for PR 13479 at commit [`8f04720`](https://github.com/apache/spark/commit/8f047202a15a373c92724d99ec2e4ab2d7d30a07).
[GitHub] spark pull request #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen closed the pull request at:
https://github.com/apache/spark/pull/13479
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/13479
Merged build finished. Test PASSed.
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the issue:
https://github.com/apache/spark/pull/13479
Merging into 1.6.
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/13479
**[Test build #59889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59889/consoleFull)** for PR 13479 at commit [`8f04720`](https://github.com/apache/spark/commit/8f047202a15a373c92724d99ec2e4ab2d7d30a07).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/13479
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59889/
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the issue:
https://github.com/apache/spark/pull/13479
Can you delete the branch?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #13479: [SPARK-15736][CORE][branch-1.6] Gracefully handle loss o...
Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the issue:
https://github.com/apache/spark/pull/13479
/cc @andrewor14, this is the branch-1.6 backport of my other patch.