You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2017/07/01 09:07:04 UTC

[GitHub] spark pull request #18497: [SPARK-21078][WIP][CORE] JobHistory applications ...

GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/18497

    [SPARK-21078][WIP][CORE] JobHistory applications synchronized is invalid

    ## What changes were proposed in this pull request?
    
    - Make `applications` and `attemptsToClean` immutable to try to simplify concurrency
    - Remove, for now, the incomplete synchronization of updates of `applications` 
    - Next: add back synchronization of their update in a different form?
    
    ## How was this patch tested?
    
    Existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-21078

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18497.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18497
    
----
commit 7f0966bd3d167fcb7c0221c0270bc1be0170553e
Author: Sean Owen <so...@cloudera.com>
Date:   2017-07-01T09:05:08Z

    Make some fields of FsHistoryProvider immutable and for the moment remove not-quite-sufficient synchronization protecting updates

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    Yeah, that's really my last question -- mergeApplicationListing and checkForLogs need sync.
    You're right that making objects immutable just helped simplify the solution I was going for.
    
    This one's easy to finish with extra synchronization, but if you're saying it will be superseded shortly, just tag your other change with this JIRA and I will close this. I was not planning to back-port unless someone else was actually observing a problem here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    **[Test build #79021 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79021/testReport)** for PR 18497 at commit [`7f0966b`](https://github.com/apache/spark/commit/7f0966bd3d167fcb7c0221c0270bc1be0170553e).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    @vanzin I think I need your eyes on this one. I am trying to address the race condition in the JIRA, which I think is very narrowly possible but possible. Additionally, I think `attemptsToClean` wasn't synchronized properly. So I tried to make it immutable.
    
    But both of the updates to the references is now unsynchronized. I don't think that's OK, because we have to assume things like `checkForLogs` and `mergeApplicationListings` are called concurrently right? is it sufficient to just synchronize both of their updates to these two fields on a new lock?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    Your change won't work, as far as I can tell. Since multiple threads might be calling `mergeApplicationListing`, you need synchronization around the block that is doing the list merging. Immutability is not the problem here.
    
    The bug report is right in that the problem here is that synchronization is happening on the wrong object; since `applications` is overwritten to point to a different map, two threads may be synchronizing on different maps. So the correct change is just to use a different object as the lock here.
    
    In truth, unless you're planning to backport this to other releases, this will all be superseded by my code to fix SPARK-20642, which is the next bug on the list after #18395 goes in.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79021/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    **[Test build #79021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79021/testReport)** for PR 18497 at commit [`7f0966b`](https://github.com/apache/spark/commit/7f0966bd3d167fcb7c0221c0270bc1be0170553e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    **[Test build #3833 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3833/testReport)** for PR 18497 at commit [`7f0966b`](https://github.com/apache/spark/commit/7f0966bd3d167fcb7c0221c0270bc1be0170553e).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18497: [SPARK-21078][WIP][CORE] JobHistory applications synchro...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18497
  
    **[Test build #3833 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3833/testReport)** for PR 18497 at commit [`7f0966b`](https://github.com/apache/spark/commit/7f0966bd3d167fcb7c0221c0270bc1be0170553e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18497: [SPARK-21078][WIP][CORE] JobHistory applications ...

Posted by srowen <gi...@git.apache.org>.
Github user srowen closed the pull request at:

    https://github.com/apache/spark/pull/18497


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org