Posted to reviews@spark.apache.org by vanzin <gi...@git.apache.org> on 2014/12/15 23:28:50 UTC

[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/3705

    [SPARK-4834] [standalone] Clean up application files after app finishes.

    Commit 7aacb7bfa added support for sharing downloaded files among multiple
    executors of the same app. That works great in Yarn, since the app's directory
    is cleaned up after the app is done.
    
    But Spark standalone mode didn't do that, so the lock/cache files created
    by that change were left around and could eventually fill up the disk hosting
    /tmp.
    
    To solve that, create app-specific directories under the local dirs when
    launching executors. Multiple executors launched by the same Worker will
    use the same app directories, so they should be able to share the downloaded
    files. When the application finishes, a new message is sent to all executors
    telling them the application has finished; once that message has been received,
    and all executors registered for the application shut down, then those
    directories will be cleaned up by the Worker.
    
    Note 1: Unit testing this is hard (if even possible), since local-cluster mode
    doesn't seem to leave the Master/Worker daemons running long enough after
    `sc.stop()` is called for the clean up protocol to take effect.
    
    Note 2: the code tracking finished apps / app directories in Master.scala
    and Worker.scala is not really thread-safe, but then the code that modifies
    other shared maps in those classes isn't either, so this change is not making
    anything worse.
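    
    For illustration, here is a minimal, self-contained Scala sketch of the
    clean-up protocol described above. The names `ApplicationFinished`,
    `maybeCleanupApplication`, `appDirectories`, `finishedApps` and `executors`
    come from this PR; everything else (the `CleanupSketch` object, the
    `ExecutorRunner` stand-in, the handler methods and `deleteRecursively`) is
    a simplified assumption, not the actual Worker code.
    
        import java.io.File
        import scala.collection.mutable
        
        // Stand-in for the Worker's per-executor bookkeeping.
        case class ExecutorRunner(appId: String)
        
        // Master -> Worker message announcing that an application has finished.
        case class ApplicationFinished(id: String)
        
        object CleanupSketch {
          private val executors = mutable.HashMap[String, ExecutorRunner]()    // execId -> runner
          private val appDirectories = mutable.HashMap[String, Seq[String]]()  // appId -> local dirs
          private val finishedApps = mutable.HashSet[String]()
        
          // Invoked when the Master reports that an application has finished.
          def handleApplicationFinished(msg: ApplicationFinished): Unit = synchronized {
            finishedApps += msg.id
            maybeCleanupApplication(msg.id)
          }
        
          // Invoked when an executor for some application exits on this Worker.
          def handleExecutorExited(execId: String, appId: String): Unit = synchronized {
            executors -= execId
            maybeCleanupApplication(appId)
          }
        
          // Delete the app's directories only once the app has finished AND no
          // executor for that app is still registered with this Worker.
          private def maybeCleanupApplication(id: String): Unit = synchronized {
            val shouldCleanup = finishedApps.contains(id) && !executors.values.exists(_.appId == id)
            if (shouldCleanup) {
              finishedApps -= id
              appDirectories.remove(id).foreach { dirs =>
                println(s"Cleaning up local directories for application $id")  // real code uses logInfo
                dirs.foreach(dir => deleteRecursively(new File(dir)))
              }
            }
          }
        
          // Simple recursive delete; the real code would use Spark's own utilities.
          private def deleteRecursively(f: File): Unit = {
            if (f.isDirectory) {
              Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
            }
            f.delete()
          }
        }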

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-4834

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3705.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3705
    
----
commit c0e5ea5923b26443e988384b37c325d0a62760c3
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2014-12-15T22:00:56Z

    [SPARK-4834] [standalone] Clean up application files after app finishes.
    
    Commit 7aacb7bfa added support for sharing downloaded files among multiple
    executors of the same app. That works great in Yarn, since the app's directory
    is cleaned up after the app is done.
    
    But Spark standalone mode didn't do that, so the lock/cache files created
    by that change were left around and could eventually fill up the disk hosting
    /tmp.
    
    To solve that, create app-specific directories under the local dirs when
    launching executors. Multiple executors launched by the same Worker will
    use the same app directories, so they should be able to share the downloaded
    files. When the application finishes, a new message is sent to all executors
    telling them the application has finished; once that message has been received,
    and all executors registered for the application shut down, then those
    directories will be cleaned up by the Worker.
    
    Note 1: Unit testing this is hard (if even possible), since local-cluster mode
    doesn't seem to leave the Master/Worker daemons running long enough after
    `sc.stop()` is called for the clean up protocol to take effect.
    
    Note 2: the code tracking finished apps / app directories in Master.scala
    and Worker.scala is not really thread-safe, but then the code that modifies
    other shared maps in those classes isn't either, so this change is not making
    anything worse.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67915900
  
    (lack of) thread-safety issues aside, this PR looks good to me. It makes sense to have application-specific directories now that we have state that is tied to an application's lifecycle (lock files, file caches, etc.) rather than to individual executors' lifecycles.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67079113
  
      [Test build #24470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24470/consoleFull) for   PR 3705 at commit [`c0e5ea5`](https://github.com/apache/spark/commit/c0e5ea5923b26443e988384b37c325d0a62760c3).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67917438
  
    @markhamstra that sounds a little limiting (forcing message handling to be effectively single-threaded), but that's an Akka issue, not a Spark issue. :-) Thanks for the pointer, though.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67912224
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24716/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67991653
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24742/
    Test FAILed.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3705#discussion_r22195734
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -452,6 +469,19 @@ private[spark] class Worker(
         registerWithMaster()
       }
     
    +  private def maybeCleanupApplication(id: String): Unit = synchronized {
    +    val shouldCleanup = finishedApps.contains(id) && !executors.values.exists(_.appId == id)
    +    if (shouldCleanup) {
    +      finishedApps -= id
    +      appDirectories.remove(id).foreach {
    +        logInfo(s"Cleaning up local directories for application $id")
    +        _.foreach { dir =>
    --- End diff --
    
    Minor nit, but I think it would be clearer to introduce an `appDirectory` variable in the outer `foreach` rather than using the underscore here.
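    
    Concretely, the suggestion amounts to naming the value bound by the outer
    `foreach` instead of using the placeholder, e.g. (illustrative rewrite of
    the snippet above, not the final code; the deletion step is elided):
    
        appDirectories.remove(id).foreach { appDirectory =>
          logInfo(s"Cleaning up local directories for application $id")
          appDirectory.foreach { dir =>
            // ... delete dir recursively ...
          }
        }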




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67907292
  
    > It wasn't very clear to me in which cases messages could be processed in parallel using Akka from my last read of the documentation, but I guess I'm due for a second reading. :-)
    
    Yeah, this is kind of hard to find, but it's described in the [Actors and the Java Memory Model](http://doc.akka.io/docs/akka/2.0.2/general/jmm.html#Actors_and_the_Java_Memory_Model) section:
    
    > **The actor subsequent processing rule**: processing of one message happens before processing of the next message by the same actor.
    
    To get parallelism, the Akka solution is to use multiple actors.  That said, people do [mix the actor model with other concurrency models](http://osl.cs.illinois.edu/publications/conf/ecoop/TasharofiDJ13.html), so it pays to be careful here.
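    
    A small, self-contained sketch of that rule (all names here are invented
    for illustration; API details assume a recent classic-Akka release):
    
        import akka.actor.{Actor, ActorSystem, Props}
        
        // A classic Akka actor processes the messages in its mailbox one at a
        // time, so this mutable counter needs no locking even if many senders
        // fire messages at it concurrently.
        class CounterActor extends Actor {
          private var count = 0  // only touched inside receive, one message at a time
        
          def receive: Receive = {
            case "inc" => count += 1
            case "get" => sender() ! count
          }
        }
        
        object ActorSketch extends App {
          val system = ActorSystem("sketch")
          // Parallelism comes from running multiple actors, each with its own
          // mailbox, not from processing one actor's messages concurrently.
          val counters = (1 to 4).map(i => system.actorOf(Props[CounterActor](), s"counter-$i"))
          counters.foreach(_ ! "inc")
          system.terminate()
        }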




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67906771
  
    re: thread-safety, the code I added could definitely be triggered from different actors, so I added some synchronization. For the `executors` map you mention, it seems to only be modified while handling messages from Master, but it's definitely accessed when handling messages from other sources, and at least with Java's `HashMap` that is not safe. Maybe Scala's has different concurrency semantics, though.
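    
    For reference: `scala.collection.mutable.HashMap` makes no thread-safety
    guarantees either, so the same caveat applies to it. Below is a minimal
    sketch of the kind of external guard this PR adds; the names are
    placeholders, not the Worker's actual fields.
    
        import scala.collection.mutable
        
        object SharedMapSketch {
          // Not thread-safe by itself; concurrent access needs external locking.
          private val executors = mutable.HashMap[String, String]()  // execId -> appId
        
          def register(execId: String, appId: String): Unit = synchronized {
            executors(execId) = appId
          }
        
          def appFor(execId: String): Option[String] = synchronized {
            executors.get(execId)
          }
        }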




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67907847
  
    > so I think all accesses to that field will be serialized by Akka.
    
    Hmm, I'm not so sure. From the page you linked:
    
    > processing of one message happens before processing of the next message by the same actor.
    
    I read that sentence as: "messages from the same sender are processed in order", not "each receiver will only process one message at a time". `Worker.receiveWithLogging` handles messages from both `Master` and `Executor`, and both could cause `appDirectories` to be modified.
    
    Anyway, in these cases I tend to err on the side of safety; the extra synchronization here shouldn't cause any performance issues.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67912223
  
      [Test build #24716 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24716/consoleFull) for   PR 3705 at commit [`50eb4b9`](https://github.com/apache/spark/commit/50eb4b957a7e7b1cbb6f5445f8a0fbe99478ba4c).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class ApplicationFinished(id: String)`





[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67991646
  
      [Test build #24742 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24742/consoleFull) for   PR 3705 at commit [`b430534`](https://github.com/apache/spark/commit/b430534e5d47fd12ae7b59925eeac925c1e6d2ab).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class ApplicationFinished(id: String)`





[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67905774
  
    > implicit thread-safety due to that actor's message-at-a-time processing. 
    
    It wasn't very clear to me in which cases messages could be processed in parallel using Akka from my last read of the documentation, but I guess I'm due for a second reading. :-)
    
    I'll double-check that I'm not breaking that assumption at least.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67921830
  
    Sure, I can do that.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67907467
  
    > re: thread-safety, the code I added could definitely be triggered from different actors, so I added some synchronization
    
    Do you mean the `maybeCleanupApplication` method?  It looks like it's only called from two places, both message-handlers in `receiveWithLogging` (which is synchronized by Akka).  Similarly, `appDirectories` is only accessed in `receiveWithLogging` and `maybeCleanupApplication`, so I think all accesses to that field will be serialized by Akka.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67089141
  
      [Test build #24470 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24470/consoleFull) for   PR 3705 at commit [`c0e5ea5`](https://github.com/apache/spark/commit/c0e5ea5923b26443e988384b37c325d0a62760c3).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `  case class ApplicationFinished(id: String)`





[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67918800
  
    Do you think we should remove the `synchronized` now that we're convinced it's no longer necessary?  Not a huge deal to leave it, but potentially confusing since it's not the norm to need synchronization within an actor.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67906753
  
      [Test build #24716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24716/consoleFull) for   PR 3705 at commit [`50eb4b9`](https://github.com/apache/spark/commit/50eb4b957a7e7b1cbb6f5445f8a0fbe99478ba4c).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67901746
  
    Regarding thread-safety, it looks like `Worker.executors` is only modified from methods that are called through the Worker actor, so I think we get implicit thread-safety due to that actor's message-at-a-time processing. 




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67900715
  
    > Note 1: Unit testing this is hard (if even possible), since local-cluster mode doesn't seem to leave the Master/Worker daemons running long enough after sc.stop() is called for the clean up protocol to take effect.
    
    I have a testing framework that can reliably test these sorts of scenarios; I'm planning to open-source it once I've cleaned up and reorganized its code.
    
    > Note 2: the code tracking finished apps / app directories in Master.scala and Worker.scala is not really thread-safe, but then the code that modifies other shared maps in those classes isn't either, so this change is not making anything worse.
    
    Yeah, the code there is due for a thread-safety audit / cleanup.  A lot of the fields in those files have unnecessarily broad visibility, in many cases because they're read in the web UI.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67078615
  
    I tested this on a standalone cluster; verified that the log message shows up in the Worker logs and that the temp files are not left behind in /tmp.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67992288
  
    That most recent failure is due to a known flaky test, _org.apache.spark.streaming.CheckpointSuite.recovery with file input stream_, so I'm not going to make Jenkins re-test this.
    
    This looks good to me, so I'm going to merge it into `master` (1.3.0) and `branch-1.2` (1.2.1).  Thanks for the PR!




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67089154
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24470/
    Test PASSed.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/3705




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67912545
  
    @vanzin You need to read some more about Akka actors.
    
    The fundamental abstraction is that an actor has a message queue, the only way to communicate with the actor is via messages, those messages end up in the queue, and they are handled one at a time in the `receive` method.  That the actor could be receiving messages from more than one sender does not in any way imply that messages will be processed concurrently.




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3705#discussion_r22195431
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
    @@ -175,4 +175,9 @@ private[deploy] object DeployMessages {
       // Liveness checks in various places
     
       case object SendHeartbeat
    +
    +  // Application finished message, used for cleanup
    +
    +  case class ApplicationFinished(id: String)
    --- End diff --
    
    Maybe this should be under the `// Master to Worker` heading in this file?
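    
    For illustration, that would mean grouping the new message with the other
    Master-to-Worker messages, roughly like this (the existing contents of that
    section are elided, not reproduced here):
    
        // Master to Worker
        
        // ... existing Master -> Worker messages ...
        
        case class ApplicationFinished(id: String)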




[GitHub] spark pull request: [SPARK-4834] [standalone] Clean up application...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3705#issuecomment-67985289
  
      [Test build #24742 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24742/consoleFull) for   PR 3705 at commit [`b430534`](https://github.com/apache/spark/commit/b430534e5d47fd12ae7b59925eeac925c1e6d2ab).
     * This patch merges cleanly.

