You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by vanzin <gi...@git.apache.org> on 2018/01/10 20:55:26 UTC

[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/20223

    [SPARK-23020][core] Fix races in launcher code, test.

    The race in the code is because the handle might update
    its state to the wrong state if the connection handling
    thread is still processing incoming data; so the handle
    needs to wait for the connection to finish up before
    checking the final state.
    
    The race in the test is because when waiting for a handle
    to reach a final state, the waitFor() method needs to wait
    until all handle state is updated (which also includes
    waiting for the connection thread above to finish).
    Otherwise, waitFor() may return too early, which would cause
    a bunch of different races (like the listener not being yet
    notified of the state change, or being in the middle of
    being notified, or the handle not being properly disposed
    and causing postChecks() to assert).
    
    On top of that I found, by code inspection, a couple of
    potential races that could make a handle end up in the
    wrong state when being killed.
    
    Tested by running the existing unit tests a lot (and not
    seeing the errors I was seeing before).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-23020

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20223
    
----
commit 5139f605904996667d8d97941172bbe9d366a579
Author: Marcelo Vanzin <va...@...>
Date:   2018-01-10T17:56:17Z

    [SPARK-23020][core] Fix races in launcher code, test.
    
    The race in the code is because the handle might update
    its state to the wrong state if the connection handling
    thread is still processing incoming data; so the handle
    needs to wait for the connection to finish up before
    checking the final state.
    
    The race in the test is because when waiting for a handle
    to reach a final state, the waitFor() method needs to wait
    until all handle state is updated (which also includes
    waiting for the connection thread above to finish).
    Otherwise, waitFor() may return too early, which would cause
    a bunch of different races (like the listener not being yet
    notified of the state change, or being in the middle of
    being notified, or the handle not being properly disposed
    and causing postChecks() to assert).
    
    On top of that I found, by code inspection, a couple of
    potential races that could make a handle end up in the
    wrong state when being killed.
    
    Tested by running the existing unit tests a lot (and not
    seeing the errors I was seeing before).

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r160956968
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/LauncherConnection.java ---
    @@ -95,15 +95,15 @@ protected synchronized void send(Message msg) throws IOException {
       }
     
       @Override
    -  public void close() throws IOException {
    +  public synchronized void close() throws IOException {
         if (!closed) {
    -      synchronized (this) {
    --- End diff --
    
    what's wrong with this? It's a classic "double-checked locking" in java.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161022567
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/AbstractAppHandle.java ---
    @@ -91,10 +92,15 @@ LauncherConnection getConnection() {
         return connection;
       }
     
    -  boolean isDisposed() {
    +  synchronized boolean isDisposed() {
         return disposed;
    --- End diff --
    
    The `synchronized` is not protecting the variable, but all the actions that happen around the code that sets the variable.
    
    So this method returning guarantees that either all the disposal tasks have run or none have.
    
    That being said `ServerConnection.close` is not really holding the handle lock as it should, so I might have to make some changes here.
    
    These are not really contended locks. In the worst case there will be 2 threads looking the them. So trying to come up with a finer-grained locking scheme here sounds like more trouble than it's worth.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161022921
  
    --- Diff: core/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java ---
    @@ -137,7 +139,9 @@ public void testInProcessLauncher() throws Exception {
           // Here DAGScheduler is stopped, while SparkContext.clearActiveContext may not be called yet.
           // Wait for a reasonable amount of time to avoid creating two active SparkContext in JVM.
    --- End diff --
    
    It's not that 500ms is not reasonable, it's that this code will return more quickly when the context shuts down in under 500ms, have a longer grace period in case it ever takes more than 500ms, and fail the test here if the context does not shut down.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85986 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85986/testReport)** for PR 20223 at commit [`8d52338`](https://github.com/apache/spark/commit/8d52338eed18bcc82ad04179e966f9eca064e68c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86142/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85929 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85929/testReport)** for PR 20223 at commit [`5139f60`](https://github.com/apache/spark/commit/5139f605904996667d8d97941172bbe9d366a579).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85929 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85929/testReport)** for PR 20223 at commit [`5139f60`](https://github.com/apache/spark/commit/5139f605904996667d8d97941172bbe9d366a579).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    At this point in time it might be a good idea to revert SPARK-10300, since the YARN tests don't really add that much to the test run time compared to the others anymore...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r160989352
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/AbstractAppHandle.java ---
    @@ -91,10 +92,15 @@ LauncherConnection getConnection() {
         return connection;
       }
     
    -  boolean isDisposed() {
    +  synchronized boolean isDisposed() {
    --- End diff --
    
    why do we need `synchronized` here


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85937/testReport)** for PR 20223 at commit [`2018e29`](https://github.com/apache/spark/commit/2018e295f516d51e8c1661d3b77c31a029fdb006).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    This patch unfortunately broke [`YarnClusterSuite.timeout to get SparkContext in cluster mode triggers failure`](https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-maven-hadoop-2.7/94/testReport/org.apache.spark.deploy.yarn/YarnClusterSuite/timeout_to_get_SparkContext_in_cluster_mode_triggers_failure/history/) and [`SparkLauncherSuite.testInProcessLauncher`](https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-maven-hadoop-2.7/90/testReport/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/) is still flaky. I'm going to revert this patch and temporarily disable the test in `SparkLauncherSuite` to fix the build.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85937/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161006918
  
    --- Diff: core/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java ---
    @@ -137,7 +139,9 @@ public void testInProcessLauncher() throws Exception {
           // Here DAGScheduler is stopped, while SparkContext.clearActiveContext may not be called yet.
           // Wait for a reasonable amount of time to avoid creating two active SparkContext in JVM.
    --- End diff --
    
    Here Vanzin is waiting for the active SparkContext being empty.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85984/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85986/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by sameeragarwal <gi...@git.apache.org>.
Github user sameeragarwal commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    merging to master/2.3. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85929/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161834649
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java ---
    @@ -48,14 +48,16 @@ public synchronized void disconnect() {
     
       @Override
       public synchronized void kill() {
    -    disconnect();
    -    if (childProc != null) {
    -      if (childProc.isAlive()) {
    -        childProc.destroyForcibly();
    +    if (!isDisposed()) {
    +      setState(State.KILLED);
    --- End diff --
    
    I changed this because the old disconnect code, at least, might change the handle's state. It's easier to put this call first and not have to reason about whether that will happen.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r160952457
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/AbstractAppHandle.java ---
    @@ -91,10 +92,15 @@ LauncherConnection getConnection() {
         return connection;
       }
     
    -  boolean isDisposed() {
    +  synchronized boolean isDisposed() {
         return disposed;
    --- End diff --
    
    can we simply mark `disposed` as `transient`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #86142 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86142/testReport)** for PR 20223 at commit [`8d52338`](https://github.com/apache/spark/commit/8d52338eed18bcc82ad04179e966f9eca064e68c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r160989105
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/AbstractAppHandle.java ---
    @@ -91,10 +92,15 @@ LauncherConnection getConnection() {
         return connection;
       }
     
    -  boolean isDisposed() {
    +  synchronized boolean isDisposed() {
         return disposed;
       }
     
    +  synchronized void markDisposed() {
    --- End diff --
    
    why do we need `synchronized` here? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #86142 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86142/testReport)** for PR 20223 at commit [`8d52338`](https://github.com/apache/spark/commit/8d52338eed18bcc82ad04179e966f9eca064e68c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85986 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85986/testReport)** for PR 20223 at commit [`8d52338`](https://github.com/apache/spark/commit/8d52338eed18bcc82ad04179e966f9eca064e68c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161022671
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/LauncherConnection.java ---
    @@ -95,15 +95,15 @@ protected synchronized void send(Message msg) throws IOException {
       }
     
       @Override
    -  public void close() throws IOException {
    +  public synchronized void close() throws IOException {
         if (!closed) {
    -      synchronized (this) {
    --- End diff --
    
    The sub-class needs coarser-grained locking so this wasn't really doing anything useful.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161651289
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java ---
    @@ -48,14 +48,16 @@ public synchronized void disconnect() {
     
       @Override
       public synchronized void kill() {
    -    disconnect();
    -    if (childProc != null) {
    -      if (childProc.isAlive()) {
    -        childProc.destroyForcibly();
    +    if (!isDisposed()) {
    +      setState(State.KILLED);
    --- End diff --
    
    I have the same question, but should be OK as we always synchronize when touching the state.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20223


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85984/testReport)** for PR 20223 at commit [`9c7d2d3`](https://github.com/apache/spark/commit/9c7d2d3ee1cb4d9ec36607dd69122c55daea4172).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85984 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85984/testReport)** for PR 20223 at commit [`9c7d2d3`](https://github.com/apache/spark/commit/9c7d2d3ee1cb4d9ec36607dd69122c55daea4172).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    **[Test build #85937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85937/testReport)** for PR 20223 at commit [`2018e29`](https://github.com/apache/spark/commit/2018e295f516d51e8c1661d3b77c31a029fdb006).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    We didn't run the yarn test with this PR due to https://issues.apache.org/jira/browse/SPARK-10300
    
    This indicates that it might be a bad idea to skip the yarn test if we don't change yarn code. In this case, we touch the code in spark code and cause the failure in yarn test.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r160991236
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/AbstractAppHandle.java ---
    @@ -71,15 +71,16 @@ public void stop() {
       @Override
       public synchronized void disconnect() {
         if (!disposed) {
    --- End diff --
    
    `if(!isDisposed())` ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by gengliangwang <gi...@git.apache.org>.
Github user gengliangwang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r161564385
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java ---
    @@ -48,14 +48,16 @@ public synchronized void disconnect() {
     
       @Override
       public synchronized void kill() {
    -    disconnect();
    -    if (childProc != null) {
    -      if (childProc.isAlive()) {
    -        childProc.destroyForcibly();
    +    if (!isDisposed()) {
    +      setState(State.KILLED);
    --- End diff --
    
    Why put `setState` in the front? It should be OK but in previous code some of the `setState` is called in the end.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20223: [SPARK-23020][core] Fix races in launcher code, t...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20223#discussion_r160957301
  
    --- Diff: core/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java ---
    @@ -137,7 +139,9 @@ public void testInProcessLauncher() throws Exception {
           // Here DAGScheduler is stopped, while SparkContext.clearActiveContext may not be called yet.
           // Wait for a reasonable amount of time to avoid creating two active SparkContext in JVM.
    --- End diff --
    
    why is `500 ms` not a reasonable waiting time anymore?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20223: [SPARK-23020][core] Fix races in launcher code, test.

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20223
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org