You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by andreweduffy <gi...@git.apache.org> on 2016/07/13 18:55:19 UTC

[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...

GitHub user andreweduffy opened a pull request:

    https://github.com/apache/spark/pull/14185

    [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuilder for user

    ## What changes were proposed in this pull request?
    
    Expose the `ProcessBuilder` in `SparkLauncher`, as well as opening up `SparkLauncher#startApplication` to allow accepting arbitrary ProcessBuilders.
    
    For more info see [https://issues.apache.org/jira/browse/SPARK-16511](this JIRA ticket)
    
    Also covers [https://issues.apache.org/jira/browse/SPARK-14702](SPARK-14702)
    
    ## How was this patch tested?
    
    Unit tests + spark shell
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andreweduffy/spark feature/processbuilder

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14185.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14185
    
----
commit 7fe36f5970e7e577a47d8b6a7534cc95d22a94c2
Author: Andrew Duffy <ad...@palantir.com>
Date:   2016-07-13T14:20:54Z

    [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuilder for user
    flexibility
    
    Also covers https://issues.apache.org/jira/browse/SPARK-16511

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    **[Test build #62289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62289/consoleFull)** for PR 14185 at commit [`7fe36f5`](https://github.com/apache/spark/commit/7fe36f5970e7e577a47d8b6a7534cc95d22a94c2).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14185#discussion_r70729726
  
    --- Diff: core/src/test/scala/org/apache/spark/launcher/LauncherBackendSuite.scala ---
    @@ -17,15 +17,16 @@
     
     package org.apache.spark.launcher
     
    +
     import java.util.concurrent.TimeUnit
     
     import scala.concurrent.duration._
     import scala.language.postfixOps
     
    -import org.scalatest.Matchers
     import org.scalatest.concurrent.Eventually._
    +import org.scalatest.Matchers
    --- End diff --
    
    Please avoid making unnecessary changes to imports. This one, in particular, is wrong and would trigger a style checker error if the style checker didn't have a bug...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14185#discussion_r70729733
  
    --- Diff: core/src/test/scala/org/apache/spark/launcher/LauncherBackendSuite.scala ---
    @@ -17,15 +17,16 @@
     
     package org.apache.spark.launcher
     
    +
    --- End diff --
    
    don't add this extra line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    This PR should really be referencing SPARK-14702, which is the older bug. SPARK-16511 should be closed as a duplicate of it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62289/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14185#discussion_r70730228
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ---
    @@ -418,14 +414,26 @@ public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) thr
           }
         }
     
    -    String loggerPrefix = getClass().getPackage().getName();
    -    String loggerName = String.format("%s.app.%s", loggerPrefix, appName);
    -    ProcessBuilder pb = createBuilder().redirectErrorStream(true);
    +    ProcessBuilder pb = createProcessBuilder().redirectErrorStream(true);
    +    return startApplication(appName, pb, listeners);
    +  }
    +
    +  public SparkAppHandle startApplication(String appName, ProcessBuilder pb,
    +                                         SparkAppHandle.Listener... listeners) throws IOException {
    +    ChildProcAppHandle handle = LauncherServer.newAppHandle();
    +    for (SparkAppHandle.Listener l : listeners) {
    +      handle.addListener(l);
    +    }
    +
         pb.environment().put(LauncherProtocol.ENV_LAUNCHER_PORT,
           String.valueOf(LauncherServer.getServerInstance().getPort()));
         pb.environment().put(LauncherProtocol.ENV_LAUNCHER_SECRET, handle.getSecret());
    +
    +    String loggerPrefix = getClass().getPackage().getName();
    +    String loggerName = String.format("%s.app.%s", loggerPrefix, appName);
         try {
    -      handle.setChildProc(pb.start(), loggerName);
    +      Process proc = pb.start();
    --- End diff --
    
    This change is not necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14185#discussion_r70729812
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ---
    @@ -418,14 +414,26 @@ public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) thr
           }
         }
     
    -    String loggerPrefix = getClass().getPackage().getName();
    -    String loggerName = String.format("%s.app.%s", loggerPrefix, appName);
    -    ProcessBuilder pb = createBuilder().redirectErrorStream(true);
    +    ProcessBuilder pb = createProcessBuilder().redirectErrorStream(true);
    +    return startApplication(appName, pb, listeners);
    +  }
    +
    +  public SparkAppHandle startApplication(String appName, ProcessBuilder pb,
    --- End diff --
    
    Style is:
    
    ```
    public Type method(
        Type arg1,
        Type arg2,
        ...) {
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    I tried playing a little bit with what this API would look like, and I'm starting to question my previous idea that exposing the ProcessBuilder is the way to go here...
    
    The above issue with logging redirection is just one source of issues. There are other problems, such as the API becoming a little bit convoluted:
    
    ```
    SparkLauncher launcher = ...;
    ProcessBuilder pb = launcher.createProcessBuilder();
    launcher.startApplication(pb);
    ```
    
    And all the different ways to start the Spark app (3 different methods in SparkLauncher + `ProcessBuilder.start()`). At this point I'm starting to think it might be better to mirror parts of the ProcessBuilder API that are interesting. e.g., have:
    
    ```
      SparkLauncher directory(File directory)
      SparkLauncher redirectErrorStream(boolean redirectErrorStream)
      SparkLauncher redirectError(ProcessBuilder.Redirect destination)
      SparkLauncher redirectOutput(ProcessBuilder.Redirect destination)
    ```
    
    Optionally these (since you can use `Redirect.to(File)`):
    
    ```
      SparkLauncher redirectError(File destination)
      SparkLauncher redirectOutput(File destination)
    ```
    
    And add this one which implements the current logger redirection:
    
    ```  
      SparkLauncher redirectToLog(String loggerName)
    ```
    
    By default logging redirection would be done when using `startApplication`, using the current semantics, unless the user has overridden that by calling one of the new methods (which would also apply to `launch`).
    
    This adds more methods and is a bit more work, but it avoids certains oddities in the API, avoids overloading `startApplication`, and hides ProcessBuilder APIs we don't want to expose (like `command()`). What do you think?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14185: [SPARK-14702][SUBMIT] Expose SparkLauncher's Proc...

Posted by andreweduffy <gi...@git.apache.org>.
Github user andreweduffy closed the pull request at:

    https://github.com/apache/spark/pull/14185


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's Proc...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14185#discussion_r70731128
  
    --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ---
    @@ -418,14 +414,26 @@ public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) thr
           }
         }
     
    -    String loggerPrefix = getClass().getPackage().getName();
    -    String loggerName = String.format("%s.app.%s", loggerPrefix, appName);
    -    ProcessBuilder pb = createBuilder().redirectErrorStream(true);
    +    ProcessBuilder pb = createProcessBuilder().redirectErrorStream(true);
    +    return startApplication(appName, pb, listeners);
    +  }
    +
    +  public SparkAppHandle startApplication(String appName, ProcessBuilder pb,
    +                                         SparkAppHandle.Listener... listeners) throws IOException {
    +    ChildProcAppHandle handle = LauncherServer.newAppHandle();
    +    for (SparkAppHandle.Listener l : listeners) {
    +      handle.addListener(l);
    +    }
    +
         pb.environment().put(LauncherProtocol.ENV_LAUNCHER_PORT,
           String.valueOf(LauncherServer.getServerInstance().getPort()));
         pb.environment().put(LauncherProtocol.ENV_LAUNCHER_SECRET, handle.getSecret());
    +
    +    String loggerPrefix = getClass().getPackage().getName();
    --- End diff --
    
    There's an issue now that the user might call `.redirectOutput` on the `ProcessBuilder` and this code might break. So the log redirection really should only happen if the `ProcessBuilder` is configured to redirect to `ProcessBuilder.Redirect.PIPE`. And you'd need to make sure that `redirectErrorStream()` is set in that case.
    
    An alternative approach would be to only do this log redirection from the existing `startApplication` call that does not take a `ProcessBuilder`. In that case, output of the child process for the new `startApplication` should be handled by the caller, just like for the old `launch()` API.
    
    I really wish you could create new `ProcessBuilder.Redirect` values so that we could create a new `LOG` one and fix this properly, but it doesn't look like that's possible.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-14702][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by andreweduffy <gi...@git.apache.org>.
Github user andreweduffy commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    That sounds reasonable, I can take a shot at that today, I'll submit it as a second PR and then we can close this one out.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14185: [SPARK-16511][SUBMIT] Expose SparkLauncher's ProcessBuil...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14185
  
    **[Test build #62289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62289/consoleFull)** for PR 14185 at commit [`7fe36f5`](https://github.com/apache/spark/commit/7fe36f5970e7e577a47d8b6a7534cc95d22a94c2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org