Posted to reviews@spark.apache.org by tnachen <gi...@git.apache.org> on 2015/12/16 18:45:51 UTC

[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

GitHub user tnachen opened a pull request:

    https://github.com/apache/spark/pull/10332

    [SPARK-12345][MESOS] Filter SPARK_HOME when submitting Spark jobs with Mesos cluster mode.

    SPARK_HOME is now causing problems with Mesos cluster mode, since the spark-submit script was recently changed so that SPARK_HOME, if defined, takes precedence when locating the spark-class scripts.
    
    We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tnachen/spark scheduler_ui

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10332.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10332
    
----
commit baea28f54406a58ae313d1a8428d985e70b3116a
Author: Timothy Chen <tn...@gmail.com>
Date:   2015-12-16T16:45:34Z

    Filter SPARK_HOME when submitting Spark jobs with Mesos cluster mode.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by dragos <gi...@git.apache.org>.
Github user dragos commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165201670
  
    There seems to be a race condition :) @skyluc opened #10329, but the change is in `SparkSubmit`. I wonder which one we should take. We tested #10329 locally and it passed. This will take a while to re-test




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165545844
  
    Oh wait, looks like #10359 which fixes this is already merged.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by dragos <gi...@git.apache.org>.
Github user dragos commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165204489
  
    Agreed.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165227460
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by mgummelt <gi...@git.apache.org>.
Github user mgummelt commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10332#discussion_r47944423
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala ---
    @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet(
         val driverMemory = sparkProperties.get("spark.driver.memory")
         val driverCores = sparkProperties.get("spark.driver.cores")
         val appArgs = request.appArgs
    -    val environmentVariables = request.environmentVariables
    +    // We don't want to pass down SPARK_HOME when launching Spark apps
    +    // with Mesos cluster mode since it's populated by default on the client and it will
    +    // cause spark-submit script to look for files in SPARK_HOME instead.
    +    // We only need the ability to specify where to find spark-submit script
    +    // which user can user spark.executor.home or spark.home configurations.
    +    val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME"))
    --- End diff --
    
    Yea, odd.  I ran this against DCOS and didn't see the error.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by dragos <gi...@git.apache.org>.
Github user dragos commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165546077
  
    Yeah, just about to comment on that @andrewor14 




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by dragos <gi...@git.apache.org>.
Github user dragos commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10332#discussion_r47902198
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala ---
    @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet(
         val driverMemory = sparkProperties.get("spark.driver.memory")
         val driverCores = sparkProperties.get("spark.driver.cores")
         val appArgs = request.appArgs
    -    val environmentVariables = request.environmentVariables
    +    // We don't want to pass down SPARK_HOME when launching Spark apps
    +    // with Mesos cluster mode since it's populated by default on the client and it will
    +    // cause spark-submit script to look for files in SPARK_HOME instead.
    +    // We only need the ability to specify where to find spark-submit script
    +    // which user can user spark.executor.home or spark.home configurations.
    +    val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME"))
    --- End diff --
    
    Unfortunately there is a subtle error here, and this is a no-op. And nobody ran this code, it seems.
    
    Here's what happens: `environmentVariables` is a map, not a sequence. So `filter` works on Pairs, and a pair will never be equal to a string. The correct call would have been `filterKeys`.
    
    Unfortunately this went in RC3 without fixing the bug. It is harmless otherwise, but highlights the fact that there are no easy fixes or safe changes. :-/
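
A minimal sketch of the point above, assuming a plain immutable `Map[String, String]` stands in for `request.environmentVariables`. The object name `FilterEnvDemo` and the sample values are hypothetical and not part of the patch. It contrasts the no-op `filter` call with key-based filtering:

```scala
object FilterEnvDemo {
  // Hypothetical stand-in for request.environmentVariables.
  val env: Map[String, String] =
    Map("SPARK_HOME" -> "/opt/spark", "FOO" -> "bar")

  // As written in the patch: the predicate receives a (key, value) tuple,
  // and a tuple is never equal to a String, so every entry is kept.
  val noOp: Map[String, String] = env.filter(!_.equals("SPARK_HOME"))

  // Filtering on the key actually removes the entry.
  val filtered: Map[String, String] =
    env.filter { case (key, _) => key != "SPARK_HOME" }

  // Removing a single key is equivalent and shorter.
  val removed: Map[String, String] = env - "SPARK_HOME"

  def main(args: Array[String]): Unit = {
    println(noOp.contains("SPARK_HOME"))     // still present
    println(filtered.contains("SPARK_HOME")) // gone
  }
}
```

`filterKeys(_ != "SPARK_HOME")`, as suggested above, expresses the same intent. The sketch uses `filter` with a pattern match and `-` only because `filterKeys` returns a lazy view in older Scala versions and is deprecated on `Map` in Scala 2.13.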




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165227464
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47830/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by tnachen <gi...@git.apache.org>.
Github user tnachen commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165206313
  
    Yes I would also want to just make changes on the Mesos side and not cause any possible regression on standalone.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by tnachen <gi...@git.apache.org>.
Github user tnachen commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165190411
  
    @dragos PTAL




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10332#discussion_r47909261
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala ---
    @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet(
         val driverMemory = sparkProperties.get("spark.driver.memory")
         val driverCores = sparkProperties.get("spark.driver.cores")
         val appArgs = request.appArgs
    -    val environmentVariables = request.environmentVariables
    +    // We don't want to pass down SPARK_HOME when launching Spark apps
    +    // with Mesos cluster mode since it's populated by default on the client and it will
    +    // cause spark-submit script to look for files in SPARK_HOME instead.
    +    // We only need the ability to specify where to find spark-submit script
    +    // which user can user spark.executor.home or spark.home configurations.
    +    val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME"))
    --- End diff --
    
    That's really the problem, I think we should fix this.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165227299
  
    **[Test build #47830 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47830/consoleFull)** for PR 10332 at commit [`baea28f`](https://github.com/apache/spark/commit/baea28f54406a58ae313d1a8428d985e70b3116a).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10332#discussion_r47816823
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala ---
    @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet(
         val driverMemory = sparkProperties.get("spark.driver.memory")
         val driverCores = sparkProperties.get("spark.driver.cores")
         val appArgs = request.appArgs
    -    val environmentVariables = request.environmentVariables
    +    // We don't want to pass down SPARK_HOME when launching Spark apps
    +    // with Mesos cluster mode since it's populated by default on the client and it will
    +    // cause spark-submit script to look for files in SPARK_HOME instead.
    +    // We only need the ability to specify where to find spark-submit script
    +    // which user can user spark.executor.home or spark.home configurations.
    --- End diff --
    
    I would add (SPARK-12345) here, but I'll fix this myself on merge.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165545519
  
    OK I'm going to go ahead and revert this patch since it doesn't work...




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by skyluc <gi...@git.apache.org>.
Github user skyluc commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165204770
  
    Code LGTM. Unfortunately, I won't be able to try it for a couple of hours.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165204282
  
    I would lean towards this patch since it only affects mesos and not standalone mode.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165194859
  
    **[Test build #47830 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47830/consoleFull)** for PR 10332 at commit [`baea28f`](https://github.com/apache/spark/commit/baea28f54406a58ae313d1a8428d985e70b3116a).




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10332




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by tnachen <gi...@git.apache.org>.
Github user tnachen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10332#discussion_r47939790
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala ---
    @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet(
         val driverMemory = sparkProperties.get("spark.driver.memory")
         val driverCores = sparkProperties.get("spark.driver.cores")
         val appArgs = request.appArgs
    -    val environmentVariables = request.environmentVariables
    +    // We don't want to pass down SPARK_HOME when launching Spark apps
    +    // with Mesos cluster mode since it's populated by default on the client and it will
    +    // cause spark-submit script to look for files in SPARK_HOME instead.
    +    // We only need the ability to specify where to find spark-submit script
    +    // which user can user spark.executor.home or spark.home configurations.
    +    val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME"))
    --- End diff --
    
    Hmm, interesting. I wonder if what I ran differed from the code in this patch, since I was able to see it not being passed through.
    Thanks for retesting this. I think having automated tests is going to be crucial to prevent mistakes like the one I made here :(




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165210134
  
    LGTM merging into master and 1.6. Just FYI I might revert this patch in master because I believe #10329 is a better fix in the long run, but for now let's just unblock the release.




[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/10332#issuecomment-165925913
  
    Note: I'm reverting this patch in *master only* since #10329, the better alternative, is merged there.
    This patch continues to exist in branch-1.6.

