Posted to issues@spark.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/12/13 00:42:00 UTC

[jira] [Commented] (SPARK-23464) MesosClusterScheduler double-escapes parameters to bash command

    [ https://issues.apache.org/jira/browse/SPARK-23464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719598#comment-16719598 ] 

ASF GitHub Bot commented on SPARK-23464:
----------------------------------------

vanzin closed pull request #20641: [SPARK-23464][MESOS] Fix mesos cluster scheduler options double-escaping
URL: https://github.com/apache/spark/pull/20641
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala b/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
index d224a7325820a..3b4e3d5d9cb57 100644
--- a/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
+++ b/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala
@@ -530,7 +530,7 @@ private[spark] class MesosClusterScheduler(
       .filter { case (key, _) => !replicatedOptionsBlacklist.contains(key) }
       .toMap
     (defaultConf ++ driverConf).foreach { case (key, value) =>
-      options ++= Seq("--conf", s""""$key=${shellEscape(value)}"""".stripMargin) }
+      options ++= Seq("--conf", s"$key=${shellEscape(value)}") }
 
     options
   }
diff --git a/resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala b/resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala
index e534b9d7e3ed9..9d550a030f1bc 100644
--- a/resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala
+++ b/resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSuite.scala
@@ -199,6 +199,38 @@ class MesosClusterSchedulerSuite extends SparkFunSuite with LocalSparkContext wi
     })
   }
 
+  test("properly wraps and escapes parameters passed to driver command") {
+    setScheduler()
+
+    val mem = 1000
+    val cpu = 1
+
+    val response = scheduler.submitDriver(
+      new MesosDriverDescription("d1", "jar", mem, cpu, true,
+        command,
+        Map("spark.mesos.executor.home" -> "test",
+          "spark.app.name" -> "test",
+          // no special characters, wrap only
+          "spark.driver.extraJavaOptions" ->
+            "-XX+PrintGC -Dparam1=val1 -Dparam2=val2",
+          // special characters, to be escaped
+          "spark.executor.extraJavaOptions" ->
+            """-Dparam1="value 1" -Dparam2=value\ 2 -Dpath=$PATH"""),
+        "s1",
+        new Date()))
+    assert(response.success)
+
+    val offer = Utils.createOffer("o1", "s1", mem, cpu)
+    scheduler.resourceOffers(driver, List(offer).asJava)
+    val tasks = Utils.verifyTaskLaunched(driver, "o1")
+    val driverCmd = tasks.head.getCommand.getValue
+    assert(driverCmd.contains(
+      """--conf spark.driver.extraJavaOptions="-XX+PrintGC -Dparam1=val1 -Dparam2=val2""""))
+    assert(driverCmd.contains(
+      """--conf spark.executor.extraJavaOptions="""
+      + """"-Dparam1=\"value 1\" -Dparam2=value\\ 2 -Dpath=\$PATH""""))
+  }
+
   test("supports spark.mesos.driverEnv.*") {
     setScheduler()
 

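The one-line fix above drops the extra level of quoting that was being added around the already-escaped value. As a minimal sketch of what the changed interpolation produces before and after the fix (DoubleEscapeDemo and this simplified shellEscape are illustrative stand-ins, not the scheduler's exact private helper):

object DoubleEscapeDemo {
  // Simplified stand-in for MesosClusterScheduler.shellEscape: wrap a value
  // containing shell special characters in double quotes and backslash-escape
  // any embedded ", `, $ or \. Harmless values pass through untouched.
  def shellEscape(value: String): String = {
    val special = """ '"<>&|?*;!#\()$`"""
    if (value.exists(c => special.indexOf(c) >= 0)) {
      "\"" + value.replaceAll("""(["`$\\])""", """\\$1""") + "\""
    } else {
      value
    }
  }

  def main(args: Array[String]): Unit = {
    val key = "spark.executor.extraJavaOptions"
    val value = """-Dfoo="first value" -Dbar=another"""

    // Before the fix: the interpolator wrapped the already-quoted result in a
    // second pair of double quotes; bash then pairs the outer quotes with the
    // inner ones and splits the option apart at the embedded spaces.
    val before = s""""$key=${shellEscape(value)}""""
    println(s"--conf $before")
    // --conf "spark.executor.extraJavaOptions="-Dfoo=\"first value\" -Dbar=another""

    // After the fix: shellEscape alone supplies the single level of quoting.
    val after = s"$key=${shellEscape(value)}"
    println(s"--conf $after")
    // --conf spark.executor.extraJavaOptions="-Dfoo=\"first value\" -Dbar=another"
  }
}

Run with the value from the JIRA description below, the "before" line reproduces the broken command quoted in the issue, and the "after" line matches the output after reverting SPARK-18114.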

 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> MesosClusterScheduler double-escapes parameters to bash command
> ---------------------------------------------------------------
>
>                 Key: SPARK-23464
>                 URL: https://issues.apache.org/jira/browse/SPARK-23464
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 2.2.0
> Environment: Spark 2.2.0 with Mesosphere patches (but the problem exists in the main repo)
> DC/OS 1.9.5
>            Reporter: Marcin Kurczych
>            Priority: Major
>
> Parameters passed to the driver launch command in the Mesos container are escaped using the _shellEscape_ function. SPARK-18114 introduced additional wrapping in double quotes. This cancels out the quoting done by _shellEscape_ and makes it impossible to run tasks with whitespace in their parameters, as the fragments are interpreted as additional arguments to the in-container spark-submit.
> This is how a parameter passed to the in-container spark-submit looks now:
> {code:java}
> --conf "spark.executor.extraJavaOptions="-Dfoo=\"first value\" -Dbar=another""
> {code}
> This is how it looks after reverting the SPARK-18114-related commit:
> {code:java}
> --conf spark.executor.extraJavaOptions="-Dfoo=\"first value\" -Dbar=another"
> {code}
> In the current version, submitting a job with such extraJavaOptions causes the following error:
> {code:java}
> Error: Unrecognized option: -Dbar=another
> Usage: spark-submit [options] <app jar | python file> [app arguments]
> Usage: spark-submit --kill [submission ID] --master [spark://...]
> Usage: spark-submit --status [submission ID] --master [spark://...]
> Usage: spark-submit run-example [options] example-class [example args]
> Options:
>   --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
>   --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
>                               on one of the worker machines inside the cluster ("cluster")
>                               (Default: client).
> (... further spark-submit help ...)
> {code}
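> To see why this fails, printf can stand in for spark-submit and print each argument the shell actually delivers, one per line; bash splits the doubly-quoted form like this:
> {code:java}
> $ printf '%s\n' --conf "spark.executor.extraJavaOptions="-Dfoo=\"first value\" -Dbar=another""
> --conf
> spark.executor.extraJavaOptions=-Dfoo="first
> value"
> -Dbar=another
> {code}
> The outer quotes added on top of _shellEscape_'s own quoting terminate early, so -Dbar=another reaches spark-submit as a stray option of its own, producing the error above.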
> Reverting SPARK-18114 is the solution to the issue. I can create a pull request on GitHub. I thought about adding unit tests for this, but the methods generating the driver launch command are private.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org