Posted to issues@spark.apache.org by "paul mackles (JIRA)" <ji...@apache.org> on 2018/04/15 22:54:00 UTC

[jira] [Created] (SPARK-23988) [Mesos] Improve handling of appResource in mesos dispatcher when using Docker

paul mackles created SPARK-23988:
------------------------------------

             Summary: [Mesos] Improve handling of appResource in mesos dispatcher when using Docker
                 Key: SPARK-23988
                 URL: https://issues.apache.org/jira/browse/SPARK-23988
             Project: Spark
          Issue Type: Improvement
          Components: Mesos
    Affects Versions: 2.3.0, 2.2.1
            Reporter: paul mackles


Our organization makes heavy use of Docker containers when running Spark on Mesos. The images we use for our containers include Spark along with all of the application dependencies. We find this to be a great way to manage our artifacts.

When specifying the primary application jar (i.e. appResource), the mesos dispatcher insists on adding it to the list of URIs for Mesos to fetch as part of launching the driver's container. This leads to confusing behavior where paths such as:
 * file:///application.jar
 * local:/application.jar
 * /application.jar

wind up being fetched from the host where the driver is running. Obviously, this doesn't work since all of the above examples are referencing the path of the jar on the container image itself.

Here is an example that I used for testing:
{code:java}
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://spark-dispatcher \
  --deploy-mode cluster \
  --conf spark.cores.max=4 \
  --conf spark.mesos.executor.docker.image=spark:2.2.1 \
  local:/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar 10
{code}
The "spark:2.2.1" image contains an installation of spark under "/usr/local/spark". Notice how we reference the appResource using the "local:/" scheme.

If you try the above with the current version of the mesos dispatcher, it will try to fetch the path "/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar" from the host filesystem where the driver's container is running. On our systems, this fails since we don't have spark installed on the hosts. 

For the PR, all I did was modify the mesos dispatcher so that it does not add the appResource to the list of URIs for Mesos to fetch when it uses the "local:/" scheme.

For now, I didn't change the behavior of absolute paths or the "file:/" scheme because I wanted to leave some form of the old behavior in place for backwards compatibility. Does anyone have opinions on whether these schemes should change as well?
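To make the decision concrete, here is a small sketch of the rule the PR applies (a hypothetical shell rendering for illustration only, not the actual Scala code in the dispatcher):

```shell
# Hypothetical sketch of the fetch decision: only the "local:" scheme is
# treated as already present on the container image; everything else keeps
# the old fetch-from-host behavior for backwards compatibility.
should_fetch() {
  case "$1" in
    local:*) return 1 ;;   # already on the image; skip the Mesos fetcher
    *)       return 0 ;;   # file:/, http:/, bare paths: fetch as before
  esac
}
```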

The PR also includes support for using "spark-internal" with Mesos in cluster mode which is something we need for another use-case. I can separate them if that makes more sense.
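For the spark-internal part, a submission would look something like the earlier example but with "spark-internal" in place of the jar path (hypothetical invocation; it assumes the application class is already on the driver's classpath inside the image):

```shell
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://spark-dispatcher \
  --deploy-mode cluster \
  --conf spark.cores.max=4 \
  --conf spark.mesos.executor.docker.image=spark:2.2.1 \
  spark-internal 10
```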
