Posted to issues@spark.apache.org by "Bryan Cutler (JIRA)" <ji...@apache.org> on 2016/09/16 21:40:20 UTC

[jira] [Created] (SPARK-17568) Add spark-submit option for user to override ivy settings used to resolve packages/artifacts

Bryan Cutler created SPARK-17568:
------------------------------------

             Summary: Add spark-submit option for user to override ivy settings used to resolve packages/artifacts
                 Key: SPARK-17568
                 URL: https://issues.apache.org/jira/browse/SPARK-17568
             Project: Spark
          Issue Type: Improvement
          Components: Deploy, Spark Core
            Reporter: Bryan Cutler


The {{--packages}} option to {{spark-submit}} uses Ivy to map Maven coordinates to package jars. Currently, the IvySettings are hard-coded with Maven Central as the last repository in the chain of resolvers. 

At IBM, we have heard from several enterprise clients who are frustrated with the lack of control over their local Spark installations. These clients want to ensure that certain artifacts can be excluded or patched because of security or licensing issues. For example, a package may use a vulnerable SSL protocol, or it may link against an AGPL library written by a litigious competitor.

While additional repositories and exclusions can be added on the {{spark-submit}} command line, this falls short of what is needed. With Maven Central always present as a fallback repository, it is difficult to ensure that only approved artifacts are used, and it is often the exclusions that site admins are not aware of that cause problems. Also, known exclusions are better handled through a centrally managed repository than through command-line arguments.

To resolve these issues, we propose the following change: allow the user to specify an Ivy settings XML file, passed as an optional argument to {{spark-submit}} (or specified in a config file), that defines alternate repositories used to resolve artifacts instead of the hard-coded defaults. The use case would be to define a managed repository (such as Nexus) in the settings file so that all requests for artifacts go through a single location.

Example usage:
{noformat}
$SPARK_HOME/bin/spark-submit --conf spark.ivy.settings=/path/to/ivysettings.xml myapp.jar
{noformat}
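
As an illustration of the kind of settings file intended, a minimal sketch might look like the following. The resolver name {{internal-nexus}} and the repository URL are hypothetical placeholders and not part of this proposal; a real site would substitute its own managed repository. The sketch assumes the repository uses the Maven 2 layout, which is what Ivy's {{ibiblio}} resolver with {{m2compatible="true"}} expects.
{noformat}
<ivysettings>
  <!-- Send all resolution through the single resolver defined below. -->
  <settings defaultResolver="internal-nexus"/>
  <resolvers>
    <!-- Hypothetical managed repository; name and URL are placeholders. -->
    <ibiblio name="internal-nexus"
             m2compatible="true"
             root="https://nexus.example.com/repository/approved/"/>
  </resolvers>
</ivysettings>
{noformat}
Because only one resolver is defined and it is the default, there is no fallback to Maven Central in this sketch, so an artifact that is missing from the managed repository would fail to resolve rather than be fetched from elsewhere.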



