Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/09/24 11:19:02 UTC

[jira] [Commented] (AIRFLOW-1331) Contrib.SparkSubmitOperator should allow --packages parameter

    [ https://issues.apache.org/jira/browse/AIRFLOW-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178159#comment-16178159 ] 

ASF subversion and git services commented on AIRFLOW-1331:
----------------------------------------------------------

Commit fbca8f0ad8a01364fd4ddb3b5b8b7f9e15660060 in incubator-airflow's branch refs/heads/v1-9-test from [~hayashidac]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=fbca8f0 ]

[AIRFLOW-1331] add SparkSubmitOperator option

spark-submit has a --packages option for pulling
in additional Java packages, but the current
version of SparkSubmitOperator could not handle
it. I added a "packages" option to
SparkSubmitOperator to resolve this, and added
the same option to TestSparkSubmitOperator.

Closes #2622 from chie8842/AIRFLOW-1331

(cherry picked from commit e4a984a6b87888753415bdd4308c89622c983917)
Signed-off-by: Bolke de Bruin <bo...@xs4all.nl>
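For illustration, a minimal sketch of what the merged change does on the hook side. The function name and signature below are simplified stand-ins, not the actual SparkSubmitHook API; the real hook builds the command from a connection and many more options. The "--packages" handling mirrors the existing "--jars" handling.

```python
# Simplified sketch of spark-submit command assembly, assuming
# comma-separated strings for jars (file paths) and packages
# (Maven coordinates), as spark-submit itself expects.
def build_spark_submit_command(application, jars=None, packages=None):
    connection_cmd = ["spark-submit"]
    if jars:
        # --jars: comma-separated list of local/remote jar files
        connection_cmd += ["--jars", jars]
    if packages:
        # --packages: comma-separated Maven coordinates, resolved
        # from the spark-packages / Maven repositories at submit time
        connection_cmd += ["--packages", packages]
    connection_cmd += [application]
    return connection_cmd


cmd = build_spark_submit_command(
    "my_app.py",
    packages="org.apache.spark:spark-avro_2.12:3.5.0",
)
```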


> Contrib.SparkSubmitOperator should allow --packages parameter
> -------------------------------------------------------------
>
>                 Key: AIRFLOW-1331
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1331
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib
>            Reporter: manuel garrido
>
> Right now SparkSubmitOperator (and its related hook, SparkSubmitHook) does not support the packages parameter, an option very useful for pulling packages from the spark-packages repository.
> I am by no means an expert, but given how SparkSubmitHook builds the command to submit a Spark job, this could be as easy as adding
> {code:python}
>         if self._packages:
>             connection_cmd += ["--packages", self._packages]
> {code}
> Right under [this line](https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_submit_hook.py#L167), as well as adding the *packages* parameter (defaulting to None) both in the SparkSubmitHook and SparkSubmitOperator init methods (basically, anywhere where the jars parameter is called).
> To be honest, I would not mind doing a pull request to fix this; however, I am not knowledgeable enough about either Airflow or how the contribution guidelines are set up. If the community thinks this could be an easy fix that a newbie like me can do (I believe it is), then please let me know and I will do my best.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)