Posted to commits@airflow.apache.org by "Alexander Kazarin (Jira)" <ji...@apache.org> on 2019/09/18 15:37:00 UTC

[jira] [Comment Edited] (AIRFLOW-5517) SparkSubmitOperator: spark-binary parameter no longer taken from connection extra

    [ https://issues.apache.org/jira/browse/AIRFLOW-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932548#comment-16932548 ] 

Alexander Kazarin edited comment on AIRFLOW-5517 at 9/18/19 3:36 PM:
---------------------------------------------------------------------

Fix:
{code}
diff --git a/airflow/contrib/hooks/spark_submit_hook.py b/airflow/contrib/hooks/spark_submit_hook.py
index 449f072..bab5a71 100644
--- a/airflow/contrib/hooks/spark_submit_hook.py
+++ b/airflow/contrib/hooks/spark_submit_hook.py
@@ -174,7 +174,7 @@ class SparkSubmitHook(BaseHook, LoggingMixin):
                      'queue': None,
                      'deploy_mode': None,
                      'spark_home': None,
-                     'spark_binary': self._spark_binary or "spark-submit",
+                     'spark_binary': self._spark_binary,
                      'namespace': None}

         try:
diff --git a/airflow/contrib/operators/spark_submit_operator.py b/airflow/contrib/operators/spark_submit_operator.py
index 8325e1f..4c57e34 100644
--- a/airflow/contrib/operators/spark_submit_operator.py
+++ b/airflow/contrib/operators/spark_submit_operator.py
@@ -117,7 +117,7 @@ class SparkSubmitOperator(BaseOperator):
                  application_args=None,
                  env_vars=None,
                  verbose=False,
-                 spark_binary="spark-submit",
+                 spark_binary=None,
                  *args,
                  **kwargs):
         super().__init__(*args, **kwargs)
{code}


was (Author: boiler):
Fix:
{code}
diff --git a/airflow/contrib/operators/spark_submit_operator.py b/airflow/contrib/operators/spark_submit_operator.py
index 8325e1f..4c57e34 100644
--- a/airflow/contrib/operators/spark_submit_operator.py
+++ b/airflow/contrib/operators/spark_submit_operator.py
@@ -117,7 +117,7 @@ class SparkSubmitOperator(BaseOperator):
                  application_args=None,
                  env_vars=None,
                  verbose=False,
-                 spark_binary="spark-submit",
+                 spark_binary=None,
                  *args,
                  **kwargs):
         super().__init__(*args, **kwargs)
{code}

> SparkSubmitOperator: spark-binary parameter no longer taken from connection extra
> ---------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5517
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5517
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib
>    Affects Versions: 1.10.4, 1.10.5
>            Reporter: Alexander Kazarin
>            Priority: Major
>
> We have extra parameters in the Spark connection:
> {code:java}
> {"deploy-mode": "cluster", "spark-binary": "spark2-submit"}
> {code}
> After upgrading from 1.10.3 to 1.10.5, the 'spark-binary' parameter in the connection extra no longer takes effect.
> I think it broke after [this|https://github.com/apache/airflow/commit/8be59fb4edf0f2a132b13d0ffd1df0b8908191ab] commit.
> Workaround: call SparkSubmitOperator with the spark_binary=None argument.
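
The precedence problem described above can be illustrated with a minimal, self-contained sketch. It is not Airflow's actual code: `resolve_spark_binary` and `DEFAULT_BINARY` are hypothetical names, and the sketch assumes only what the report and diff suggest, namely that an explicit operator argument is preferred over the connection extra, with "spark-submit" as the last-resort fallback.

```python
import json

# Hypothetical fallback used when neither the argument nor the extra sets a binary.
DEFAULT_BINARY = "spark-submit"


def resolve_spark_binary(operator_arg, connection_extra_json):
    """Simplified model: explicit argument first, then connection extra, then default."""
    extra = json.loads(connection_extra_json or "{}")
    # Any truthy argument wins, so a non-None default argument hides the extra.
    return operator_arg or extra.get("spark-binary", DEFAULT_BINARY)


# Connection extra taken from the issue description.
extra = '{"deploy-mode": "cluster", "spark-binary": "spark2-submit"}'

# Operator default of "spark-submit" (as on the removed diff line): the extra is ignored.
before = resolve_spark_binary("spark-submit", extra)

# Default of None (the proposed fix, or the spark_binary=None workaround): the extra is used.
after = resolve_spark_binary(None, extra)

# Neither an argument nor an extra entry: fall back to the default binary.
fallback = resolve_spark_binary(None, "{}")

print(before, after, fallback)
```

Under these assumptions, a string default on the operator is indistinguishable from a value the user passed deliberately, which is why switching the default to None lets the connection extra take effect again.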



--
This message was sent by Atlassian Jira
(v8.3.4#803005)