Posted to commits@airflow.apache.org by "Cyril Shcherbin (Jira)" <ji...@apache.org> on 2020/10/07 09:41:00 UTC

[jira] [Commented] (AIRFLOW-7052) spark 3.0.0 does not work with sparksubmitoperator

    [ https://issues.apache.org/jira/browse/AIRFLOW-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209421#comment-17209421 ] 

Cyril Shcherbin commented on AIRFLOW-7052:
------------------------------------------

Fixed in https://github.com/apache/airflow/pull/8730

> spark 3.0.0 does not work with sparksubmitoperator
> --------------------------------------------------
>
>                 Key: AIRFLOW-7052
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-7052
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: operators
>    Affects Versions: 1.10.9
>            Reporter: t oo
>            Priority: Major
>
> from slack:
> If anyone runs into this in the future I've found out where the issue is in the spark_submit_hook.py.
> Line 419
> ```match_exit_code = re.search(r'\s*Exit code: (\d+)', line)```
> In Spark 3.0 the line that prints the exit code actually uses a lower-case "e" ("exit code:"), so this re.search will never find the value. To fix this you can simply change the line to
> ```match_exit_code = re.search(r'\s*Exit code: (\d+)', line, re.IGNORECASE)```
> which should also be backwards compatible.
> MattD
> Having some difficulty understanding why my spark-submit task is being marked as failed even though the Spark job has completed successfully.
> I see these logs at the end of the job:
> exit code: 0
> termination reason: Completed
> But then, right after, it also displays this:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 966, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/spark_submit_operator.py", line 187, in execute
>     self._hook.submit(self._application)
>   File "/usr/local/lib/python3.7/site-packages/airflow/contrib/hooks/spark_submit_hook.py", line 403, in submit
>     self._mask_cmd(spark_submit_cmd), returncode
> airflow.exceptions.AirflowException: Cannot execute: spark-submit (spark submit args would be here) Error code is: 0.
> I took a look at spark_submit_hook.py line 403 and it shows that it shouldn't be throwing that exception if the error code is 0. Anyone have any ideas? I'm only seeing this now that I've switched to Spark 3.0; I never ran into it with Spark 2.4.5. *Also running Airflow 1.10.9 now.
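The case-sensitivity issue described above can be reproduced in isolation. The sketch below is not the hook itself, just a minimal standalone demonstration of the regex behavior; the sample log lines are illustrative stand-ins for the Spark 2.x and Spark 3.0 driver output.

```python
import re

# Illustrative log lines: Spark 2.x capitalizes "Exit code:",
# Spark 3.0 emits it in lower case.
spark2_line = "Exit code: 0"
spark3_line = "exit code: 0"

PATTERN = r'\s*Exit code: (\d+)'

# The original case-sensitive search misses the Spark 3.0 line entirely.
assert re.search(PATTERN, spark3_line) is None

# With re.IGNORECASE the same pattern matches both variants,
# which is why the proposed fix is backwards compatible.
for line in (spark2_line, spark3_line):
    match = re.search(PATTERN, line, re.IGNORECASE)
    assert match is not None
    print(match.group(1))  # captured exit code
```

Because the case-sensitive search returns None on Spark 3.0 output, the hook never records an exit code, which is consistent with the operator failing even though the job itself completed with exit code 0.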



--
This message was sent by Atlassian Jira
(v8.3.4#803005)