Posted to commits@airflow.apache.org by "Milan van der Meer (JIRA)" <ji...@apache.org> on 2017/11/28 10:12:00 UTC
[jira] [Work started] (AIRFLOW-1854) Improve Spark submit hook for cluster mode
[ https://issues.apache.org/jira/browse/AIRFLOW-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on AIRFLOW-1854 started by Milan van der Meer.
---------------------------------------------------
> Improve Spark submit hook for cluster mode
> ------------------------------------------
>
> Key: AIRFLOW-1854
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1854
> Project: Apache Airflow
> Issue Type: Improvement
> Components: hooks
> Reporter: Milan van der Meer
> Assignee: Milan van der Meer
> Priority: Minor
> Labels: features
>
> *We are already working on this issue and will open a PR soon.*
> When a job is submitted to a standalone cluster through the Spark submit hook, the hook receives the return code of the Spark submit action, not of the Spark job itself.
> As a result, once the cluster has successfully received the submission, the Airflow job is marked successful, even if the Spark job later fails on the cluster.
> Suggested solution:
> * When you execute a Spark submit, the response will contain a driver ID.
> * Use this driver ID to poll the cluster for the driver state.
> * Based on the driver's state, the job is marked successful or failed.
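
The three steps of the suggested solution could be sketched roughly as follows. This is only an illustration, not the code of the planned PR: the helper names (`extract_driver_id`, `wait_for_driver`), the `fetch_state` callback, the assumed driver ID shape (`driver-<digits>-<digits>`), and the grouping of driver states into success and failure are all assumptions made for the example. How the state is actually fetched from the cluster is deliberately left to the caller.

```python
import re
import time
from typing import Callable, Optional

# Assumption: driver IDs look like "driver-20171128101200-0001".
DRIVER_ID_PATTERN = re.compile(r"driver-\d+-\d+")

# Assumption: these standalone driver states are treated as terminal.
SUCCESS_STATES = {"FINISHED"}
FAILURE_STATES = {"FAILED", "KILLED", "ERROR"}


def extract_driver_id(submit_output: str) -> Optional[str]:
    """Return the first driver ID found in the submit output, or None."""
    match = DRIVER_ID_PATTERN.search(submit_output)
    return match.group(0) if match else None


def wait_for_driver(
    driver_id: str,
    fetch_state: Callable[[str], str],
    poll_interval: float = 1.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the driver state until it is terminal.

    `fetch_state` is a hypothetical callback that asks the cluster for the
    current state of `driver_id` and returns it as a string.
    Returns True if the driver finished successfully, False if it failed.
    Raises TimeoutError if `max_polls` is reached first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        state = fetch_state(driver_id)
        if state in SUCCESS_STATES:
            return True
        if state in FAILURE_STATES:
            return False
        # Any other state (e.g. SUBMITTED, RUNNING) means: keep waiting.
        polls += 1
        sleep(poll_interval)
    raise TimeoutError(f"Driver {driver_id} did not reach a terminal state")
```

With something like this in place, the hook would report failure whenever `wait_for_driver` returns False, instead of relying only on the return code of the submit action.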
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)