Posted to commits@airflow.apache.org by "Jeffrey Payne (JIRA)" <ji...@apache.org> on 2018/09/10 22:41:00 UTC

[jira] [Assigned] (AIRFLOW-3035) gcp_dataproc_hook should treat CANCELLED job state consistently

     [ https://issues.apache.org/jira/browse/AIRFLOW-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey Payne reassigned AIRFLOW-3035:
--------------------------------------

    Assignee: Jeffrey Payne

> gcp_dataproc_hook should treat CANCELLED job state consistently
> ---------------------------------------------------------------
>
>                 Key: AIRFLOW-3035
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3035
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib
>    Affects Versions: 1.10.0, 2.0.0, 1.10.1
>            Reporter: Jeffrey Payne
>            Assignee: Jeffrey Payne
>            Priority: Major
>              Labels: dataproc
>
> When a Dataproc job is cancelled, {{gcp_dataproc_hook.py}} treats the {{CANCELLED}} state in an inconsistent and non-intuitive manner:
> # The API internal to {{gcp_dataproc_hook.py}} returns {{False}} from {{_DataProcJob.wait_for_done()}}, resulting in {{raise_error()}} being called for cancelled jobs, yet {{raise_error()}} only raises {{Exception}} if the job state is {{ERROR}}.
> # The end result from the perspective of the {{dataproc_operator.py}} for a cancelled job is that the job succeeded, which results in the success callback being called.  This seems strange to me, as a "cancelled" job is rarely considered successful, in my experience.
> Simply changing {{raise_error()}} from:
> {code:python}
>         if 'ERROR' == self.job['status']['state']:
> {code}
> to
> {code:python}
>         if self.job['status']['state'] in ('ERROR', 'CANCELLED'):
> {code}
> would fix both of these...
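
For illustration, the proposed check can be sketched as a standalone function. This is a minimal sketch, not the hook's actual code: the module-level raise_error(job, message) signature, the FAILED_STATES name, and the error message wording are assumptions made for this example. Only the job['status']['state'] lookup and the ('ERROR', 'CANCELLED') tuple come from the description above.

```python
# Hypothetical stand-in for _DataProcJob.raise_error(), written as a plain
# function so it can run without Airflow or a Dataproc client.

# States treated as failures under the proposed fix (from the issue text).
FAILED_STATES = ('ERROR', 'CANCELLED')


def raise_error(job, message=None):
    """Raise if the job dict reports a state that should fail the task."""
    state = job['status']['state']
    if state in FAILED_STATES:
        if message is None:
            # Assumed wording; the real hook may phrase this differently.
            message = 'Dataproc job did not succeed'
        # 'details' is read defensively because it may be absent.
        details = job['status'].get('details', '')
        raise Exception('{} (state {}): {}'.format(message, state, details))


if __name__ == '__main__':
    # A finished job passes through silently.
    raise_error({'status': {'state': 'DONE'}})

    # A cancelled job now raises, so the operator would not report success.
    try:
        raise_error({'status': {'state': 'CANCELLED'}})
    except Exception as exc:
        print('raised:', exc)
```

With a check of this shape, a cancelled job raises in the same way as a job in the ERROR state, so the operator's failure path runs instead of the success callback.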



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)