Posted to commits@airflow.apache.org by "jack (JIRA)" <ji...@apache.org> on 2019/05/02 08:16:00 UTC

[jira] [Commented] (AIRFLOW-2549) GCP DataProc Workflow Template operators report success when jobs fail

    [ https://issues.apache.org/jira/browse/AIRFLOW-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831473#comment-16831473 ] 

jack commented on AIRFLOW-2549:
-------------------------------

[~kaxilnaik] There have been many releases and updates to Dataproc since this issue was filed. Did they resolve the problem?

> GCP DataProc Workflow Template operators report success when jobs fail
> ----------------------------------------------------------------------
>
>                 Key: AIRFLOW-2549
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2549
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Kevin McHale
>            Assignee: Kevin McHale
>            Priority: Major
>
> cc: [~DanSedov] [~fenglu]
>  
> The Google DataProc workflow template operators use the [_DataProcOperator|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/gcp_dataproc_hook.py#L149] class for analyzing the outcome of the workflow template instance, but that class does not properly detect errors.
>  
> Specifically, when any one of the jobs in the template fails, the operator should report an error, but it always reports success because it does not properly analyze the API responses.
>  
> The outcomes of individual jobs are reported under the {{metadata.graph.nodes}} path in the API response, and this field needs to be checked for errors.  However, the existing {{_DataProcOperator}} class only checks for the existence of the {{done}} and {{error}} fields.
>  
> Below is an example of the API response object for a failed DataProc workflow template operation, to illustrate this.  I pulled this directly from the DataProc API and anonymized it:
> {code:java}
> {
>   "response": {
>     "@type": "type.googleapis.com/google.protobuf.Empty"
>   },
>   "done": true,
>   "name": "projects/my-project/regions/us-central1/operations/dddddddd-dddd-dddd-dddd-dddddddddddd",
>   "metadata": {
>     "createCluster": {
>       "done": true,
>       "operationId": "projects/my-project/regions/us-central1/operations/1111111-0000-aaaa-bbbb-ffffffffffff"
>     },
>     "clusterName": "fake-dataproc-cluster",
>     "graph": {
>       "nodes": [
>         {
>           "state": "FAILED",
>           "jobId": "my-job-abcdefghijklm",
>           "stepId": "my-job",
>           "error": "Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in 'gs://dataproc-00000000-0000-0000-0000-000000000000-us-central1/google-cloud-dataproc-metainfo/cccccccc-cccc-cccc-cccc-cccccccccccc/jobs/my-job-abcdefghijklm/driveroutput'."
>         }
>       ]
>     },
>     "state": "DONE",
>     "deleteCluster": {
>       "done": true,
>       "operationId": "projects/my-project/regions/us-central1/operations/1111111-1111-aaaa-bbbb-ffffffffffff"
>     },
>     "@type": "type.googleapis.com/google.cloud.dataproc.v1beta2.WorkflowMetadata"
>   }
> }
> {code}
>  
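For illustration, the check described in the issue could be sketched roughly as follows. This is not the code in the Airflow hook: the names WorkflowFailedError, failed_nodes and raise_on_failure are hypothetical, and the sketch assumes the operation response has already been parsed into a Python dict shaped like the example above, with a failed job marked by "state": "FAILED".

```python
class WorkflowFailedError(Exception):
    """Hypothetical exception for a workflow with one or more failed jobs."""


def failed_nodes(operation):
    """Return the entries under metadata.graph.nodes whose state is FAILED.

    Missing or empty metadata, graph, or nodes keys yield an empty list.
    """
    metadata = operation.get("metadata") or {}
    graph = metadata.get("graph") or {}
    nodes = graph.get("nodes") or []
    return [node for node in nodes if node.get("state") == "FAILED"]


def raise_on_failure(operation):
    """Raise if the operation has a top-level error or any failed job node."""
    # Top-level error field: the check the issue says already exists.
    if "error" in operation:
        raise WorkflowFailedError(str(operation["error"]))

    # Per-job outcomes: the check the issue says is missing.
    failed = failed_nodes(operation)
    if failed:
        details = "; ".join(
            "{}: {}".format(node.get("stepId"), node.get("error"))
            for node in failed
        )
        raise WorkflowFailedError(details)
```

With the example response above, raise_on_failure would raise, because the "my-job" node is FAILED even though the operation has "done": true and no top-level "error" field.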



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)