Posted to commits@airflow.apache.org by "Kevin McHale (JIRA)" <ji...@apache.org> on 2018/05/31 17:14:00 UTC

[jira] [Created] (AIRFLOW-2549) GCP DataProc Workflow Template operators report success when jobs fail

Kevin McHale created AIRFLOW-2549:
-------------------------------------

             Summary: GCP DataProc Workflow Template operators report success when jobs fail
                 Key: AIRFLOW-2549
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2549
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Kevin McHale
            Assignee: Kevin McHale


cc: [~DanSedov] [~fenglu]

 

The Google DataProc workflow template operators use the [_DataProcOperator|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/gcp_dataproc_hook.py#L149] class to analyze the outcome of the workflow template instance, but that class does not properly detect errors.

 

Specifically, when any one of the jobs in the template fails, the operator should report an error, but it always reports success because it does not properly analyze the API responses.

 

The outcomes of individual jobs are reported under the {{metadata.graph.nodes}} path in the API response, and this field needs to be checked for errors. However, the existing {{_DataProcOperator}} class only checks for the existence of the top-level {{done}} and {{error}} fields.

 

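To illustrate, below is an example of the API response object for a failed DataProc workflow template operation. I pulled it directly from the DataProc API and anonymized it. Note that the top-level {{done}} is true and there is no top-level {{error}} field, even though the only node in the graph has {{state}} set to FAILED: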
{code:java}
{
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  },
  "done": true,
  "name": "projects/my-project/regions/us-central1/operations/dddddddd-dddd-dddd-dddd-dddddddddddd",
  "metadata": {
    "createCluster": {
      "done": true,
      "operationId": "projects/my-project/regions/us-central1/operations/1111111-0000-aaaa-bbbb-ffffffffffff"
    },
    "clusterName": "fake-dataproc-cluster",
    "graph": {
      "nodes": [
        {
          "state": "FAILED",
          "jobId": "my-job-abcdefghijklm",
          "stepId": "my-job",
          "error": "Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in 'gs://dataproc-00000000-0000-0000-0000-000000000000-us-central1/google-cloud-dataproc-metainfo/cccccccc-cccc-cccc-cccc-cccccccccccc/jobs/my-job-abcdefghijklm/driveroutput'."
        }
      ]
    },
    "state": "DONE",
    "deleteCluster": {
      "done": true,
      "operationId": "projects/my-project/regions/us-central1/operations/1111111-1111-aaaa-bbbb-ffffffffffff"
    },
    "@type": "type.googleapis.com/google.cloud.dataproc.v1beta2.WorkflowMetadata"
  }
}
{code}
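
A check along the lines described above might look like the following. This is only a minimal sketch, not the actual fix: the helper names {{find_failed_nodes}} and {{raise_on_failure}} are hypothetical, and it assumes that {{FAILED}} is the only node state that should count as an error (the only failing state visible in the response above).

```python
def find_failed_nodes(operation):
    """Return the graph nodes whose state is FAILED (hypothetical helper).

    Missing keys are tolerated so that responses without a job graph
    simply yield an empty list.
    """
    metadata = operation.get("metadata", {})
    nodes = metadata.get("graph", {}).get("nodes", [])
    # Assumption: "FAILED" is the only state that indicates a job error.
    return [node for node in nodes if node.get("state") == "FAILED"]


def raise_on_failure(operation):
    """Raise if the operation has a top-level error or any failed job."""
    if "error" in operation:
        raise RuntimeError("Operation failed: {}".format(operation["error"]))

    failed = find_failed_nodes(operation)
    if failed:
        details = "; ".join(
            "{}: {}".format(
                node.get("stepId", "<unknown step>"),
                node.get("error", "no error message"),
            )
            for node in failed
        )
        raise RuntimeError("Workflow template jobs failed: " + details)


# Trimmed-down version of the anonymized response shown above.
response = {
    "done": True,
    "metadata": {
        "graph": {
            "nodes": [
                {
                    "state": "FAILED",
                    "jobId": "my-job-abcdefghijklm",
                    "stepId": "my-job",
                    "error": "Google Cloud Dataproc Agent reports job failure.",
                }
            ]
        },
        "state": "DONE",
    },
}
```

With this response, {{find_failed_nodes(response)}} returns the single {{my-job}} node, and {{raise_on_failure(response)}} raises instead of letting the operator report success.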
 


