Posted to commits@airflow.apache.org by "Joel Croteau (Jira)" <ji...@apache.org> on 2019/08/22 01:28:00 UTC

[jira] [Created] (AIRFLOW-5281) GCP transfer operators do not detect previous successful runs if interrupted

Joel Croteau created AIRFLOW-5281:
-------------------------------------

             Summary: GCP transfer operators do not detect previous successful runs if interrupted
                 Key: AIRFLOW-5281
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5281
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib, gcp, operators
    Affects Versions: 1.10.3
            Reporter: Joel Croteau


Operators that rely on the GCS/BigQuery transfer services work by creating a transfer job, periodically polling the service for that job's status, and reporting success or failure once the job completes. This causes problems when a task instance is terminated by the cluster, e.g. for lack of resources: the retry creates a new transfer job, and if the previous job actually succeeded, the new job will usually fail because files already exist at its destination. The task as a whole then fails, even though it transferred everything it needed to.

I have noticed this in particular with `S3ToGoogleCloudStorageTransferOperator` and `GoogleCloudStorageToBigQueryOperator`, but I imagine it exists in other operators as well. The documentation for at least some of these operators includes a big warning that they are not idempotent and that multiple runs will create multiple transfer jobs, but a warning doesn't fix the problem. What the operators should do is set a Variable or XCom when the job is created, and have retries check for an existing job before starting a new one.

I've also seen the same behavior when using `PostgresOperator` to execute an `UNLOAD` query on Redshift. (I didn't use `RedshiftToS3Transfer` because that operator has zero customization options, and I wanted to control which columns I exported.)
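
A rough sketch of the record-then-check idea proposed above, in plain Python with no Airflow imports. All names here (`InMemoryJobStore`, `FakeTransferClient`, `start_or_resume_transfer`) and the status strings are hypothetical: the store stands in for an Airflow Variable or XCom, and the client for whatever transfer service hook an operator uses. It is an illustration of the approach, not a patch against the real operators.

```python
from typing import Dict, Optional


class InMemoryJobStore:
    """Hypothetical stand-in for an Airflow Variable or XCom."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._jobs.get(key)

    def set(self, key: str, job_name: str) -> None:
        self._jobs[key] = job_name


class FakeTransferClient:
    """Hypothetical stand-in for a transfer service client."""

    def __init__(self) -> None:
        self.created = 0
        self.status: Dict[str, str] = {}

    def create_job(self) -> str:
        self.created += 1
        name = f"transferJobs/{self.created}"
        self.status[name] = "IN_PROGRESS"
        return name

    def get_status(self, job_name: str) -> str:
        return self.status[job_name]


def start_or_resume_transfer(
    key: str, store: InMemoryJobStore, client: FakeTransferClient
) -> str:
    """Return the job to poll, creating one only if none is usable.

    `key` identifies the task instance (assumed here to be something
    like "dag_id.task_id.execution_date").
    """
    job_name = store.get(key)
    # Reuse a job left behind by an interrupted attempt unless it failed.
    if job_name is not None and client.get_status(job_name) != "FAILED":
        return job_name
    job_name = client.create_job()
    # Record the job before polling, so a killed attempt leaves a trace.
    store.set(key, job_name)
    return job_name
```

With this shape, a retry after an interruption gets back the job name recorded by the first attempt and goes straight to polling it, so a job that already succeeded is reported as a success instead of being re-submitted.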



--
This message was sent by Atlassian Jira
(v8.3.2#803003)