Posted to commits@airflow.apache.org by "jack (JIRA)" <ji...@apache.org> on 2019/04/04 07:19:00 UTC

[jira] [Updated] (AIRFLOW-4236) Operators uploading to GCS needs to handle googleapiclient.errors

     [ https://issues.apache.org/jira/browse/AIRFLOW-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jack updated AIRFLOW-4236:
--------------------------
         Labels: easy-fix  (was: )
    Component/s: operators
                 gcp
     Issue Type: Bug  (was: New Feature)

> Operators uploading to GCS needs to handle googleapiclient.errors
> -----------------------------------------------------------------
>
>                 Key: AIRFLOW-4236
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4236
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: gcp, operators
>    Affects Versions: 1.10.3
>            Reporter: jack
>            Priority: Major
>              Labels: easy-fix
>
> I got an HTTP 500 error when using the MySqlToGoogleCloudStorageOperator. The error caused the operator to fail, and a 3-hour job went to waste.
> This error should not cause the operator to fail immediately. The operator should first retry the upload, as suggested here:
> [https://stackoverflow.com/questions/23945784/how-to-manage-google-api-errors-in-python]
> There should be at least 2 retry attempts before giving up.
> Log:
> INFO - Subtask: Traceback (most recent call last):
> INFO - Subtask: File "/usr/local/bin/airflow", line 27, in <module>
> INFO - Subtask: args.func(args)
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 392, in run
> INFO - Subtask: pool=args.pool,
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 50, in wrapper
> INFO - Subtask: result = func(*args, **kwargs)
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1493, in _run_raw_task
> INFO - Subtask: result = task_copy.execute(context=context)
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/operators/mysql_to_gcs.py", line 99, in execute
> INFO - Subtask: self._upload_to_gcs(files_to_upload)
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/operators/mysql_to_gcs.py", line 184, in _upload_to_gcs
> INFO - Subtask: hook.upload(self.bucket, object, tmp_file_handle.name, 'application/json')
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/gcs_hook.py", line 131, in upload
> INFO - Subtask: .insert(bucket=bucket, name=object, media_body=media) \
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
> INFO - Subtask: return wrapped(*args, **kwargs)
> INFO - Subtask: File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 842, in execute
> INFO - Subtask: raise HttpError(resp, content, uri=self.uri)
> INFO - Subtask: googleapiclient.errors.HttpError: <HttpError 500 when requesting https://www.googleapis.com/upload/storage/v1/b/.......json returned "Backend Error">
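
For illustration, the retry behaviour requested above could look roughly like the sketch below. It is not the actual Airflow fix. The names upload_with_retries, is_retryable and TransientUploadError are hypothetical; TransientUploadError only stands in for googleapiclient.errors.HttpError so the snippet runs without the Google client installed. The sketch assumes that only HTTP 5xx responses are worth retrying and uses a simple exponential backoff between attempts.

```python
import time


class TransientUploadError(Exception):
    """Hypothetical stand-in for googleapiclient.errors.HttpError."""

    def __init__(self, status):
        super().__init__("HTTP %d" % status)
        self.status = status


def is_retryable(exc):
    """Treat HTTP 5xx responses as transient; anything else is fatal."""
    status = getattr(exc, "status", None)
    if status is None:
        # googleapiclient's HttpError keeps the status on exc.resp.status.
        status = getattr(getattr(exc, "resp", None), "status", None)
    return status is not None and 500 <= int(status) < 600


def upload_with_retries(upload, max_retries=2, base_delay=1.0, sleep=time.sleep):
    """Call upload(); on a retryable error, retry up to max_retries times.

    The delay doubles after every failed attempt (base_delay, 2*base_delay, ...).
    The last error is re-raised once the retries are exhausted.
    """
    attempt = 0
    while True:
        try:
            return upload()
        except Exception as exc:
            if attempt >= max_retries or not is_retryable(exc):
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

In the operator from the traceback, the call could then be wrapped as upload_with_retries(lambda: hook.upload(self.bucket, object, tmp_file_handle.name, 'application/json')). Alternatively, googleapiclient's HttpRequest.execute() accepts a num_retries argument, so passing it inside the hook may be a smaller change; whether that fits the hook's design is left open here.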


