Posted to commits@airflow.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2018/09/02 17:58:03 UTC

[jira] [Commented] (AIRFLOW-2769) Increase num_retries polling value on Dataflow hook

    [ https://issues.apache.org/jira/browse/AIRFLOW-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16601312#comment-16601312 ] 

Apache Spark commented on AIRFLOW-2769:
---------------------------------------

User 'pwoods25443' has created a pull request for this issue:
https://github.com/apache/incubator-airflow/pull/3617

> Increase num_retries polling value on Dataflow hook
> ---------------------------------------------------
>
>                 Key: AIRFLOW-2769
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2769
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib, Dataflow
>    Affects Versions: 1.10
>            Reporter: Paul Woods
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> *Problem Description*
> When Airflow launches a job in Dataflow, it polls the GCP API for job status until the job completes or fails. The GCP API occasionally returns 500 and 429 errors on these requests, which causes the Airflow task to fail intermittently, particularly for long-running tasks, while the Dataflow job itself keeps running.
> The recommended action is to retry the request with exponential backoff ([https://developers.google.com/drive/api/v3/handle-errors]). The GCP API client provides this via the `num_retries` parameter on execute(), but that parameter is not used in
> {code:java}
> airflow.contrib.hooks.gcp_dataflow_hook{code}
> *Proposed Solution*
> Add num_retries to the execute() calls in 
> {code:java}
> _DataflowJob._get_job_id_from_name{code}
> and
> {code:java}
> _DataflowJob._get_job{code}
>  
> *NOTE:* The same problem was addressed for Dataproc in AIRFLOW-1718 ([https://issues.apache.org/jira/browse/AIRFLOW-1718]).
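
For illustration, the proposed change amounts to passing `num_retries` through to each polling `execute()` call. The sketch below is not the code from the pull request: `NUM_RETRIES`, `get_job`, and `FlakyRequest` are hypothetical names, the retry count of 5 is an assumed value that the issue does not specify, and the response fields are only sample data. `FlakyRequest` is a stand-in for the client library's request object so that the example runs without GCP credentials. Unlike the real client, it does not sleep between attempts.

```python
# Hypothetical sketch of the proposed fix; not the actual Airflow patch.
NUM_RETRIES = 5  # assumed value; the issue does not state a number


class FlakyRequest:
    """Stand-in for an API request object that fails a few times first."""

    def __init__(self, failures, response):
        self.failures = failures  # transient errors to simulate
        self.response = response  # payload returned once a call succeeds
        self.attempts = 0

    def execute(self, num_retries=0):
        # Mimics the retry contract: one attempt plus up to num_retries more.
        # The real client also backs off exponentially between attempts.
        for _ in range(num_retries + 1):
            self.attempts += 1
            if self.failures > 0:
                self.failures -= 1
                continue
            return self.response
        raise RuntimeError("HTTP 500 after %d attempts" % self.attempts)


def get_job(request, num_retries=NUM_RETRIES):
    """Poll once for job status, letting the client retry transient errors."""
    return request.execute(num_retries=num_retries)


if __name__ == "__main__":
    request = FlakyRequest(failures=2, response={"currentState": "JOB_STATE_DONE"})
    print(get_job(request), "after", request.attempts, "attempts")
```

With `num_retries=0`, which is what omitting the parameter amounts to in this sketch, the first simulated error propagates to the caller and the polling task fails even though the job is still running.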
