Posted to commits@airflow.apache.org by "Bolke de Bruin (JIRA)" <ji...@apache.org> on 2017/10/02 15:10:02 UTC

[jira] [Resolved] (AIRFLOW-1658) Kill (possibly) still running Druid indexing job after max timeout is exceeded

     [ https://issues.apache.org/jira/browse/AIRFLOW-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bolke de Bruin resolved AIRFLOW-1658.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.9.0

Issue resolved by pull request #2644
[https://github.com/apache/incubator-airflow/pull/2644]

> Kill (possibly) still running Druid indexing job after max timeout is exceeded
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-1658
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1658
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>            Reporter: Daniel van der Ende
>            Assignee: Daniel van der Ende
>            Priority: Minor
>             Fix For: 1.9.0
>
>
> Right now, the Druid hook contains a parameter max_ingestion_time. If the total execution time of the Druid indexing job exceeds this timeout, an AirflowException is thrown. However, this does not necessarily mean that the Druid task failed (for example, a busy Hadoop cluster could also be to blame for slow performance). If the Airflow task is then retried, you end up with multiple Druid tasks performing the same work.
> To prevent this, we can call the shutdown endpoint for the task id that is still running.
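
The mechanism described above can be sketched roughly as follows. This is not the actual hook code from PR #2644, just an illustrative outline: the function names, the polling loop, and the parameter defaults are assumptions, while the overlord status/shutdown URL paths follow Druid's indexer task API.

```python
# Illustrative sketch only (not the Airflow hook itself): poll a Druid
# indexing task, and if max_ingestion_time is exceeded, ask the overlord
# to shut the task down before raising, so an Airflow retry does not
# leave a duplicate Druid task running.
import json
import time
import urllib.request


def status_url(overlord_url, task_id):
    # Druid overlord endpoint reporting a task's current status.
    return f"{overlord_url}/druid/indexer/v1/task/{task_id}/status"


def shutdown_url(overlord_url, task_id):
    # Druid overlord endpoint that kills a running task (POST).
    return f"{overlord_url}/druid/indexer/v1/task/{task_id}/shutdown"


def wait_for_druid_task(overlord_url, task_id, max_ingestion_time,
                        poll_interval=30):
    """Block until the task finishes, or shut it down on timeout."""
    started = time.time()
    while True:
        with urllib.request.urlopen(status_url(overlord_url, task_id)) as resp:
            status = json.load(resp)["status"]["status"]
        if status == "SUCCESS":
            return
        if status == "FAILED":
            raise RuntimeError(f"Druid task {task_id} failed")
        if time.time() - started > max_ingestion_time:
            # Kill the (possibly) still-running task first, then surface
            # the timeout to Airflow as a failure.
            req = urllib.request.Request(
                shutdown_url(overlord_url, task_id), method="POST")
            urllib.request.urlopen(req)
            raise RuntimeError(
                f"Druid task {task_id} exceeded max_ingestion_time "
                f"({max_ingestion_time}s) and was shut down")
        time.sleep(poll_interval)
```

The key point matching the issue: the shutdown call happens before the exception is raised, so a subsequent Airflow retry starts from a clean slate instead of racing a still-running indexing job.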



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)