Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/10/02 15:10:02 UTC

[jira] [Commented] (AIRFLOW-1658) Kill (possibly) still running Druid indexing job after max timeout is exceeded

    [ https://issues.apache.org/jira/browse/AIRFLOW-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188286#comment-16188286 ] 

ASF subversion and git services commented on AIRFLOW-1658:
----------------------------------------------------------

Commit cbf7add7aa2e61d1bfe511d6a8250b63485068bb in incubator-airflow's branch refs/heads/v1-9-test from [~danielvdende]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=cbf7add ]

[AIRFLOW-1658] Kill Druid task on timeout

If the total execution time of a Druid task exceeds the defined max
timeout, the Airflow task fails, but the Druid task may keep running.
This can cause undesired behaviour if Airflow retries the task. This
patch calls the shutdown endpoint on the Druid task to kill any still
running Druid task.

This commit also adds tests to ensure that all mocked requests in the
Druid hook are actually called.

Closes #2644 from
danielvdende/kill_druid_task_on_timeout_exceeded

(cherry picked from commit c61726288dcdb093c55a38faaf60aef020d0d3e0)
Signed-off-by: Bolke de Bruin <bo...@xs4all.nl>


> Kill (possibly) still running Druid indexing job after max timeout is exceeded
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-1658
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1658
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>            Reporter: Daniel van der Ende
>            Assignee: Daniel van der Ende
>            Priority: Minor
>             Fix For: 1.9.0
>
>
> Right now, the Druid hook contains a parameter max_ingestion_time. If the total execution time of the Druid indexing job exceeds this timeout, an AirflowException is raised. However, this does not necessarily mean that the Druid task itself failed (a busy Hadoop cluster, for example, could also be to blame for slow performance). If the Airflow task is then retried, you end up with multiple Druid tasks performing the same work.
> To prevent this, we can call the shutdown endpoint for the task id that is still running.
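
The poll-then-shutdown behaviour described above can be sketched as below. This is an illustrative outline, not the actual DruidHook implementation: `get_status` and `shutdown` are hypothetical stand-ins for HTTP calls against Druid's overlord API (roughly GET /druid/indexer/v1/task/{id}/status and POST /druid/indexer/v1/task/{id}/shutdown).

```python
import time


def run_with_timeout(get_status, shutdown, max_ingestion_time, poll_interval=1):
    """Poll a Druid task's status until it completes.

    If the task is still running once max_ingestion_time (seconds) has
    elapsed, request a shutdown of the Druid task and raise, so that an
    Airflow retry does not leave a duplicate indexing job running.
    """
    started = time.time()
    while True:
        status = get_status()  # stand-in for the overlord status call
        if status in ("SUCCESS", "FAILED"):
            return status
        if time.time() - started > max_ingestion_time:
            # Best effort: kill the (possibly) still running Druid task
            # before failing the Airflow task.
            shutdown()
            raise RuntimeError(
                "Druid task exceeded max_ingestion_time; shutdown requested"
            )
        time.sleep(poll_interval)
```

In the real hook the two callables would issue HTTP requests to the Druid overlord using the task id returned when the indexing job was submitted; here they are injected so the timeout logic stands alone.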



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)