Posted to users@airflow.apache.org by Reed Villanueva <rv...@ucera.org> on 2019/12/20 16:43:44 UTC

Way to stop airflow dag if enough of certain tasks fail?

Is there a way to stop an Airflow DAG if enough of certain tasks fail? E.g.
a collection of tasks that all do the same thing for different values:

for dataset in list_of_datasets:
    task_1 = BashOperator(task_id="task_1_%s" % dataset["id"], ...)
    task_2 = BashOperator(task_id="task_2_%s" % dataset["id"], ...)
    task_3 = BashOperator(task_id="task_3_%s" % dataset["id"], ...)
    task_1 >> task_2 >> task_3

and if, say, any 5 instances of task_2 fail, it means something bigger
is wrong with the underlying process used by task_2 (as opposed to the
individual dataset being processed in a particular task instance), and
the task is likely not going to succeed for any other instance either,
so the whole DAG should stop or skip to a later /
alternative-branching task.

Is there a way to enforce this by setting something in the task
declarations? Any other common workarounds for this kind of situation
(something like a "some_failed" kind of trigger rule)?
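
One workaround pattern (a sketch, not a built-in feature) is a "checkpoint"
task placed downstream of all the task_2_* instances, running with
trigger_rule="all_done" so it executes even after upstream failures, that
fails the run when too many upstreams failed. The callable below only
sketches the counting logic; the threshold and the way states are gathered
are illustrative assumptions, not confirmed Airflow behavior.

```python
# Sketch of the counting logic for a hypothetical "checkpoint" task.
# Assumption: it would run inside a PythonOperator with
# trigger_rule="all_done"; the Airflow wiring is only hinted at in the
# docstring and is not executed here.

def count_failed(states):
    """Count how many upstream task states are 'failed'."""
    return sum(1 for state in states if state == "failed")


def checkpoint(states, threshold=5):
    """Fail the DAG run if `threshold` or more task_2 instances failed.

    In a real PythonOperator callable the states could be gathered with
    something like:
        tis = context["dag_run"].get_task_instances()
        states = [ti.state for ti in tis
                  if ti.task_id.startswith("task_2_")]
    """
    failed = count_failed(states)
    if failed >= threshold:
        raise RuntimeError(
            "%d task_2 instances failed (threshold %d); aborting DAG"
            % (failed, threshold)
        )
    return failed
```

Downstream tasks that depend on the checkpoint keep the default
all_success trigger rule, so when the checkpoint raises, the rest of the
DAG is never scheduled.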

-- 
This electronic message is intended only for the named recipient, and may
contain information that is confidential or privileged. If you are not the
intended recipient, you are hereby notified that any disclosure, copying,
distribution or use of the contents of this message is strictly prohibited.
If you have received this message in error or are not the named recipient,
please notify us immediately by contacting the sender at the electronic
mail address noted above, and delete and destroy all copies of this
message. Thank you.

Re: Way to stop airflow dag if enough of certain tasks fail?

Posted by Reed Villanueva <rv...@ucera.org>.
IDK. Sounds interesting, but I'm not sure how often I'd want to simply cut
off a running process and kill a DAG (rather than killing it and skipping
to a cleanup step). In any case, it's helpful to know I can stop looking
for a conventional / preexisting Airflow solution.
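
For the "skipping to a cleanup step" variant, one option is a
BranchPythonOperator whose callable picks the next task based on the
failure count. The task ids below ("run_cleanup", "continue_pipeline") are
hypothetical names for illustration, not tasks from the DAG above.

```python
# Sketch of a branch decision for a BranchPythonOperator. The operator
# follows whichever task_id the callable returns and skips the other
# branch; "run_cleanup" and "continue_pipeline" are assumed task_ids.

def choose_branch(failed_count, threshold=5):
    """Route to the cleanup task when too many upstream instances failed."""
    if failed_count >= threshold:
        return "run_cleanup"
    return "continue_pipeline"
```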

On Fri, Dec 20, 2019 at 6:51 AM Jarek Potiuk <Ja...@polidea.com>
wrote:

> Not for now - but we have a proposal from someone in the community to
> implement a "fail-fast" mode for DAGs - kill running tasks if another one
> fails. Would that be something that you'd find useful?
>
> J.
>
> On Fri, Dec 20, 2019 at 5:47 PM Reed Villanueva <rv...@ucera.org>
> wrote:
>
>> Is there a way to stop an Airflow DAG if enough of certain tasks fail?
>> E.g. a collection of tasks that all do the same thing for different values:
>>
>> for dataset in list_of_datasets:
>>     task_1 = BashOperator(task_id="task_1_%s" % dataset["id"], ...)
>>     task_2 = BashOperator(task_id="task_2_%s" % dataset["id"], ...)
>>     task_3 = BashOperator(task_id="task_3_%s" % dataset["id"], ...)
>>     task_1 >> task_2 >> task_3
>>
>> and if, say, any 5 instances of task_2 fail, it means something bigger
>> is wrong with the underlying process used by task_2 (as opposed to the
>> individual dataset being processed in a particular task instance), and
>> the task is likely not going to succeed for any other instance either,
>> so the whole DAG should stop or skip to a later /
>> alternative-branching task.
>>
>> Is there a way to enforce this by setting something in the task
>> declarations? Any other common workarounds for this kind of situation
>> (something like a "some_failed" kind of trigger rule)?
>>
>>
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
>
>


Re: Way to stop airflow dag if enough of certain tasks fail?

Posted by Jarek Potiuk <Ja...@polidea.com>.
Not for now - but we have a proposal from someone in the community to
implement a "fail-fast" mode for DAGs - kill running tasks if another one
fails. Would that be something that you'd find useful?

J.

On Fri, Dec 20, 2019 at 5:47 PM Reed Villanueva <rv...@ucera.org>
wrote:

> Is there a way to stop an Airflow DAG if enough of certain tasks fail? E.g.
> a collection of tasks that all do the same thing for different values:
>
> for dataset in list_of_datasets:
>     task_1 = BashOperator(task_id="task_1_%s" % dataset["id"], ...)
>     task_2 = BashOperator(task_id="task_2_%s" % dataset["id"], ...)
>     task_3 = BashOperator(task_id="task_3_%s" % dataset["id"], ...)
>     task_1 >> task_2 >> task_3
>
> and if, say, any 5 instances of task_2 fail, it means something bigger
> is wrong with the underlying process used by task_2 (as opposed to the
> individual dataset being processed in a particular task instance), and
> the task is likely not going to succeed for any other instance either,
> so the whole DAG should stop or skip to a later /
> alternative-branching task.
>
> Is there a way to enforce this by setting something in the task
> declarations? Any other common workarounds for this kind of situation
> (something like a "some_failed" kind of trigger rule)?
>
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>