Posted to commits@airflow.apache.org by "N. M. (Jira)" <ji...@apache.org> on 2021/02/17 18:55:00 UTC

[jira] [Comment Edited] (AIRFLOW-4453) none_failed trigger rule cascading skipped state to downstream tasks

    [ https://issues.apache.org/jira/browse/AIRFLOW-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286082#comment-17286082 ] 

N. M. edited comment on AIRFLOW-4453 at 2/17/21, 6:54 PM:
----------------------------------------------------------

This has either regressed in 1.10.12 or there are still corner cases where it is not solved.

In a simple three-task DAG, the first step is the [GoogleCloudStoragePrefixSensor|https://airflow.apache.org/docs/apache-airflow/1.10.12/_modules/airflow/contrib/sensors/gcs_sensor.html], followed by a processing task and a heartbeat check operator:
{code:python}
check_for_late_data >> run_statekeeper >> passive_check{code}
The passive_check task is configured with NONE_FAILED:
{code:python}
passive_check = PassiveCheckOperator(task_id="passive_check", dag=dag, trigger_rule=TriggerRule.NONE_FAILED){code}
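Per the 1.10 docs, none_failed means a task runs as long as no upstream task failed; skipped upstreams should not block it. A minimal stand-alone simulation of those documented semantics (my own sketch, not Airflow's scheduler code):

```python
# Sketch of the *documented* trigger-rule semantics (not Airflow's actual
# implementation): decide whether a task should run given upstream states.

def should_run(trigger_rule, upstream_states):
    if trigger_rule == "all_success":
        return all(s == "success" for s in upstream_states)
    if trigger_rule == "none_failed":
        # Documented meaning: run unless an upstream failed or was
        # upstream_failed; skipped upstreams do not block the task.
        return all(s not in ("failed", "upstream_failed")
                   for s in upstream_states)
    raise ValueError("unknown trigger rule: %s" % trigger_rule)

# check_for_late_data skips, so run_statekeeper (default all_success)
# is skipped in turn...
assert not should_run("all_success", ["skipped"])
# ...but passive_check uses none_failed, so per the docs it should run:
assert should_run("none_failed", ["skipped"])
```

By that reading, passive_check should fire even though its upstream chain was skipped, which is exactly what does not happen here.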
The GCS sensor operator exits as follows when it finds no keys:
{code}
[2021-02-12 00:32:18,130] {taskinstance.py:1025} INFO - Marking task as SKIPPED.dag_id=pipeline_v1, task_id=check_for_late_data, execution_date=20210211T003000, start_date=20210212T003017, end_date=
[2021-02-12 00:32:18,130] {taskinstance.py:1070} INFO - Marking task as SUCCESS.dag_id=pipeline_v1, task_id=check_for_late_data, execution_date=20210211T003000, start_date=20210212T003017, end_date=20210212T003218{code}
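For context, that SKIPPED line is presumably the sensor's soft-fail path: a 1.10.x sensor with soft_fail=True raises AirflowSkipException instead of failing when it finds nothing, and the task instance is then marked SKIPPED. A rough, hypothetical sketch of that control flow (names and structure are assumptions, not the actual gcs_sensor code):

```python
# Hypothetical control flow of a soft-failing sensor (an assumption, not
# the real GoogleCloudStoragePrefixSensor implementation).

class AirflowSkipException(Exception):
    """Stand-in for airflow.exceptions.AirflowSkipException."""

def execute_sensor(poke, soft_fail=True):
    # Simplified to a single poke followed by timeout handling.
    if poke():
        return "success"
    if soft_fail:
        # Skipping instead of failing is what produces the
        # "Marking task as SKIPPED" line in the log above.
        raise AirflowSkipException("no matching GCS keys")
    raise RuntimeError("sensor timed out")

try:
    execute_sensor(lambda: False)   # poke never succeeds
    state = "success"
except AirflowSkipException:
    state = "skipped"
assert state == "skipped"
```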
...but the final step never fires:

!image-2021-02-17-12-41-23-809.png!

If we put a shared dummy start task in front, the same thing happens:

!image-2021-02-17-13-50-32-787.png!

This is Airflow 1.10.12, installed from the airflow-7.16.0 Helm chart, using the Celery executor.



> none_failed trigger rule cascading skipped state to downstream tasks
> --------------------------------------------------------------------
>
>                 Key: AIRFLOW-4453
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4453
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, scheduler
>    Affects Versions: 1.10.3, 1.10.4, 1.10.5, 1.10.6, 1.10.7
>            Reporter: Dmytro Kulyk
>            Assignee: Kaxil Naik
>            Priority: Major
>              Labels: skipped
>             Fix For: 1.10.5, 1.10.10
>
>         Attachments: 3_step.png, cube_update.py, image-2019-05-02-18-11-28-307.png, image-2021-02-17-12-41-23-809.png, image-2021-02-17-13-50-32-787.png, simple_skip.png
>
>
> A task with trigger_rule = 'none_failed' cascades *skipped* status to its downstream task when:
>  * the task has multiple upstream tasks
>  * trigger_rule is set to 'none_failed'
>  * some of the upstream tasks can be skipped due to *latest_only*
> Based on the documentation, this shouldn't happen:
>  !image-2019-05-02-18-11-28-307.png|width=655,height=372! 
>  DAG attached



--
This message was sent by Atlassian Jira
(v8.3.4#803005)