Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/20 02:02:02 UTC

[GitHub] [airflow] alexbegg opened a new issue #19719: Future upstream tasks should be ignored for default "all_success" trigger rule

alexbegg opened a new issue #19719:
URL: https://github.com/apache/airflow/issues/19719


   ### Apache Airflow version
   
   2.1.4
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   When a task has the default "all_success" trigger rule and its immediate upstream task has a future `start_date`, the current task does not run. Its downstream tasks do not run either, and the DAG run fails.
   
   This doesn't seem desirable. There has to be a reason each task has its own `start_date` rather than it being set only at the DAG level, right?
   
   I think the ideal solution would be for the "all_success" trigger rule to ignore **_future_** upstream tasks.
   
   ### What you expected to happen
   
   A task with the "all_success" trigger rule should ignore future upstream tasks.
   
   ### How to reproduce
   
   1. Make a DAG with:
       1. `catchup=True`
       2. `default_args` setting the default `start_date` to "2021-11-01" (18 days ago)
       3. `default_args` setting the default `end_date` to "2021-11-06" (to not have too many runs to test)
   2. Add a first task (in this example named "only_run_in_future") with a `start_date` of "2021-11-04"
   3. Add a downstream task (in this example named "normal_task")
   4. Unpause the DAG
   5. For the first 3 runs, both "only_run_in_future" and "normal_task" will not run
   6. In the "normal_task" Task Instance Details the "Trigger Rule" dependencies reason will be:
   > Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'only_run_in_future'}
   7. Any remaining downstream tasks (if applicable) will also not run
   8. The first 3 DAG runs will be marked as failed, but DAG runs starting "2021-11-04" will be fine
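
The timeline in these steps can be sketched in plain Python. This is a simplified model and not Airflow's scheduler code: it assumes a daily schedule with one logical date per day (both ends inclusive), assumes "only_run_in_future" succeeds whenever it is allowed to run, and uses the hypothetical helper names `daily_runs` and `simulate`.

```python
from datetime import date, timedelta

# Dates taken from the reproduction steps.
DAG_START = date(2021, 11, 1)
DAG_END = date(2021, 11, 6)
FUTURE_TASK_START = date(2021, 11, 4)  # start_date of "only_run_in_future"


def daily_runs(start, end):
    """Yield one logical date per day, both ends inclusive (assumed daily schedule)."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def simulate():
    """Map each run's logical date to whether "normal_task" passes all_success."""
    results = {}
    for logical_date in daily_runs(DAG_START, DAG_END):
        # The upstream task is not run for dates before its own start_date.
        upstream_ran = logical_date >= FUTURE_TASK_START
        total = 1
        successes = 1 if upstream_ran else 0  # assumption: it succeeds when it runs
        # all_success as reported in the issue: every upstream must have succeeded.
        results[logical_date.isoformat()] = successes == total
    return results


if __name__ == "__main__":
    for day, ok in simulate().items():
        print(day, "normal_task runs" if ok else "normal_task blocked")
```

Under these assumptions the model blocks "normal_task" for the runs dated 2021-11-01 through 2021-11-03 and lets it run from 2021-11-04 onward, which is the pattern reported above.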
   
   ### Anything else
   
   Maybe the "all_success" trigger rule message should be changed from this:
   > requires all upstream tasks to have succeeded
   
   to this:
   > requires all scheduled upstream tasks to have succeeded
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] alexbegg edited a comment on issue #19719: Future upstream tasks should be ignored for default "all_success" trigger rule

Posted by GitBox <gi...@apache.org>.
alexbegg edited a comment on issue #19719:
URL: https://github.com/apache/airflow/issues/19719#issuecomment-975858267


   > If a task comes after another, that means there is a dependency, so Airflow will not schedule a task if an upstream task hasn't run yet, no matter the trigger rule of that second task.
   
   What I am trying to say with this issue is that the task shouldn't be a dependency because the run is before the task's start_date. I would go as far as saying there should be a full design change to the UI so that it doesn't even render the task in the graph view, and the tree view doesn't show the task's square.
   
   Shouldn't dependencies be only tasks where the current run's datetime is within the task's start_date and end_date?
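
A minimal sketch of the rule being proposed here, under stated assumptions: it is not Airflow's implementation, and `Task`, `in_window`, and `all_scheduled_success` are hypothetical names used only for illustration. An upstream task counts as a dependency only when the run's logical date falls inside that task's `start_date`/`end_date` window.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for a task with an optional scheduling window."""
    task_id: str
    start_date: Optional[date] = None
    end_date: Optional[date] = None


def in_window(task: Task, logical_date: date) -> bool:
    """True when the run's date lies within the task's start_date/end_date."""
    if task.start_date is not None and logical_date < task.start_date:
        return False
    if task.end_date is not None and logical_date > task.end_date:
        return False
    return True


def all_scheduled_success(
    upstream: List[Task], states: Dict[str, str], logical_date: date
) -> bool:
    """Proposed variant: only upstream tasks scheduled for this run must succeed."""
    relevant = [t for t in upstream if in_window(t, logical_date)]
    # With no relevant upstream tasks, all() is vacuously True and the task may run.
    return all(states.get(t.task_id) == "success" for t in relevant)


if __name__ == "__main__":
    upstream = [Task("only_run_in_future", start_date=date(2021, 11, 4))]
    print(all_scheduled_success(upstream, {}, date(2021, 11, 1)))
    print(all_scheduled_success(upstream, {}, date(2021, 11, 4)))
```

One design consequence worth noting: a task whose upstream tasks are all outside their windows would run with no upstream having executed at all.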



[GitHub] [airflow] uranusjr commented on issue #19719: Future upstream tasks should be ignored for default "all_success" trigger rule

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #19719:
URL: https://github.com/apache/airflow/issues/19719#issuecomment-975140013


   I’d think `NONE_FAILED` or `NONE_FAILED_MIN_ONE_SUCCESS` is a better fit. To me, success means the task executed successfully, and a task in the future did not execute successfully.



[GitHub] [airflow] raphaelauv commented on issue #19719: Future upstream tasks should be ignored for default "all_success" trigger rule

Posted by GitBox <gi...@apache.org>.
raphaelauv commented on issue #19719:
URL: https://github.com/apache/airflow/issues/19719#issuecomment-974794324


   If a task comes after another, that means there is a dependency, so Airflow will not schedule a task if an upstream task hasn't run yet, no matter the trigger rule of that second task.



[GitHub] [airflow] raphaelauv commented on issue #19719: Future upstream tasks should be ignored for default "all_success" trigger rule

Posted by GitBox <gi...@apache.org>.
raphaelauv commented on issue #19719:
URL: https://github.com/apache/airflow/issues/19719#issuecomment-979761347


   A task placed after another does not mean:
   
   "You can't run before that task"
   
   It means:
   
   "You NEED that task to run before you"
   
   Example:
   
   _build_report_file_task_ >> _upload_report_file_task_
   
   So if _build_report_file_task_ has a specific start date, it impacts _upload_report_file_task_.
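
The build-then-upload example can be illustrated with a small plain-Python sketch. It does not use Airflow; the functions `build_report_file`, `upload_report_file`, and `run`, and the in-memory `storage` dict, are hypothetical stand-ins chosen to show why the second step needs the output of the first.

```python
def build_report_file(storage):
    """Stand-in for the build step: produce the report the upload step needs."""
    storage["report.csv"] = "date,total\n2021-11-04,42\n"


def upload_report_file(storage):
    """Stand-in for the upload step: fails if the report was never built."""
    if "report.csv" not in storage:
        raise FileNotFoundError("report.csv was never built")
    return len(storage["report.csv"])  # pretend this many bytes were uploaded


def run(build_is_active):
    """Run one pipeline pass; the build step is skipped before its start date."""
    storage = {}
    if build_is_active:
        build_report_file(storage)
    return upload_report_file(storage)


if __name__ == "__main__":
    print(run(build_is_active=True))
    try:
        run(build_is_active=False)
    except FileNotFoundError as exc:
        print("upload failed:", exc)
```

In this sketch, letting the upload step run while the build step is outside its date window fails, because the data it depends on does not exist for that run.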
   
   

