Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/20 11:57:19 UTC

[GitHub] [airflow] sou-joshi opened a new issue #17752: Skip Scheduled Runs

sou-joshi opened a new issue #17752:
URL: https://github.com/apache/airflow/issues/17752


   **Description**
   Ability to skip scheduled Dag Runs. 
   
   **Use case / motivation**
   Let's say we have a DAG scheduled to run from the 5th through the 10th of every month. When it runs on the 5th and is successful, I would like to skip all the remaining scheduled runs for that month and only trigger it again next month.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-902925509


   Sounds like a very, very specific request. Since Airflow is developed by the community, I think the only way to get it is for you to implement it. Would you be willing to make a PR for it?





[GitHub] [airflow] boring-cyborg[bot] commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-902639905


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] uranusjr edited a comment on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-904368835


   Another thing—it is very unlikely for Airflow core to implement this behaviour, since backfilling is a thing and those runs may happen concurrently or even out of order, so it’s conceptually not clear how the scheduler can schedule runs based on the previous run’s result in a general way. We can skip a run from happening after it’s created, but can’t stop it from being created, nor dynamically decide when the run should happen. So whatever you choose to do, it would probably need to happen in your own customisation.





[GitHub] [airflow] uranusjr commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-904364486


   Allowing a DAG to control when the next run should happen feels like the wrong abstraction to me. A BranchPythonOperator seems like a good way to go. With AIP-39 implemented, you may also use a custom timetable to query previous DAG run results to determine when the next DAG run should happen. The difference is to not put controlling logic on the *previous* run (which is wrong), but make the *next* run—either when the run happens as in BranchPythonOperator, or when the scheduler decides when the run should happen as in AIP-39 timetable—decide what to do.
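To make the timetable idea concrete, here is a minimal sketch of the date arithmetic only. It is not the AIP-39 timetable interface: `next_run_date`, `succeeded_months`, and the window constants are hypothetical names, and it assumes the set of months that already have a successful run has been obtained from previous DAG run results by some other means.

```python
from datetime import date, timedelta
from typing import Set, Tuple

# Assumed schedule window from the issue: the 5th through the 10th of each month.
WINDOW_START, WINDOW_END = 5, 10


def next_run_date(after: date, succeeded_months: Set[Tuple[int, int]]) -> date:
    """Return the first date after `after` that is inside the window and
    belongs to a month without a successful run.

    `succeeded_months` holds hypothetical (year, month) pairs.
    """
    candidate = after + timedelta(days=1)
    while True:
        month_done = (candidate.year, candidate.month) in succeeded_months
        if month_done or candidate.day > WINDOW_END:
            # Nothing left to run this month: jump to next month's window start.
            if candidate.month == 12:
                candidate = date(candidate.year + 1, 1, WINDOW_START)
            else:
                candidate = date(candidate.year, candidate.month + 1, WINDOW_START)
        elif candidate.day < WINDOW_START:
            # Before the window opens: move forward to its first day.
            candidate = candidate.replace(day=WINDOW_START)
        else:
            return candidate
```

The decision is made when the *next* run is being picked, so the previous run never has to control anything; it only leaves behind a result that can be queried.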





[GitHub] [airflow] ManiBharataraju commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
ManiBharataraju commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-903926318


   @sou-joshi - Can't you add a BranchPythonOperator as the first task and, in its function, query the DagRun table to check whether the DAG has already run, then skip the subsequent tasks based on the result?
   Just wondering what parameter you would pass to skip the next runs to achieve this.
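
A rough sketch of the check such a branch function could perform, written as plain Python so it runs without Airflow. `RunRecord` is a hypothetical stand-in for rows read from the DagRun table, and the task ids `skip_month` and `do_work` are made up for illustration; how the rows are actually queried is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical stand-in for one row of the DagRun table."""
    logical_date: date
    state: str  # for example "success", "failed" or "running"


def choose_branch(today: date, previous_runs: Iterable[RunRecord]) -> str:
    """Return the id of the task to follow, as a branch callable would.

    "skip_month" and "do_work" are hypothetical task ids.
    """
    for run in previous_runs:
        same_month = (run.logical_date.year, run.logical_date.month) == (
            today.year,
            today.month,
        )
        # An earlier successful run in the same month means nothing is left to do.
        if same_month and run.logical_date < today and run.state == "success":
            return "skip_month"
    return "do_work"
```

With this shape the run is still created by the scheduler on every day in the window; it just takes the skip branch once an earlier run in the month has succeeded.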





[GitHub] [airflow] sou-joshi commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
sou-joshi commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-903310086


   @potiuk My current approach for this one-off case is:
   **Assuming**
   - The DAG is scheduled to run from the 5th to the 10th of every month
   - The DAG run dated the 6th is successful
   
   **Solution**
   - If the DAG run is successful, get the remaining scheduled DAG runs for the month
   - Delete those remaining DAG runs for the month from the metadata DB
   
   **Questions**
   - How safe is it to delete rows from the metadata DB directly?
   - Are there any other impacts?





[GitHub] [airflow] uranusjr edited a comment on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-904364486


   Allowing a DAG to control when the next run should happen feels like the wrong abstraction to me. A BranchPythonOperator seems like a good way to go. With AIP-39 implemented, you may also use a custom timetable to query previous DAG run results to determine when the next DAG run should happen. The difference is to not put controlling logic on the *previous* run (which is difficult), but make the *next* run—either when the run happens as in BranchPythonOperator, or when the scheduler decides when the run should happen as in AIP-39 timetable—decide what to do.





[GitHub] [airflow] potiuk closed issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #17752:
URL: https://github.com/apache/airflow/issues/17752


   





[GitHub] [airflow] potiuk commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-904423749


   Very much agree with @uranusjr. Skipping processing based on the success of previous runs is not compatible with the Airflow paradigm, and I cannot see how it can be part of the scheduler. Some kind of 'data-based' behaviour is possible, for example you might skip a run if the data is already prepared, but as @uranusjr mentioned, this should be done as DAG logic, not as a scheduler decision.
   





[GitHub] [airflow] uranusjr commented on issue #17752: Skip Scheduled Runs

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17752:
URL: https://github.com/apache/airflow/issues/17752#issuecomment-904368835


   Another thing: it is very unlikely for Airflow core to implement this behaviour, since backfilling is a thing and those runs may happen concurrently or even out of order, so it’s conceptually not clear how the scheduler can schedule runs based on the previous run’s result in a general way. We can skip a run from happening after it’s created, but can’t stop it from being created, nor dynamically decide when the run should happen. So whatever you choose to do, it would probably need to happen in your own customisation.

