Posted to commits@airflow.apache.org by "Ry Walker (JIRA)" <ji...@apache.org> on 2019/04/16 17:41:00 UTC
[jira] [Updated] (AIRFLOW-81) Scheduler blackout time period
[ https://issues.apache.org/jira/browse/AIRFLOW-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ry Walker updated AIRFLOW-81:
-----------------------------
External issue URL: (was: https://github.com/apache/incubator-airflow/issues/1191#issuecomment-217719094)
> Scheduler blackout time period
> ------------------------------
>
> Key: AIRFLOW-81
> URL: https://issues.apache.org/jira/browse/AIRFLOW-81
> Project: Apache Airflow
> Issue Type: Wish
> Components: scheduler
> Reporter: Sean McIntyre
> Priority: Minor
> Labels: features
>
> I need a scheduler blackout time period in Airflow.
> My team, which uses Airflow, has been asked not to query one of my company's data sources between midnight and 7 AM. When we launch big backfills against this data source, it would be nice if the Scheduler did not schedule some TaskInstances during the blackout hours.
> We (@r39132 and @ledsusop) brainstormed a few ideas on Gitter for how to do this:
> (1) Put more state/logic in the TaskInstance and Scheduler like this:
> my_task = PythonOperator(
>     task_id='my_task',
>     python_callable=my_command_that_access_the_datasource,
>     provide_context=True,
>     dag=dag,
>     blackout=my_blackout_logic_for_the_datasource  # <---
> )
> where my_blackout_logic_for_the_datasource is a function I provide that the scheduler calls to determine whether the blackout period is currently in effect.
> (2) Pause DAGs on a nightly basis. This can be done with the `pause_dag` CLI command scheduled by cron / Jenkins. However, could this be considered a core feature to bring into the Airflow UI and scheduling system?
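For idea (1), the blackout callable could be as small as a time-window predicate. The following is only a sketch under stated assumptions: the `blackout` operator argument is the proposal from this issue and is not an existing Airflow parameter, `make_blackout` is a hypothetical helper, and the assumed contract is that the scheduler would pass the current time and skip scheduling while the callable returns True.

```python
from datetime import datetime, time


def make_blackout(start, end):
    """Build a predicate that is True while a timestamp falls in [start, end).

    Hypothetical helper: nothing in Airflow calls this today.
    """
    def in_blackout(now):
        current = now.time()
        if start <= end:
            # Window contained within a single day, e.g. 00:00 to 07:00.
            return start <= current < end
        # Window wraps past midnight, e.g. 22:00 to 06:00.
        return current >= start or current < end

    return in_blackout


# The midnight to 7 AM window described in the issue.
my_blackout_logic_for_the_datasource = make_blackout(time(0, 0), time(7, 0))
```

Keeping the predicate a pure function of a timestamp means it can be tested without a running scheduler, and the same helper covers windows that cross midnight.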
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)