Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/16 01:38:52 UTC

[GitHub] [airflow] SteNicholas opened a new issue #9321: Add Signal Based Scheduling To Airflow

SteNicholas opened a new issue #9321:
URL: https://github.com/apache/airflow/issues/9321


   **Description**
   
   The idea of signal-based scheduling is to let operators send signals to the scheduler to trigger scheduling actions such as starting, stopping, and restarting jobs. Compared with the current state-change retrieval mechanism, signal-based scheduling also lets the scheduler learn of a change in dependency state immediately, without periodically querying the metadata database. In addition, signal-based scheduling opens the door to richer scheduling semantics, such as periodic execution and manual triggering at per-operator granularity.
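
To make the idea concrete, here is a minimal sketch of a scheduler that reacts to signals pushed by operators instead of polling a database. It is only an illustration under stated assumptions: `Action`, `Signal`, and `SignalScheduler` are hypothetical names invented for this example, not part of Airflow or of the AIP, and an in-process queue stands in for whatever transport a real design would use.

```python
import queue
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Scheduling actions named in the description (hypothetical type)."""
    START = "start"
    STOP = "stop"
    RESTART = "restart"


@dataclass(frozen=True)
class Signal:
    """A message an operator sends to the scheduler (hypothetical type)."""
    job_id: str
    action: Action


class SignalScheduler:
    """Toy scheduler driven by signals rather than by periodic DB queries."""

    def __init__(self):
        self.signals = queue.Queue()  # stand-in for a real signal transport
        self.running = set()          # ids of jobs currently considered running
        self.history = []             # (job_id, action) pairs, in handling order

    def send(self, signal):
        # Called by an operator; the scheduler sees the signal without polling.
        self.signals.put(signal)

    def handle(self, signal):
        if signal.action is Action.START:
            self.running.add(signal.job_id)
        elif signal.action is Action.STOP:
            self.running.discard(signal.job_id)
        elif signal.action is Action.RESTART:
            # Restart is modelled as stop followed by start.
            self.running.discard(signal.job_id)
            self.running.add(signal.job_id)
        self.history.append((signal.job_id, signal.action.value))

    def drain(self):
        # Handle every pending signal. A long-running scheduler would instead
        # block on self.signals.get() and wake up as soon as a signal arrives.
        while True:
            try:
                signal = self.signals.get_nowait()
            except queue.Empty:
                return
            self.handle(signal)


scheduler = SignalScheduler()
scheduler.send(Signal("stream_job", Action.START))
scheduler.send(Signal("batch_job", Action.START))
scheduler.send(Signal("stream_job", Action.STOP))
scheduler.drain()
print(sorted(scheduler.running))  # prints ['batch_job']
```

In this sketch a streaming job never has to finish for the scheduler to make progress: the job can be stopped or restarted by a signal at any time, and downstream work can be started by a signal rather than by a state change observed on the next poll.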
   
   **Use case / motivation**
   
   The Airflow scheduler uses DAG definitions to monitor the state of tasks in the metadata database, and triggers the task instances whose dependencies have been met. In other words, scheduling is driven by the state of dependencies.
   However, the current design has the following caveats:
   - When the workflow contains streaming jobs, the scheduler can’t handle them, because a streaming job runs forever and never reaches a finished state.
   - Communication between the operator and the scheduler incurs a latency of up to the database query interval.
   To address these issues, we propose adding signal-based scheduling to the scheduler.
   
   **Related Issues**
   
   N/A


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] zhongjiajie commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-644614865


   BTW, I think it would be good for Airflow to support streaming jobs.





[GitHub] [airflow] eladkal commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-1046103101


   I'm closing this issue as it already has a [draft AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-35+Add+Signal+Based+Scheduling+To+Airflow).
   There is a discussion thread in the mailing list about it.





[GitHub] [airflow] Gabriel39 commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-888191647


   Is anyone working on this? It seems like a very useful feature for Airflow users.





[GitHub] [airflow] ruoyu90 commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
ruoyu90 commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-646884033


   (Thanks @casassg for pointing us here)
   
   Hello from [TFX](https://github.com/tensorflow/tfx/) community!
   
   We are excited to see this proposal for Airflow, as we are also [proposing](https://github.com/tensorflow/community/pull/253) some pipeline patterns that seem to align well with the vision here.
   
   Some context: TFX is designed to be a portable ML production platform, and we already have a [demo implementation](https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_simple.py) on top of the Airflow stack for traditional workflow-based pipelines. Our new [proposal](https://github.com/tensorflow/community/pull/253) tries to define more advanced semantics for asynchronous execution pipelines, where every node in a pipeline is loosely connected through data signals. The existing Airflow implementation is not good at dealing with such a pattern, but we are excited to see the proposal in this thread, which makes it feasible to realize such a pattern on Airflow. How about we get in touch and see whether we can have another success story for TFX on Airflow?
   
   /cc @zhitaoli
   /cc @rcrowe-google
   /cc @theadactyl





[GitHub] [airflow] zhongjiajie commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
zhongjiajie commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-644614569


   We have already discussed this in Slack (thread link: https://apache-airflow.slack.com/archives/CCPRP7943/p1591868844043900), and @SteNicholas will create a new AIP with a more detailed explanation.





[GitHub] [airflow] eladkal closed issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #9321:
URL: https://github.com/apache/airflow/issues/9321


   





[GitHub] [airflow] subkanthi commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
subkanthi commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-795982649


   Is someone working on this? It seems to be a requirement for the Flink operator. I would love to contribute to the AIP.
   @potiuk , @mik-laj 





[GitHub] [airflow] SteNicholas commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-647414670


   @ruoyu90 Good point. Btw, could you please take the discussion to the dev mailing list? The thread associated with this issue is [[AIP-35] Add Signal Based Scheduling To Airflow](https://lists.apache.org/thread.html/re1a7e5cfcb1e9f4a0bfac41998da2d88ffb26d4f597036c772a4c86e%40%3Cdev.airflow.apache.org%3E). Thanks.





[GitHub] [airflow] SteNicholas commented on issue #9321: Add Signal Based Scheduling To Airflow

Posted by GitBox <gi...@apache.org>.
SteNicholas commented on issue #9321:
URL: https://github.com/apache/airflow/issues/9321#issuecomment-888279359


   @Gabriel39, we are working on contributing signal-based scheduling to the community.

