Posted to commits@airflow.apache.org by "Jarek Potiuk (Jira)" <ji...@apache.org> on 2020/01/19 23:37:06 UTC
[jira] [Closed] (AIRFLOW-596) Networkx based scheduler
[ https://issues.apache.org/jira/browse/AIRFLOW-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jarek Potiuk closed AIRFLOW-596.
--------------------------------
Resolution: Won't Fix
I am closing some old issues that are not relevant any more. Please let me know if you want to reopen it.
> Networkx based scheduler
> ------------------------
>
> Key: AIRFLOW-596
> URL: https://issues.apache.org/jira/browse/AIRFLOW-596
> Project: Apache Airflow
> Issue Type: Improvement
> Components: scheduler
> Reporter: Shreyas Joshi
> Priority: Minor
>
> I'd like to use [networkx|https://networkx.github.io/] and represent each dag/dagbag in memory as a networkx graph.
> Benefits:
> * The scheduling logic would be simplified a fair bit from what it is now.
> * There seem to be gaps between the scheduling of consecutive tasks in a DAG. This is detrimental when each task is short and the scheduling delay dominates the run time. We might be able to reduce these gaps. Also, I see that the scheduler currently sleeps for a second by default. Not sure why this is necessary.
> * Set the stage for smarter scheduling of tasks in the future. (As a simple example, greedily schedule the longest tasks first)
> The networkx graph can be created when the DAG is being scheduled. As the scheduler runs, it updates each task's status (failed, success, etc.) before asking for the next task to run. We leverage networkx to figure out which tasks are eligible for execution.
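
For illustration only, the idea in the description above could be sketched roughly as follows. This is not Airflow's scheduler code and not a patch attached to the issue. The status labels, the helper names build_graph and eligible_tasks, and the example task names are all hypothetical; only the networkx calls (DiGraph, add_node, add_edge, predecessors, is_directed_acyclic_graph) are real library API.

```python
import networkx as nx

# Hypothetical status labels for this sketch; not Airflow's real state names.
SUCCESS, FAILED, PENDING = "success", "failed", "pending"


def build_graph(dependencies):
    """Build a task graph from {task: [upstream tasks]}, all tasks pending."""
    graph = nx.DiGraph()
    for task, upstream in dependencies.items():
        graph.add_node(task)
        for parent in upstream:
            # Edge direction: upstream task -> downstream task.
            graph.add_edge(parent, task)
    if not nx.is_directed_acyclic_graph(graph):
        raise ValueError("task dependencies contain a cycle")
    for node in graph.nodes:
        graph.nodes[node]["status"] = PENDING
    return graph


def eligible_tasks(graph):
    """Return pending tasks whose upstream tasks have all succeeded."""
    return sorted(
        task
        for task in graph.nodes
        if graph.nodes[task]["status"] == PENDING
        and all(
            graph.nodes[parent]["status"] == SUCCESS
            for parent in graph.predecessors(task)
        )
    )


# Hypothetical four-task DAG used only to exercise the helpers.
graph = build_graph(
    {
        "extract": [],
        "transform": ["extract"],
        "report": ["extract"],
        "load": ["transform"],
    }
)
first = eligible_tasks(graph)

# The scheduler would record the outcome, then ask again.
graph.nodes["extract"]["status"] = SUCCESS
second = eligible_tasks(graph)
```

In this sketch only "extract" is eligible at first; once it is marked successful, "report" and "transform" become eligible while "load" keeps waiting on "transform". A task downstream of a failed task never becomes eligible here, which is a simplification of real trigger rules.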
--
This message was sent by Atlassian Jira
(v8.3.4#803005)