Posted to commits@airflow.apache.org by "Jarek Potiuk (Jira)" <ji...@apache.org> on 2020/01/19 23:37:06 UTC

[jira] [Closed] (AIRFLOW-596) Networkx based scheduler

     [ https://issues.apache.org/jira/browse/AIRFLOW-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Potiuk closed AIRFLOW-596.
--------------------------------
    Resolution: Won't Fix

I am closing some old issues that are not relevant any more. Please let me know if you want this one reopened.

> Networkx based scheduler
> ------------------------
>
>                 Key: AIRFLOW-596
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-596
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: scheduler
>            Reporter: Shreyas Joshi
>            Priority: Minor
>
> I'd like to use [networkx|https://networkx.github.io/] and represent each dag/dagbag in memory as a networkx graph.
> Benefits:
> * The scheduling logic would be simplified a fair bit from what it is now.
> * There seem to be gaps between the scheduling of consecutive tasks in a DAG. This is detrimental when each task is short and the scheduling delay dominates. We might be able to reduce these gaps. Also, I see that the scheduler currently sleeps for a second by default; I'm not sure why this is necessary.
> * Set the stage for smarter scheduling of tasks in the future. (As a simple example, greedily schedule the longest tasks first)
> The networkx graph can be created when the dag is being scheduled. As the scheduler runs, it updates each task's status (failed, success, etc.) in the graph before asking for the next task to run. We leverage networkx to figure out which tasks are eligible for execution.
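
The idea described above could be sketched roughly as follows. This is only an illustration, assuming networkx 2.x or later, and not Airflow's scheduler code: the status labels, the function names build_graph and eligible_tasks, and the durations mapping are all hypothetical. A task is treated as eligible when it is still pending and every upstream task has succeeded. The optional durations argument shows the "longest tasks first" ordering mentioned in the benefits.

```python
import networkx as nx

# Hypothetical status labels for this sketch; not Airflow's task states.
SUCCESS, FAILED, PENDING = "success", "failed", "pending"


def build_graph(edges):
    """Build a task graph from (upstream, downstream) pairs, all pending."""
    graph = nx.DiGraph(edges)
    if not nx.is_directed_acyclic_graph(graph):
        raise ValueError("task graph contains a cycle")
    for task in graph.nodes:
        graph.nodes[task]["status"] = PENDING
    return graph


def eligible_tasks(graph, durations=None):
    """Return pending tasks whose upstream tasks have all succeeded.

    durations is an assumed mapping of task -> expected run time; when
    given, longer tasks are listed first (ties broken by task name).
    """
    durations = durations or {}
    ready = [
        task
        for task in graph.nodes
        if graph.nodes[task]["status"] == PENDING
        and all(
            graph.nodes[upstream]["status"] == SUCCESS
            for upstream in graph.predecessors(task)
        )
    ]
    return sorted(ready, key=lambda task: (-durations.get(task, 0), str(task)))


if __name__ == "__main__":
    graph = build_graph([
        ("extract", "transform"),
        ("extract", "validate"),
        ("transform", "load"),
        ("validate", "load"),
    ])
    print(eligible_tasks(graph))  # only the root task is ready at first
    graph.nodes["extract"]["status"] = SUCCESS
    print(eligible_tasks(graph, {"transform": 30, "validate": 5}))
```

In this sketch the scheduler loop would update a node's status after each task finishes and then call eligible_tasks again. A failed upstream task simply keeps its downstream tasks out of the eligible list. Retry and trigger rules are left out.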



--
This message was sent by Atlassian Jira
(v8.3.4#803005)