Posted to dev@airflow.apache.org by Trent Robbins <ro...@gmail.com> on 2019/03/08 17:43:27 UTC

In Support of AIP-15: Future of Airflow Scheduling

Hi All,

I've been on a team using Airflow to ingest batch data for about 2 years,
and I wanted to throw some support behind the recent AIP-15 by Xiaodong
DENG, and to say that it probably doesn't go far enough in its current
state.

AIP-15 Support Multiple-Schedulers for HA & Better Scheduling Performance
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651

The number one frustration I have experienced, and heard about across a
few companies using Airflow, is that the scheduler is hard to control.  I
don't know whether the teams I have talked to have problems identical to
my teams'.  Scheduler expectations have got to be a top reason engineers do
not adopt Airflow.  Things like scheduling tasks 24h ahead, the inability
to trigger tasks at exact times, and the lack of a way to prioritize
dropped DAGs so they are picked up first all need to be tuned to the needs
of specific organizations.  There are common workarounds that I won't get
into here.

There might even be short-term value in the idea of spinning up an Airflow
Docker container for every task to trigger a manual run, using some other
scheduler.  I have my thumb on at least a few pulses and I believe the next
step folks will take is to try to find a way to get off Airflow to escape
the scheduling woes.
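
As a rough illustration of the container-per-run idea, here is a minimal
sketch of the command an external scheduler (cron, for example) might
issue. The image name and DAG id are hypothetical, and it assumes the image
has the airflow CLI on its PATH and is already configured to reach the
metadata database. It uses the Airflow 1.10 CLI form, "airflow trigger_dag";
on Airflow 2.x the equivalent subcommand is "airflow dags trigger".

```python
import shlex
from datetime import datetime
from typing import List, Optional


def build_trigger_command(
    dag_id: str,
    image: str = "example/airflow:1.10",  # hypothetical image name
    exec_date: Optional[datetime] = None,
) -> List[str]:
    """Build the argv for a one-shot container that triggers one DAG run."""
    cmd = ["docker", "run", "--rm", image, "airflow", "trigger_dag"]
    if exec_date is not None:
        # -e sets the execution date of the triggered run (Airflow 1.10 CLI).
        cmd += ["-e", exec_date.isoformat()]
    cmd.append(dag_id)
    return cmd


if __name__ == "__main__":
    # "ingest_batch" is a made-up DAG id used only for illustration.
    argv = build_trigger_command(
        "ingest_batch", exec_date=datetime(2019, 3, 8, 9, 0)
    )
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(shlex.quote(part) for part in argv))
```

The external scheduler would then own the "when", and Airflow would only
own the "what". Passing the printed argv to subprocess.run is one way to
actually execute it.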

I guess I would say that Airflow's major value has been templatizing
workflows with the DAG constraint and pulling them out of bash.  Now we've
exposed the next issue, which is the high variety of business logic
expectations people bring to a scheduler.

Airflow is pretty far ahead of other tools in the space. I am moving to a
role where I don't use Airflow, but for those who want to grow the tool I
think this is the single biggest blocker to adoption and the best way to
create a feeling of joy/relief (and not dread) when you open up Airflow at
9 AM on Wednesday.

Best,

Trent Robbins
https://www.linkedin.com/in/trentrobbins