Posted to dev@airflow.apache.org by ra...@gmail.com on 2018/03/28 14:03:05 UTC

Airflow Scalability with Local Executor

Hi All,
We have a use case to support 1000 concurrent DAGs. Each of these DAGs would have a couple of HTTP tasks that submit jobs to external services. Each DAG could run for a couple of hours.
The HTTP tasks periodically check the job status (with sleep 20).
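
For readers following the thread, the polling pattern described above can be sketched roughly as below. This is only an illustration, not the poster's code: the function name `poll_until_done`, the status strings, and the two-hour timeout are hypothetical, and "sleep 20" is assumed to mean 20 seconds. The point it shows is that the task holds its worker process for the whole lifetime of the external job, even though it spends nearly all of that time sleeping.

```python
import time


def poll_until_done(get_status, interval_s=20, timeout_s=2 * 60 * 60,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() every interval_s seconds until a terminal state.

    get_status is any zero-argument callable returning a status string.
    The terminal states below are hypothetical placeholders.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ("SUCCEEDED", "FAILED"):
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not finish within %s seconds" % timeout_s)
        # The process stays alive (and keeps its memory) while sleeping.
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated external job that finishes on the third check.
    fake_statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
    print(poll_until_done(lambda: next(fake_statuses), interval_s=0))
```

Passing `sleep` and `clock` as parameters is just a convenience so the loop can be exercised without waiting in real time.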
We tried running 1000 such DAGs (parallelism set to 1000) with Airflow's LocalExecutor, but after 100 concurrent runs, tasks started failing due to:
--> OOM errors
--> The scheduler marking them as failed because of a missing heartbeat.
We are using 4 cores and 16 GB RAM. Each Airflow worker takes ~250 MB of virtual memory and ~60 MB of resident (RES) memory, which seems to be on the higher side. CPU utilisation is also ~98%.
Is there anything that can be done to optimise memory/CPU usage for the Airflow worker?
Any pointers to Airflow benchmarking with the LocalExecutor would also be helpful.
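
A back-of-the-envelope check of the figures quoted above may help frame the problem. The sketch below only does arithmetic on the numbers from the post (16 GB RAM, ~60 MB RES per worker, 1000 DAGs). The function names and the `processes_per_task` and `reserved_mb` parameters are hypothetical; how many processes a single running task actually holds was not measured in this thread and is left as an adjustable assumption. The estimate also ignores the scheduler, the OS, and shared memory pages, so the practical ceiling is likely different.

```python
def max_concurrent_tasks(total_ram_mb, res_per_process_mb,
                         processes_per_task=1, reserved_mb=0):
    """Upper bound on tasks that fit in RAM, counting resident memory only."""
    usable_mb = total_ram_mb - reserved_mb
    return usable_mb // (res_per_process_mb * processes_per_task)


def required_ram_mb(tasks, res_per_process_mb, processes_per_task=1):
    """Resident memory needed to hold `tasks` running tasks at once."""
    return tasks * res_per_process_mb * processes_per_task


if __name__ == "__main__":
    total_mb = 16 * 1024  # 16 GB, as stated in the post

    # One ~60 MB process per task: at most 273 tasks fit.
    print(max_concurrent_tasks(total_mb, 60))  # 273

    # 1000 such tasks would need about 60000 MB of resident memory.
    print(required_ram_mb(1000, 60))  # 60000

    # If each task held two such processes (an assumption), the bound halves.
    print(max_concurrent_tasks(total_mb, 60, processes_per_task=2))  # 136
```

Under these assumptions, 1000 concurrent long-running tasks would need several times the RAM of the machine described, regardless of any per-process tuning.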

Re: Airflow Scalability with Local Executor

Posted by Dan Davydov <da...@airbnb.com.INVALID>.
The LocalExecutor is great for running small numbers of DAGs/tasks, but it
is more of a starter executor meant to make Airflow work out of the box. I
would recommend switching to a different executor like the CeleryExecutor.
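
As a rough illustration of what switching executors involves on the configuration side: Airflow reads overrides from environment variables named `AIRFLOW__{SECTION}__{KEY}`, so the `[core]` `executor` setting can be changed without editing `airflow.cfg`. The helper name `executor_env_overrides` is hypothetical. The sketch deliberately leaves out the Celery broker and result backend settings, which are also required and whose key names differ between Airflow versions; check the configuration reference for your version.

```python
import os


def executor_env_overrides(executor="CeleryExecutor", parallelism=None):
    """Build AIRFLOW__SECTION__KEY overrides for the [core] section.

    Only `executor` and `parallelism` are covered here; Celery broker and
    result backend settings are intentionally omitted from this sketch.
    """
    overrides = {"AIRFLOW__CORE__EXECUTOR": executor}
    if parallelism is not None:
        overrides["AIRFLOW__CORE__PARALLELISM"] = str(parallelism)
    return overrides


if __name__ == "__main__":
    # Environment that could be passed to the scheduler and worker processes.
    env = dict(os.environ, **executor_env_overrides(parallelism=1000))
    print(env["AIRFLOW__CORE__EXECUTOR"])
```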

You are certainly right that there is room for reducing the memory
footprint of each Airflow process (though I'm not too sure how much can be
done about the CPU usage; it could be a function of how your DAGs are parsed).
Even if you fix the current bottlenecks, you will likely run into more.
