Posted to dev@airflow.apache.org by Ashwin Sai Shankar <as...@slack-corp.com.INVALID> on 2019/05/08 20:32:57 UTC

Dag storage in airflow on kube

Hi!
In an Airflow-on-Kubernetes deployment, which of the following options would you
recommend for storing DAGs in a production environment, and why? We have about
1000 DAGs, and ~20 commits are made to the DAGs folder every day.
a) EFS/NFS volume
b) git sync (every container does a git pull before running its task)
c) bake DAGs into the image
d) s3-sync

Thanks,
Ash

Re: Dag storage in airflow on kube

Posted by Ry Walker <ry...@rywalker.com>.
We're fans of (c) because:

* Sometimes, new DAGs introduce new Python or system-level dependencies, and it seems cleaner to update everything together. It's a big hammer, but we've got graceful restart of workers built into our platform, so we don't have to worry about taking the system offline to update dependencies.

* It also provides a way to roll back to a previous state in the case of a bad upgrade (i.e., roll back both DAGs and dependencies).

* With the emergence of the KubernetesExecutor, bundling everything into one image just feels cleaner, as the system would otherwise be pulling down DAGs repeatedly with every task execution.

We use an internal Docker registry, our GraphQL API, and the K8s API to handle re-deployments. We also have a CLI to build and run the image locally and to deploy code in a more ad-hoc fashion (for example, to deploy to a test Airflow cluster).
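
To illustrate the "update and roll back DAGs and dependencies together" idea, here is a minimal Python sketch that derives one image tag from the contents of the DAGs folder plus a dependency file. It is not our actual tooling: the function name `image_tag`, the `dags/` layout, and the use of a `requirements.txt` file are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def image_tag(dags_dir: str, requirements_file: str) -> str:
    """Return a short, deterministic tag for a DAGs folder plus its dependencies.

    Hypothetical helper: the name, arguments, and 12-character length are
    illustrative choices, not part of Airflow or any existing tool.
    """
    digest = hashlib.sha256()
    dag_root = Path(dags_dir)

    # Sort the paths so the tag does not depend on filesystem listing order.
    for path in sorted(p for p in dag_root.rglob("*.py") if p.is_file()):
        # Hash the relative path as well, so renaming a DAG file changes the tag.
        digest.update(path.relative_to(dag_root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")

    # Include the dependency list so DAGs and dependencies are versioned together.
    digest.update(Path(requirements_file).read_bytes())
    return digest.hexdigest()[:12]
```

A tag computed this way could be passed to `docker build -t <registry>/<image>:<tag> .` (registry and image names are placeholders). Any change to a DAG or to the dependency list then yields a new tag, and rolling back means redeploying an earlier tag.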

More info on the stack: https://www.astronomer.io/docs/ee-overview

-Ry

