Posted to dev@airflow.apache.org by Arvind Nedumaran <ar...@gmail.com> on 2017/01/16 11:28:30 UTC

Programmatically generating DAGs

Hello everyone,

I'm new to Airflow and I'm evaluating it for implementation at our org. I
find that there are a lot of use cases where workflows are not known during
development. Is there a way to programmatically generate and manage DAGs?

Better yet, is there a way to manage them directly from a DB instead of
generating them from templates and managing them as files? We're about to
spend considerable time implementing this, so I just want to know about the
various existing options.

Does it help if the tasks are pre-defined and the DAGs only happen to be
some permutation of the pre-existing tasks?

If anybody has tried something like this before, I'd be grateful for any input on
this issue.

Thanks,
Arvind

Re: Programmatically generating DAGs

Posted by Arvind Nedumaran <ar...@gmail.com>.
Hello Ludovic, 

Thank you so much. Things are a lot clearer now. Looks like there is a way to make this work for us. 

Have a nice day,
Arvind

On 16/01/17, 22:30, "Ludovic Claude" <lu...@gmail.com> wrote:

    Hello Arvind,
    
    DAGs in Airflow are plain Python objects that only need to be registered
    in the module's global namespace (globals()). It's easy to generate DAGs
    from a configuration file or a database. For example, I use this approach
    to generate several image processing pipelines based on configuration.
    
    Here is an example:
    https://github.com/LREN-CHUV/airflow-mri-preprocessing-dags/blob/master/mri_pipelines_init.py
    
    Ludovic

Re: Programmatically generating DAGs

Posted by Ludovic Claude <lu...@gmail.com>.
Hello Arvind,

DAGs in Airflow are plain Python objects that only need to be registered
in the module's global namespace (globals()). It's easy to generate DAGs
from a configuration file or a database. For example, I use this approach
to generate several image processing pipelines based on configuration.

Here is an example:
https://github.com/LREN-CHUV/airflow-mri-preprocessing-dags/blob/master/mri_pipelines_init.py
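
For readers who cannot open the link, here is a minimal sketch of the same
idea. It is not taken from the repository above. It assumes a hypothetical
table dag_tasks(dag_id, position, task_name), a hypothetical catalogue of
pre-defined tasks (TASK_COMMANDS), and Airflow 2.x import paths; an in-memory
SQLite database stands in for the real one. Each DAG is simply an ordered
selection of the pre-defined tasks, which covers the "permutation of
pre-existing tasks" case from the original question.

```python
import sqlite3
from datetime import datetime

# Hypothetical catalogue of pre-defined tasks: task name -> shell command.
TASK_COMMANDS = {
    "extract": "echo extract",
    "transform": "echo transform",
    "load": "echo load",
}


def load_specs(conn):
    """Return {dag_id: [task_name, ...]} with tasks in their stored order."""
    rows = conn.execute(
        "SELECT dag_id, task_name FROM dag_tasks ORDER BY dag_id, position"
    )
    specs = {}
    for dag_id, task_name in rows:
        specs.setdefault(dag_id, []).append(task_name)
    return specs


def build_dag(dag_id, task_names):
    """Build one linear DAG from pre-defined tasks (Airflow 2.x paths assumed)."""
    # Imported here so the spec-loading code above works without Airflow.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    dag = DAG(dag_id=dag_id, start_date=datetime(2017, 1, 1), catchup=False)
    previous = None
    for name in task_names:
        task = BashOperator(
            task_id=name, bash_command=TASK_COMMANDS[name], dag=dag
        )
        if previous is not None:
            previous >> task  # chain tasks in the order stored in the table
        previous = task
    return dag


def register_dags(specs, namespace, build=build_dag):
    """Put one DAG object per spec into `namespace`, keyed by its dag_id."""
    for dag_id, task_names in specs.items():
        namespace[dag_id] = build(dag_id, task_names)


def demo_connection():
    """In-memory stand-in for the real database (illustrative rows only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_tasks (dag_id TEXT, position INTEGER, task_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO dag_tasks VALUES (?, ?, ?)",
        [
            ("pipeline_a", 1, "extract"),
            ("pipeline_a", 2, "load"),
            ("pipeline_b", 1, "extract"),
            ("pipeline_b", 2, "transform"),
            ("pipeline_b", 3, "load"),
        ],
    )
    return conn
```

In an actual DAG file, the module would end with a call along the lines of
register_dags(load_specs(conn), globals()), where conn points at your own
database. Keep in mind that Airflow re-parses DAG files periodically, so the
query runs on every parse and should stay cheap.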

Ludovic