Posted to dev@airflow.apache.org by Shubham Gupta <y2...@gmail.com> on 2018/11/14 20:56:07 UTC

Fusing operators together

*[Please let me know if this is NOT the correct place for such a query]*

Hello maintainers and committers,
I've run into a design decision in my Airflow project. Any
pointers would be helpful.

Overview

   - I'm in the process of deploying Airflow and I've felt the need to
   merge groups of operators that form a single logical task (to clear the
   clutter in huge DAGs)
   - The most common use-case would be coupling an operator and the
   corresponding sensor. For instance, one might want to chain together the
   EmrStepOperator and EmrStepSensor


----

Possible approaches

   - This could be achieved by offloading actual logic to Hooks and then
   using as many hooks as needed within an operator
   - A hacky alternative (if at all) would be SubDagOperator
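
A minimal sketch of the first approach (one task that submits work through a hook and then polls until it finishes), written without importing Airflow so it can run on its own. `FakeStepHook`, `SubmitAndWaitTask`, the state names, and the `poke_interval` / `max_pokes` parameters are all hypothetical names made up for illustration; they are not Airflow's real classes or the EMR API.

```python
import time


class FakeStepHook:
    """Hypothetical stand-in for a hook; NOT an Airflow class.

    It replays a fixed list of states so the example runs without
    any external service.
    """

    def __init__(self, states):
        self._states = iter(states)

    def submit_step(self):
        # A real hook would call the remote service here.
        return "step-1"

    def get_step_state(self, step_id):
        # Return the next scripted state for the submitted step.
        return next(self._states)


class SubmitAndWaitTask:
    """Hypothetical fused task: submit, then poll until a terminal state."""

    def __init__(self, hook, poke_interval=0.0, max_pokes=10):
        self.hook = hook
        self.poke_interval = poke_interval  # seconds between polls
        self.max_pokes = max_pokes          # give up after this many polls

    def execute(self):
        # The "operator" half: hand the work to the hook.
        step_id = self.hook.submit_step()
        # The "sensor" half: poll the same hook until the step is done.
        for _ in range(self.max_pokes):
            state = self.hook.get_step_state(step_id)
            if state == "COMPLETED":
                return step_id
            if state in ("FAILED", "CANCELLED"):
                raise RuntimeError(f"{step_id} ended in state {state}")
            time.sleep(self.poke_interval)
        raise TimeoutError(f"{step_id} did not finish in {self.max_pokes} polls")


if __name__ == "__main__":
    task = SubmitAndWaitTask(FakeStepHook(["PENDING", "RUNNING", "COMPLETED"]))
    print(task.execute())
```

The trade-off in this shape is that the submit and wait steps can no longer be retried or cleared independently, since they live in one task.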


----

Questions

   - Are hooks the right tool for this problem?
   - Any other way to compose operators together?
   - Is it a good idea to combine operators at all?


My complete, more detailed question is on StackOverflow:
<https://stackoverflow.com/questions/53308306>

Thanks

*Shubham Gupta*
Software Engineer
 zomato

Re: Fusing operators together

Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
Hi Shubham,

I think the EmrStepOperator and EmrStepSensor are a clear exception. Most
operators block until the operation has finished successfully. For example,
the DruidOperator blocks until the indexing job has finished successfully:
https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/druid_hook.py#L84-L109.
I think the EmrStepOperator should behave the same way, but this
slipped through during review. Hope this helps.

Cheers, Fokko


