Posted to dev@airflow.apache.org by Shubham Gupta <y2...@gmail.com> on 2018/11/14 20:56:07 UTC
Fusing operators together
*[Please let me know if this is NOT the correct place for such a query]*
Hello maintainers and committers,
I've run into a design decision in my Airflow project. Any
pointers would be helpful.
Overview
- I'm in the process of deploying Airflow and I've felt the need to
merge groups of operators that form a single logical task (to clear the
clutter in huge DAGs)
- The most common use-case would be coupling an operator and the
corresponding sensor. For instance, one might want to chain together the
EmrStepOperator and EmrStepSensor
----
Possible approaches
- This could be achieved by offloading actual logic to Hooks and then
using as many hooks as needed within an operator
- A hacky alternative (if at all) would be SubDagOperator
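
A minimal sketch of the hook-based approach, written in plain Python so it runs without Airflow or AWS. Every name in it (FakeEmrClient, StepHook, SubmitAndWaitOperator, the state strings) is hypothetical and only for illustration; in a real DAG the operator would subclass Airflow's BaseOperator and the hook would wrap an actual client.

```python
import itertools
import time


class FakeEmrClient:
    """Stand-in for a real service client (assumption, not a real API)."""

    def __init__(self, states):
        self._states = iter(states)      # states returned on successive polls
        self._ids = itertools.count(1)

    def add_step(self, cluster_id, step):
        return "s-{}".format(next(self._ids))

    def describe_step_state(self, cluster_id, step_id):
        return next(self._states)


class StepHook:
    """Hypothetical hook: all communication with the service lives here."""

    def __init__(self, client):
        self.client = client

    def submit(self, cluster_id, step):
        return self.client.add_step(cluster_id, step)

    def state(self, cluster_id, step_id):
        return self.client.describe_step_state(cluster_id, step_id)


class SubmitAndWaitOperator:
    """Hypothetical fused task: submit a step and wait for it in one execute()."""

    def __init__(self, hook, cluster_id, step, poke_interval=0.0):
        self.hook = hook
        self.cluster_id = cluster_id
        self.step = step
        self.poke_interval = poke_interval

    def execute(self, context=None):
        # Part 1: what the standalone "submit" operator would have done.
        step_id = self.hook.submit(self.cluster_id, self.step)
        # Part 2: what the standalone sensor would have done.
        while True:
            state = self.hook.state(self.cluster_id, step_id)
            if state == "COMPLETED":
                return step_id
            if state in ("FAILED", "CANCELLED"):
                raise RuntimeError(
                    "step {} ended in state {}".format(step_id, state)
                )
            time.sleep(self.poke_interval)


client = FakeEmrClient(["PENDING", "RUNNING", "COMPLETED"])
task = SubmitAndWaitOperator(StepHook(client), "j-EXAMPLE", {"Name": "demo"})
finished_step_id = task.execute()
```

Because the submit and poll logic sits in the hook, the separate operator and sensor could keep using the same hook, so the fused task would not duplicate any service code.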
----
Questions
- Are hooks the right tool for this problem?
- Any other way to compose operators together?
- Is it a good idea to combine operators at all?
My complete (more elaborate) question is on Stack Overflow:
<https://stackoverflow.com/questions/53308306>
Thanks
*Shubham Gupta*
Software Engineer
zomato
Re: Fusing operators together
Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
Hi Shubham,
I think the EmrStepOperator and EmrStepSensor are a clear exception. Most
operators wait until the operation has finished successfully. For example,
the DruidOperator blocks until the indexing job has finished successfully:
https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/druid_hook.py#L84-L109.
I think the EmrStepOperator should behave the same way, but this slipped
through review. Hope this helps.
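
To illustrate what "block until finished" means, here is a rough sketch of a polling loop with a timeout. It is not the code behind the link above; wait_until_done, its parameters, and the state names are all hypothetical.

```python
import time


def wait_until_done(get_state, success=("SUCCESS",), failure=("FAILED",),
                    poll_interval=1.0, timeout=60.0):
    """Poll get_state() until it reports a terminal state or time runs out.

    get_state is any zero-argument callable returning the current status
    string; the status names are assumptions made for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in success:
            return state
        if state in failure:
            raise RuntimeError("job ended in state {}".format(state))
        if time.monotonic() >= deadline:
            raise TimeoutError("job still {} after {}s".format(state, timeout))
        time.sleep(poll_interval)
```

An operator whose execute() ends with a call like this only succeeds once the remote job has, so no separate sensor task is needed in the DAG.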
Cheers, Fokko