Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/05/25 07:51:25 UTC

[GitHub] [airflow] JavierLopezT opened a new issue #16038: Append Operator functionality

JavierLopezT opened a new issue #16038:
URL: https://github.com/apache/airflow/issues/16038


   The idea is to "append" operators to other operators without having to write out the code for each of them. For instance, after every S3ToSnowflakeOperator, I want a SnowflakeOperator that inserts into a final table. So instead of coding both operators, as you have to today, you could do something like this:
   
   ```
   default_args = {'table': 'ex_table'}
   
   s3_1 = S3ToSnowflakeOperator(
       task_id='s3_1',
       s3_bucket='blabla',
       # append_operator is the proposed (not yet existing) parameter
       append_operator={'type': 'SnowflakeOperator',
                        'arguments': {'sql': default_args['table']}},
   )
   ```
   In the operator dependencies, the appended operator would come right after the operator it is appended to. Its task_id could be a default one, be passed by the user, or be generated by functions with rules based on the master operator.
   
   While it reduces readability a little, I think it is very useful for saving time on patterns that are always the same. It is especially useful for checks, for example GreatExpectations/SQL/dbt operators right after PythonOperator/S3ToSQLOperators, etc.
   
   I am willing to work on this, but I don't know if it's technically possible. If it is, I would need some guidance on how to approach it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #16038: Append Operator functionality

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16038:
URL: https://github.com/apache/airflow/issues/16038#issuecomment-847705704


   I think there are other mechanisms in Airflow that are better for that.
   
   The default way of handling it (and a much better one IMHO) is to write your own custom operator that reuses and composes Hooks, instead of having a chain of operators.
   
   The vast majority of operators (if not all) are written so that they pass their parameters almost transparently to the underlying hooks, for this very purpose.
   
   Writing your own operator is easy: it is just a class that derives from BaseOperator and implements the 'execute' method. It has the additional advantage that you can make it a truly 'custom' operator, hard-coding the argument values that are specific to your case rather than exposing them as __init__ parameters.
   
   Another advantage is that the single 'execute' method always runs on one Airflow worker, in the same process, so you can pass even vast amounts of data between those hooks (for example by streaming it) without worrying about passing the data through yet another external storage. This is how many transfer operators work.
   
   In short: in the Airflow world, composability is achieved by composing Hooks rather than operators.
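
A minimal sketch of that pattern, so the shape is concrete. It deliberately does not import Airflow: `BaseOperator`, `StageHook`, `WarehouseHook` and `LoadAndInsertOperator` below are hypothetical stand-ins, not real Airflow or provider classes, and the hooks are injected through the constructor only to keep the example runnable (a real operator would more likely build its hooks inside `execute` from connection ids). The target table is hard-coded as a default, and rows are streamed from one hook to the other inside a single `execute` call.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[Any, ...]


class BaseOperator:
    """Stand-in for Airflow's BaseOperator (assumption: only task_id is modelled)."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id

    def execute(self, context: Dict[str, Any]) -> Any:
        raise NotImplementedError


class StageHook:
    """Hypothetical source hook; a real one would read files from S3 or a stage."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = rows

    def stream_rows(self) -> Iterator[Row]:
        # Yield rows one at a time so nothing is materialised up front.
        yield from self._rows


class WarehouseHook:
    """Hypothetical target hook; a real one would run INSERT statements."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[Row]] = {}

    def insert_rows(self, table: str, rows: Iterable[Row]) -> int:
        target = self.tables.setdefault(table, [])
        count = 0
        for row in rows:
            target.append(row)
            count += 1
        return count


class LoadAndInsertOperator(BaseOperator):
    """One operator doing both steps by composing two hooks."""

    def __init__(
        self,
        task_id: str,
        source_hook: StageHook,
        target_hook: WarehouseHook,
        table: str = "ex_table",  # case-specific value baked in as a default
    ) -> None:
        super().__init__(task_id=task_id)
        self.source_hook = source_hook
        self.target_hook = target_hook
        self.table = table

    def execute(self, context: Dict[str, Any]) -> int:
        # Both hooks run in the same process, so the rows can be streamed
        # directly from one to the other without intermediate storage.
        rows = self.source_hook.stream_rows()
        return self.target_hook.insert_rows(self.table, rows)


# Usage: call execute directly, the way a unit test for the operator might.
source = StageHook([(1, "a"), (2, "b")])
target = WarehouseHook()
op = LoadAndInsertOperator(task_id="s3_1", source_hook=source, target_hook=target)
inserted = op.execute(context={})
```

With this shape, the "load then insert" pair from the original proposal becomes one task in the DAG instead of two chained operators.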





[GitHub] [airflow] potiuk commented on issue #16038: Append Operator functionality

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #16038:
URL: https://github.com/apache/airflow/issues/16038#issuecomment-847708335


   Closing it for now, unless this is not a good alternative and you want to re-open it @JavierLopezT.
   
   BTW, for this kind of thing it is better to open a GitHub Discussion first. Then we can discuss it and mark answers as 'answers' without the hassle of closing/reopening issues. Issues are more for bugs or concrete features that we already know make sense to implement.





[GitHub] [airflow] potiuk closed issue #16038: Append Operator functionality

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #16038:
URL: https://github.com/apache/airflow/issues/16038


   

