You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/10/26 02:49:03 UTC

[GitHub] [airflow] potiuk opened a new issue, #27285: Using .output on non-templated fields

potiuk opened a new issue, #27285:
URL: https://github.com/apache/airflow/issues/27285

   ### Discussed in https://github.com/apache/airflow/discussions/26938
   
   <div type='discussions-op-text'>
   
   <sup>Originally posted by **stephenonethree** October  7, 2022</sup>
   I just discovered the `.output` property functionality that apparently was released in Airflow 2 for classic operators, as a simple way of accessing their output XComs. I think that this is a super useful feature because it would allow simpler connections between tasks than what I have been doing until now. 
   
   Until now, I've been explicitly giving a downstream task the task_ids and XCom names that it needs to pull from its upstreams (as hardcoded string parameters). Something like:
   
   ```
   # pushes XCom named myvar
   upstream_task=SomeOperator(task_id='upstream_task')
   
   # this is a custom operator and within execute() I have a line roughly like:
   # actual_input = context['task_instance'].xcom_pull(
   #     dag_id=context['dag'].dag_id, task_ids=upstream_task_id, key=upstream_xcom_name)
   this_task = OtherOperator(
       task_id='this_task',
       upstream_task_id='upstream_task',
       upstream_xcom_name='myvar'
   )
   upstream_task >> this_task
   ```
   
   With `.output` I could simplify this to:
   
   ```
   # pushes XCom named myvar
   upstream_task=SomeOperator(task_id='upstream_task')
   
   this_task = OtherOperator(
       task_id='this_task',
       actual_input=upstream_task.output['myvar']
   )
   upstream_task >> this_task
   ```
   
   Unfortunately it seems that there is one limitation. On the TaskFlow documentation page (https://airflow.apache.org/docs/apache-airflow/2.4.1/tutorial/taskflow.html#consuming-xcoms-between-decorated-and-traditional-tasks) it says: _Using the .output property as an input to another task is supported only for operator parameters listed as a template_field._
   
   I don't use Jinja templating for very many of my parameters as it's mostly irrelevant for me. So my questions are, is there a technical reason for this limitation? If not, is this limitation something that you are considering dropping in a future Airflow version? I suppose I could just make these fields templated to get around it, but I don't really want to turn on templating if I don't expect to need it, since I suppose it introduces the possibility of incorrect interpolation (though perhaps that's a remote possibility because I don't think most of my variables will include `{{` or `}}`.
   
   Let me know if I should file this as a feature request instead, for now I guess the Ideas category works.</div>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] rkarish commented on issue #27285: Using .output on non-templated fields

Posted by GitBox <gi...@apache.org>.
rkarish commented on issue #27285:
URL: https://github.com/apache/airflow/issues/27285#issuecomment-1396641848

   Hey Jarek, I've opened a PR for this. I took a slightly different approach than a DAG level setting. I managed the functionality at the Operator level, but this can still be applied to the whole DAG with the use of `default_args`, I think that this approach gives the Users more flexibility. Looking forward to getting some feedback!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #27285: Using .output on non-templated fields

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #27285:
URL: https://github.com/apache/airflow/issues/27285#issuecomment-1373209999

   Sure. Why not. I have not seen an opposition so far, and I think it's worth doing - I also see no big drawback if it's going to be similar setting to `render_templates_as_native_obj`  that you set for the whole DAG.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #27285: Using .output on non-templated fields

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #27285:
URL: https://github.com/apache/airflow/issues/27285#issuecomment-1292795423

   Following the comment - I have a bold proposal ... It's not exaclty what the original proposal is, but in a way it provides a possibility to do what was originally requested here.  
   
   Why don't we add an option (disabled by default) to make ALL ELIGIBLE fields -  "templeted_fields" (and automatically .output -capable). 
   
   That bothered me for a while but I think there is very little impact of making all fields templated and often people complained that templated fields. Performance overhead should be negligible (just walking through parameters and jinjafying them which in most cases will be no-op). 
   
   The only drawback it might have is that the if a string contains " {{}}" acidentally - this will be replaced with "" - which is backwards-incompatible. We could also provide a mechanism that would eclude a field from being templated just in case.
   
   I think that has a number of benefits - for example our users will not have extend operators that miss some fields in "templated_fields".
   
   I am not too worried anout "outlets"  and executor_config not being available for  .output and user's education. As long as we simply error out in this case that should be good.
   
   In a way it woudl be similar to `render_template_as_native_obj` DAG paraemeter. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #27285: Using .output on non-templated fields

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #27285:
URL: https://github.com/apache/airflow/issues/27285#issuecomment-1397432952

   > Hey Jarek, I've opened a PR for this. I took a slightly different approach than a DAG level setting. I managed the functionality at the Operator level, but this can still be applied to the whole DAG with the use of `default_args`. I think that this approach gives the Users more flexibility. Looking forward to getting some feedback!
   
   Very nice. I asked others for input, but I like the way you proposed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] rkarish commented on issue #27285: Using .output on non-templated fields

Posted by GitBox <gi...@apache.org>.
rkarish commented on issue #27285:
URL: https://github.com/apache/airflow/issues/27285#issuecomment-1372906220

   @potiuk Is this bold proposal still something you want to move forward with? I think it sounds interesting and I see how it can be useful to the Users. I can make an attempt at implementing the solution you have proposed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on issue #27285: Using .output on non-templated fields

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #27285:
URL: https://github.com/apache/airflow/issues/27285#issuecomment-1291486305

   I don’t think there’s a strict technical limitation, but a user perception considertation: Not all fields can reference an XComArg (e.g. `outlets` and `executor_config` can’t), so even if the limitation is lifted, it still wouldn’t be _all fields_. Instead of trying to teach users what fields can reference dynamic values and what can’t, it’s easier to positively list what _can_, and `template_fields` is an easy one to go with.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org