Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/11 17:01:53 UTC
[GitHub] [airflow] josh-fell opened a new issue #19538: TaskFlow API `multiple_outputs` inference not handling all flavors of dict typing
josh-fell opened a new issue #19538:
URL: https://github.com/apache/airflow/issues/19538
### Apache Airflow version
2.2.0
### Operating System
Debian GNU/Linux 11 (bullseye)
### Versions of Apache Airflow Providers
```shell
apache-airflow-providers-amazon 1!2.2.0
apache-airflow-providers-cncf-kubernetes 1!2.0.3
apache-airflow-providers-elasticsearch 1!2.0.3
apache-airflow-providers-ftp 1!2.0.1
apache-airflow-providers-google 1!6.0.0
apache-airflow-providers-http 1!2.0.1
apache-airflow-providers-imap 1!2.0.1
apache-airflow-providers-microsoft-azure 1!3.2.0
apache-airflow-providers-mysql 1!2.1.1
apache-airflow-providers-postgres 1!2.3.0
apache-airflow-providers-redis 1!2.0.1
apache-airflow-providers-slack 1!4.1.0
apache-airflow-providers-sqlite 1!2.0.1
apache-airflow-providers-ssh 1!2.2.0
```
### Deployment
Astronomer
### Deployment details
Local deployment using the Astronomer CLI.
### What happened
Creating a TaskFlow function with a return type annotation of `dict` does not yield `XComs` for each key within the returned dict. Additionally, on Python 3.6 the inference does not work for either `dict` or `Dict` (without type arguments).
### What you expected to happen
When creating a TaskFlow function without explicitly setting `multiple_outputs=True`, the unfurling of `XComs` into separate keys is inferred from the return type annotation (as noted [here](https://airflow.apache.org/docs/apache-airflow/stable/tutorial_taskflow_api.html#multiple-outputs-inference)). When using a return type annotation of `dict`, separate `XComs` should be created. There is an explicit check for this type as well:
https://github.com/apache/airflow/blob/7622f5e08261afe5ab50a08a6ca0804af8c7c7fe/airflow/decorators/base.py#L207
Additionally, on Python 3.6, the inference should handle generating multiple `XComs` for both `dict` and `typing.Dict` return type annotations as expected on other Python versions.
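To make the expectation concrete, here is a minimal sketch of an annotation check that treats `dict`, `Dict`, and `Dict[str, str]` the same way. It is not Airflow's implementation: `infers_multiple_outputs` is a hypothetical helper name, and it relies on `typing.get_origin`, which only exists on Python 3.8+ (so it says nothing about the Python 3.6 behavior).

```python
from typing import Any, Dict, get_origin


def infers_multiple_outputs(return_annotation: Any) -> bool:
    """Hypothetical helper: True if the annotation denotes a dict type."""
    # get_origin() returns the runtime class behind typing generics such as
    # Dict or Dict[str, str], and None for plain classes such as dict,
    # so fall back to the annotation itself in that case.
    origin = get_origin(return_annotation) or return_annotation
    return isinstance(origin, type) and issubclass(origin, dict)


for annotation in (dict, Dict, Dict[str, str], str, None):
    print(annotation, infers_multiple_outputs(annotation))
```

With a check along these lines, all three dict flavors used in the DAG below would be treated identically, while a missing or non-dict annotation would leave `multiple_outputs` off.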
### How to reproduce
This DAG can be used to demonstrate the different results of dict typing:
```python
from datetime import datetime
from typing import Dict

from airflow.decorators import dag, task
from airflow.models import XCom
from airflow.models.baseoperator import chain


@dag(
    start_date=datetime(2021, 11, 11),
    schedule_interval=None,
)
def __test__():
    # No return annotation: only `return_value` is expected.
    @task
    def func_no_return_anno():
        return {"key1": "value1", "key2": "value2"}

    # Built-in `dict` annotation.
    @task
    def func_with_dict() -> dict:
        return {"key1": "value1", "key2": "value2"}

    # Bare `typing.Dict` annotation.
    @task
    def func_with_typing_dict() -> Dict:
        return {"key1": "value1", "key2": "value2"}

    # Parameterized `typing.Dict` annotation.
    @task
    def func_with_typing_dict_explicit() -> Dict[str, str]:
        return {"key1": "value1", "key2": "value2"}

    # Print every XCom pushed by the tasks above for this run.
    @task
    def get_xcoms(run_id=None):
        xcoms = XCom.get_many(
            dag_ids="__test__",
            task_ids=[
                "func_no_return_anno",
                "func_with_dict",
                "func_with_typing_dict",
                "func_with_typing_dict_explicit",
            ],
            run_id=run_id,
        ).all()
        for xcom in xcoms:
            print(f"Task ID: {xcom.task_id} \n", f"Key: {xcom.key} \n", f"Value: {xcom.value}")

    chain(
        [
            func_no_return_anno(),
            func_with_dict(),
            func_with_typing_dict(),
            func_with_typing_dict_explicit(),
        ],
        get_xcoms(),
    )


dag = __test__()
```
**Expected `XCom` keys**
- func_no_return_anno
  - `return_value`
- func_with_dict
  - `return_value`, `key1`, and `key2`
- func_with_typing_dict
  - `return_value`, `key1`, and `key2`
- func_with_typing_dict_explicit
  - `return_value`, `key1`, and `key2`
Here is the output from the `get_xcoms` task which is gathering all of the `XComs` generated for the run:
![image](https://user-images.githubusercontent.com/48934154/141336206-259bd78b-8ef3-4edb-81a6-b161d783f39f.png)
The `func_with_dict` task does not yield `XComs` for `key1` and `key2`.
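For reference, the expected key layout can be modeled in plain Python. This is only a sketch of the behavior described in this issue, not Airflow code; `expected_xcom_keys` is a hypothetical name, and the model assumes the whole dict is always stored under `return_value`, with one extra entry per key when `multiple_outputs` is in effect.

```python
from typing import Any, Dict


def expected_xcom_keys(return_value: Any, multiple_outputs: bool) -> Dict[str, Any]:
    """Hypothetical model of the XComs a task's return value should produce."""
    # The full return value is always stored under `return_value`.
    xcoms = {"return_value": return_value}
    # With multiple_outputs in effect, each dict key gets its own XCom too.
    if multiple_outputs and isinstance(return_value, dict):
        xcoms.update(return_value)
    return xcoms


result = {"key1": "value1", "key2": "value2"}
print(sorted(expected_xcom_keys(result, multiple_outputs=True)))
print(sorted(expected_xcom_keys(result, multiple_outputs=False)))
```

In these terms, `func_with_dict` currently behaves like the `multiple_outputs=False` case even though its `dict` annotation should select the `True` case.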
### Anything else
The inference also doesn't function as intended on Python 3.6 when using simple `dict` or `Dict` return types.
For example, isolating the existing `TestAirflowTaskDecorator::test_infer_multiple_outputs_using_typing` unit test, parameterizing it over the return annotation, and running it on Python 3.6:
```python
@parameterized.expand(
    [
        ("dict", dict),
        ("Dict", Dict),
        ("Dict[str, int]", Dict[str, int]),
    ]
)
def test_infer_multiple_outputs_using_typing(self, _, test_return_annotation):
    @task_decorator
    def identity_dict(x: int, y: int) -> test_return_annotation:
        return {"x": x, "y": y}

    assert identity_dict(5, 5).operator.multiple_outputs is True
```
**Results**
![image](https://user-images.githubusercontent.com/48934154/141338408-5d7f2877-6465-4c81-857f-5ca6d1b612ee.png)
However, since Python 3.6 will reach EOL on 2021-12-23, this _may_ not be an aspect that needs to be fixed.
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] yiga2 commented on issue #19538: TaskFlow API `multiple_outputs` inference not handling all flavors of dict typing
Posted by GitBox <gi...@apache.org>.
yiga2 commented on issue #19538:
URL: https://github.com/apache/airflow/issues/19538#issuecomment-970476479
Probably a different issue or a feature request, but could `multiple_outputs` be made available on any (all?) operators that could return a `dict`? E.g., in pseudo-code:
```
if isinstance(returned_value, dict) and multiple_outputs:
    # unroll and xcom_push each key
    for key, value in returned_value.items():
        ti.xcom_push(key=key, value=value)
```
x-post: https://apache-airflow.slack.com/archives/CCPRP7943/p1637081028345300
[GitHub] [airflow] potiuk closed issue #19538: TaskFlow API `multiple_outputs` inference not handling all flavors of dict typing
Posted by GitBox <gi...@apache.org>.
potiuk closed issue #19538:
URL: https://github.com/apache/airflow/issues/19538