
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/11 17:01:53 UTC

[GitHub] [airflow] josh-fell opened a new issue #19538: TaskFlow API `multiple_outputs` inference not handling all flavors of dict typing

josh-fell opened a new issue #19538:
URL: https://github.com/apache/airflow/issues/19538


   ### Apache Airflow version
   
   2.2.0
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   ```shell
   apache-airflow-providers-amazon          1!2.2.0
   apache-airflow-providers-cncf-kubernetes 1!2.0.3
   apache-airflow-providers-elasticsearch   1!2.0.3
   apache-airflow-providers-ftp             1!2.0.1
   apache-airflow-providers-google          1!6.0.0
   apache-airflow-providers-http            1!2.0.1
   apache-airflow-providers-imap            1!2.0.1
   apache-airflow-providers-microsoft-azure 1!3.2.0
   apache-airflow-providers-mysql           1!2.1.1
   apache-airflow-providers-postgres        1!2.3.0
   apache-airflow-providers-redis           1!2.0.1
   apache-airflow-providers-slack           1!4.1.0
   apache-airflow-providers-sqlite          1!2.0.1
   apache-airflow-providers-ssh             1!2.2.0
   ```
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   Local deployment using the Astronomer CLI.
   
   ### What happened
   
   Creating a TaskFlow function with a return type annotation of `dict` does not yield a separate `XCom` for each key in the returned dict. Additionally, on Python 3.6 the inference works for neither `dict` nor `Dict` (without type arguments).
   
   ### What you expected to happen
   
   When creating a TaskFlow function without explicitly setting `multiple_outputs=True`, the unrolling of the returned dict into separate `XComs` is inferred from the return type annotation (as noted [here](https://airflow.apache.org/docs/apache-airflow/stable/tutorial_taskflow_api.html#multiple-outputs-inference)). When using a return type annotation of `dict`, separate `XComs` should be created. There is an explicit check for this type as well:
   
   https://github.com/apache/airflow/blob/7622f5e08261afe5ab50a08a6ca0804af8c7c7fe/airflow/decorators/base.py#L207
   
   Additionally, on Python 3.6, the inference should handle generating multiple `XComs` for both `dict` and `typing.Dict` return type annotations as expected on other Python versions.
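The expected inference can be sketched as follows. This is a minimal illustration, not Airflow's actual implementation: `infers_multiple_outputs` is a hypothetical helper, and it assumes Python 3.8+ for `typing.get_origin`. It also shows one plausible reason a plain `dict` annotation could slip through an `__origin__`-based check: the built-in `dict` class has no origin, while `Dict` and `Dict[str, str]` both resolve to `dict`.

```python
import inspect
from typing import Dict, get_origin


def infers_multiple_outputs(func) -> bool:
    """Hypothetical helper: decide from the return annotation whether a
    returned dict should be unrolled into one XCom per key."""
    annotation = inspect.signature(func).return_annotation
    if annotation is inspect.Signature.empty:
        # No return annotation: nothing to infer from.
        return False
    # get_origin() returns dict for Dict and Dict[str, str],
    # but None for the plain built-in dict, so check both.
    return annotation is dict or get_origin(annotation) is dict


def func_no_return_anno():
    return {"key1": "value1"}


def func_with_dict() -> dict:
    return {"key1": "value1"}


def func_with_typing_dict() -> Dict:
    return {"key1": "value1"}


def func_with_typing_dict_explicit() -> Dict[str, str]:
    return {"key1": "value1"}
```

Under these assumptions, only `func_no_return_anno` would be treated as a single-output task; the three annotated variants would all be unrolled.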
   
   ### How to reproduce
   
   This DAG can be used to demonstrate the different results of dict typing:
   ```python
   from datetime import datetime
   from typing import Dict
   
   from airflow.decorators import dag, task
   from airflow.models.baseoperator import chain
   from airflow.models import XCom
   
   
   @dag(
       start_date=datetime(2021, 11, 11),
       schedule_interval=None,
   )
   def __test__():
       @task
       def func_no_return_anno():
           return {"key1": "value1", "key2": "value2"}
   
       @task
       def func_with_dict() -> dict:
           return {"key1": "value1", "key2": "value2"}
   
       @task
       def func_with_typing_dict() -> Dict:
           return {"key1": "value1", "key2": "value2"}
   
       @task
       def func_with_typing_dict_explicit() -> Dict[str, str]:
           return {"key1": "value1", "key2": "value2"}
   
       @task
       def get_xcoms(run_id=None):
           xcoms = XCom.get_many(
               dag_ids="__test__",
               task_ids=[
                   "func_no_return_anno",
                   "func_with_dict",
                   "func_with_typing_dict",
                   "func_with_typing_dict_explicit",
               ],
               run_id=run_id,
           ).all()
   
           for xcom in xcoms:
               print(f"Task ID: {xcom.task_id} \n", f"Key: {xcom.key} \n", f"Value: {xcom.value}")
   
       chain(
           [
               func_no_return_anno(),
               func_with_dict(),
               func_with_typing_dict(),
               func_with_typing_dict_explicit(),
           ],
           get_xcoms(),
       )
   
   
   dag = __test__()
   
   ```
   
   **Expected `XCom` keys**
   - func_no_return_anno
     - `return_value`
   - func_with_dict
     - `return_value`, `key1`, and `key2`
   - func_with_typing_dict
     - `return_value`, `key1`, and `key2`
   - func_with_typing_dict_explicit
     - `return_value`, `key1`, and `key2`
   
   Here is the output from the `get_xcoms` task which is gathering all of the `XComs` generated for the run:
   ![image](https://user-images.githubusercontent.com/48934154/141336206-259bd78b-8ef3-4edb-81a6-b161d783f39f.png)
   
   The `func_with_dict` task does not yield `XComs` for `key1` and `key2`.
   
   ### Anything else
   
   The inference also doesn't function as intended on Python 3.6 when using simple `dict` or `Dict` return types.
   
   For example, isolating the existing `TestAirflowTaskDecorator::test_infer_multiple_outputs_using_typing` unit test and adding some parameterization on Python 3.6:
   ```python
   @parameterized.expand(
       [
           ("dict", dict),
           ("Dict", Dict),
           ("Dict[str, int]", Dict[str, int]),
       ]
   )
   def test_infer_multiple_outputs_using_typing(self, _, test_return_annotation):
       @task_decorator
       def identity_dict(x: int, y: int) -> test_return_annotation:
           return {"x": x, "y": y}

       assert identity_dict(5, 5).operator.multiple_outputs is True
   ```
   **Results**
   ![image](https://user-images.githubusercontent.com/48934154/141338408-5d7f2877-6465-4c81-857f-5ca6d1b612ee.png)
   
   
   However, since Python 3.6 will reach EOL on 2021-12-23, this _may_ not be an aspect that needs to be fixed.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [airflow] yiga2 commented on issue #19538: TaskFlow API `multiple_outputs` inference not handling all flavors of dict typing

Posted by GitBox <gi...@apache.org>.

yiga2 commented on issue #19538:
URL: https://github.com/apache/airflow/issues/19538#issuecomment-970476479


   This is probably a different issue or a feature request, but could `multiple_outputs` be made available on any (all?) operators that could return a `dict`? For example, in pseudo-code:
   ```
   if isinstance(returned_value, dict) and multiple_outputs:
       # unroll and xcom_push each key
   ```
   x-post: https://apache-airflow.slack.com/archives/CCPRP7943/p1637081028345300
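The pseudo-code above could be fleshed out roughly as below. This is only a sketch under stated assumptions: `push_outputs` and the `xcom_push` callback are hypothetical stand-ins, not Airflow APIs, and the full dict is also stored under `return_value` to match the expected keys listed in the original issue.

```python
from typing import Any, Callable


def push_outputs(
    returned_value: Any,
    multiple_outputs: bool,
    xcom_push: Callable[[str, Any], None],
) -> None:
    """Hypothetical helper: unroll a returned dict into one XCom per key."""
    if multiple_outputs and isinstance(returned_value, dict):
        for key, value in returned_value.items():
            xcom_push(key, value)
    # The whole value is still pushed under the default key.
    xcom_push("return_value", returned_value)


# Usage with an in-memory dict standing in for the XCom backend.
store = {}
push_outputs({"key1": "value1", "key2": "value2"}, True, store.__setitem__)
```

After the call, `store` holds `key1`, `key2`, and `return_value`; with `multiple_outputs=False` it would hold only `return_value`.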



[GitHub] [airflow] potiuk closed issue #19538: TaskFlow API `multiple_outputs` inference not handling all flavors of dict typing

Posted by GitBox <gi...@apache.org>.

potiuk closed issue #19538:
URL: https://github.com/apache/airflow/issues/19538


   

