You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by as...@apache.org on 2022/04/14 12:08:05 UTC
[airflow] branch main updated: Note that value received in reduce is not a list (#23006)
This is an automated email from the ASF dual-hosted git repository.
ash pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new 22908d217d Note that value received in reduce is not a list (#23006)
22908d217d is described below
commit 22908d217d0ce3c2e1785ebfaa475a1751ed8c58
Author: Tzu-ping Chung <tp...@astronomer.io>
AuthorDate: Thu Apr 14 20:07:59 2022 +0800
Note that value received in reduce is not a list (#23006)
To avoid clogging, the aggregated value from an upstream mapped task is
a lazy access proxy. This should work as expected (similar to a list) in
most situations, but let's add a note to clarify it's not really a list
to avoid potential user confusion.
---
docs/apache-airflow/concepts/dynamic-task-mapping.rst | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/docs/apache-airflow/concepts/dynamic-task-mapping.rst b/docs/apache-airflow/concepts/dynamic-task-mapping.rst
index d1152971dc..e93ddfd106 100644
--- a/docs/apache-airflow/concepts/dynamic-task-mapping.rst
+++ b/docs/apache-airflow/concepts/dynamic-task-mapping.rst
@@ -62,6 +62,14 @@ The grid view also provides visibility into your mapped tasks in the details pan
.. image:: /img/mapping-simple-grid.png
+.. note:: Values passed from the mapped task is a lazy proxy
+
+ In the above example, ``values`` received by ``sum_it`` is an aggregation of all values returned by each mapped instance of ``add_one``. However, since it is impossible to know how many instances of ``add_one`` we will have in advance, ``values`` is not a normal list, but a "lazy sequence" that retrieves each individual value only when asked. Therefore, if you run ``print(values)`` directly, you would get something like this::
+
+ _LazyXComAccess(dag_id='simple_mapping', run_id='test_run', task_id='add_one')
+
+ You can use normal sequence syntax on this object (e.g. ``values[0]``), or iterate through it normally with a ``for`` loop. ``list(values)`` will give you a "real" ``list``, but please be aware of the potential performance implications if the list is large.
+
.. note:: A reduce task is not required.
Although we show a "reduce" task here (``sum_it``) you don't have to have one, the mapped tasks will still be executed even if they have no downstream tasks.
@@ -147,7 +155,7 @@ Up until now the examples we've shown could all be achieved with a ``for`` loop
@task
def consumer(arg):
- print(repr(arg))
+ print(list(arg))
with DAG(dag_id="dynamic-map", start_date=datetime(2022, 4, 2)) as dag: