You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by as...@apache.org on 2022/04/14 12:08:05 UTC

[airflow] branch main updated: Note that value received in reduce is not a list (#23006)

This is an automated email from the ASF dual-hosted git repository.

ash pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new 22908d217d Note that value received in reduce is not a list (#23006)
22908d217d is described below

commit 22908d217d0ce3c2e1785ebfaa475a1751ed8c58
Author: Tzu-ping Chung <tp...@astronomer.io>
AuthorDate: Thu Apr 14 20:07:59 2022 +0800

    Note that value received in reduce is not a list (#23006)
    
    To avoid clogging, the aggregated value from an upstream mapped task is
    a lazy access proxy. This should work as expected (similar to a list) in
    most situations, but let's add a note to clarify it's not really a list
    to avoid potential user confusion.
---
 docs/apache-airflow/concepts/dynamic-task-mapping.rst | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/apache-airflow/concepts/dynamic-task-mapping.rst b/docs/apache-airflow/concepts/dynamic-task-mapping.rst
index d1152971dc..e93ddfd106 100644
--- a/docs/apache-airflow/concepts/dynamic-task-mapping.rst
+++ b/docs/apache-airflow/concepts/dynamic-task-mapping.rst
@@ -62,6 +62,14 @@ The grid view also provides visibility into your mapped tasks in the details pan
 
 .. image:: /img/mapping-simple-grid.png
 
+.. note:: Values passed from the mapped task is a lazy proxy
+
+    In the above example, ``values`` received by ``sum_it`` is an aggregation of all values returned by each mapped instance of ``add_one``. However, since it is impossible to know how many instances of ``add_one`` we will have in advance, ``values`` is not a normal list, but a "lazy sequence" that retrieves each individual value only when asked. Therefore, if you run ``print(values)`` directly, you would get something like this::
+
+        _LazyXComAccess(dag_id='simple_mapping', run_id='test_run', task_id='add_one')
+
+    You can use normal sequence syntax on this object (e.g. ``values[0]``), or iterate through it normally with a ``for`` loop. ``list(values)`` will give you a "real" ``list``, but please be aware of the potential performance implications if the list is large.
+
 .. note:: A reduce task is not required.
 
     Although we show a "reduce" task here (``sum_it``) you don't have to have one, the mapped tasks will still be executed even if they have no downstream tasks.
@@ -147,7 +155,7 @@ Up until now the examples we've shown could all be achieved with a ``for`` loop
 
     @task
     def consumer(arg):
-        print(repr(arg))
+        print(list(arg))
 
 
     with DAG(dag_id="dynamic-map", start_date=datetime(2022, 4, 2)) as dag: