Posted to commits@airflow.apache.org by po...@apache.org on 2022/01/23 13:22:12 UTC

[airflow] 18/24: Improve documentation on ``Params`` (#20567)

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v2-2-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 543a78bd5b0b376c6591c384066961c0ae508e47
Author: Matt Rixman <58...@users.noreply.github.com>
AuthorDate: Mon Jan 3 20:40:10 2022 -0700

    Improve documentation on ``Params`` (#20567)
    
    I think that this doc could be improved by adding examples of how to reference the params in your dag. (Also, the current example code causes this: #20559.)
    
    While trying to find the right place to work a few reference examples in, I ended up rewriting quite a lot of it.
    Let me know if you think that this is an improvement.
    
    I haven't yet figured out how to build this and view it locally, and I'd want to do that as a sanity check before merging it, but I figured I'd get feedback on what I've written before I do that.
    
    (cherry picked from commit 064efbeae7c2560741c5a8928799482ef795e100)
---
 docs/apache-airflow/concepts/params.rst | 146 ++++++++++++++++++++++++++------
 1 file changed, 119 insertions(+), 27 deletions(-)

diff --git a/docs/apache-airflow/concepts/params.rst b/docs/apache-airflow/concepts/params.rst
index c508279..ef266ea 100644
--- a/docs/apache-airflow/concepts/params.rst
+++ b/docs/apache-airflow/concepts/params.rst
@@ -15,16 +15,21 @@
     specific language governing permissions and limitations
     under the License.
 
+.. _concepts:params:
+
 Params
 ======
 
-Params are Airflow's concept of providing runtime configuration to tasks when a dag gets triggered manually.
-Params are configured while defining the dag & tasks, that can be altered while doing a manual trigger. The
-ability to update params while triggering a DAG depends on the flag ``core.dag_run_conf_overrides_params``,
-so if that flag is ``False``, params would behave like constants.
+Params are how Airflow provides runtime configuration to tasks.
+When you trigger a DAG manually, you can modify its Params before the dagrun starts.
+If the user-supplied values don't pass validation, Airflow shows a warning instead of creating the dagrun.
+(For scheduled runs, the default values are used.)
+
+Adding Params to a DAG
+----------------------
 
-To use them, one can use the ``Param`` class for complex trigger-time validations or simply use primitive types,
-which won't be doing any such validations.
+To add Params to a :class:`~airflow.models.dag.DAG`, initialize it with the ``params`` kwarg.
+Use a dictionary that maps Param names to either a :class:`~airflow.models.param.Param` or an object indicating the parameter's default value.
 
 .. code-block::
 
@@ -32,33 +37,120 @@ which won't be doing any such validations.
     from airflow.models.param import Param
 
     with DAG(
-        'my_dag',
+        "the_dag",
         params={
-            'int_param': Param(10, type='integer', minimum=0, maximum=20),  # a int param with default value
-            'str_param': Param(type='string', minLength=2, maxLength=4),    # a mandatory str param
-            'dummy_param': Param(type=['null', 'number', 'string'])         # a param which can be None as well
-            'old_param': 'old_way_of_passing',                              # i.e. no data or type validations
-            'simple_param': Param('im_just_like_old_param'),                # i.e. no data or type validations
-            'email_param': Param(
-                default='example@example.com',
-                type='string',
-                format='idn-email',
-                minLength=5,
-                maxLength=255,
-            ),
+            "x": Param(5, type="integer", minimum=3),
+            "y": 6
         },
+    ) as the_dag:
+
+Referencing Params in a Task
+----------------------------
+
+Params are stored as ``params`` in the :ref:`template context <templates-ref>`.
+So you can reference them in a template.
+
+.. code-block::
+
+    PythonOperator(
+        task_id="from_template",
+        op_args=[
+            "{{ params.x + 10 }}",
+        ],
+        python_callable=(
+            lambda x: print(x)
+        ),
+    )
+
+Even though Params can use a variety of types, the default behavior of templates is to provide your task with a string.
+You can change this by setting ``render_template_as_native_obj=True`` while initializing the :class:`~airflow.models.dag.DAG`.
+
+.. code-block::
+
+    with DAG(
+        "the_dag",
+        params={"x": Param(5, type="integer", minimum=3)},
+        render_template_as_native_obj=True
+    ) as the_dag:
+
+
+This way, the Param's type is respected when it's provided to your task.
+
+.. code-block::
+
+    # prints <class 'str'> by default
+    # prints <class 'int'> if render_template_as_native_obj=True
+    PythonOperator(
+        task_id="template_type",
+        op_args=[
+            "{{ params.x }}",
+        ],
+        python_callable=(
+            lambda x: print(type(x))
+        ),
     )
 
-``Param`` make use of `json-schema <https://json-schema.org/>`__ to define the properties and doing the
-validation, so one can use the full json-schema specifications mentioned at
-https://json-schema.org/draft/2020-12/json-schema-validation.html to define the construct of a ``Param``
-objects.
+Another way to access your param is via a task's ``context`` kwarg.
 
-Also, it worthwhile to note that if you have any DAG which uses a mandatory param value, i.e. a ``Param``
-object with no default value or ``null`` as an allowed type, that DAG schedule has to be ``None``. However,
-if such ``Param`` has been defined at task level, Airflow has no way to restrict that & the task would be
-failing at the execution time.
+.. code-block::
+
+    def print_x(**context):
+        print(context["params"]["x"])
+
+    PythonOperator(
+        task_id="print_x",
+        python_callable=print_x,
+    )
+
+Task-level Params
+-----------------
+
+You can also add Params to individual tasks.
+
+.. code-block::
+
+    PythonOperator(
+        task_id="print_x",
+        params={"x": 10},
+        python_callable=print_x,
+    )
+
+If there's already a DAG-level Param with that name, the task-level default takes precedence over the DAG-level default.
+If a user supplies their own value when the DAG is triggered, Airflow ignores all defaults and uses the user's value.
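
The precedence described above can be sketched in plain Python. This is only an illustration, not Airflow's implementation: ``resolve_params`` and its arguments are hypothetical names, and ordinary dictionaries stand in for Airflow's params objects.

```python
def resolve_params(dag_params, task_params, user_conf=None):
    """Merge param sources from lowest to highest precedence (illustrative only)."""
    resolved = dict(dag_params)       # DAG-level defaults
    resolved.update(task_params)      # task-level defaults override DAG-level ones
    resolved.update(user_conf or {})  # trigger-time values override both
    return resolved


# Hypothetical values mirroring the examples in this document.
dag_params = {"x": 5, "y": 6}
task_params = {"x": 10}

# No user-supplied values, as in a scheduled run: the task-level default wins for "x".
scheduled = resolve_params(dag_params, task_params)

# A manual trigger that supplies its own "x": the user's value wins.
manual = resolve_params(dag_params, task_params, user_conf={"x": 7})

print(scheduled)
print(manual)
```

Params that only exist at the DAG level (``y`` here) pass through unchanged in both cases.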
+
+JSON Schema Validation
+----------------------
+
+:class:`~airflow.models.param.Param` makes use of `json-schema <https://json-schema.org/>`__, so you can use the full json-schema specification described at https://json-schema.org/draft/2020-12/json-schema-validation.html to define ``Param`` objects.
+
+.. code-block::
+
+    with DAG(
+        "my_dag",
+        params={
+            # an int with a default value
+            "int_param": Param(10, type="integer", minimum=0, maximum=20),
+
+            # a required param which can be of multiple types
+            "dummy": Param(type=["null", "number", "string"]),
+
+            # a param which uses json-schema formatting
+            "email": Param(
+                default="example@example.com",
+                type="string",
+                format="idn-email",
+                minLength=5,
+                maxLength=255,
+            ),
+        },
+    ) as my_dag:
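
To make the trigger-time check concrete, here is a rough stand-in for what validating the ``int_param`` schema above amounts to. It is a deliberately simplified, hand-written sketch that covers only the ``type="integer"``, ``minimum`` and ``maximum`` keywords. ``check_integer_param`` is a hypothetical helper, not part of Airflow, and real validation follows the JSON Schema rules rather than code like this.

```python
def check_integer_param(value, minimum=None, maximum=None):
    """Return a list of problems with value; an empty list means it passes."""
    # bool is a subclass of int in Python, but JSON Schema does not treat
    # true/false as integers, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return [f"{value!r} is not of type 'integer'"]
    problems = []
    if minimum is not None and value < minimum:
        problems.append(f"{value} is less than the minimum of {minimum}")
    if maximum is not None and value > maximum:
        problems.append(f"{value} is greater than the maximum of {maximum}")
    return problems


# Same bounds as "int_param" above: minimum=0, maximum=20.
ok = check_integer_param(10, minimum=0, maximum=20)
too_big = check_integer_param(25, minimum=0, maximum=20)
wrong_type = check_integer_param("10", minimum=0, maximum=20)

for result in (ok, too_big, wrong_type):
    print(result or "valid")
```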
 
 .. note::
     As of now, for security reasons, one can not use Param objects derived out of custom classes. We are
     planning to have a registration system for custom Param classes, just like we've for Operator ExtraLinks.
+
+Disabling Runtime Param Modification
+------------------------------------
+
+The ability to update params while triggering a DAG depends on the flag ``core.dag_run_conf_overrides_params``.
+Setting this config to ``False`` will effectively turn your default params into constants.
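
As a rough illustration of that flag's effect (a sketch, not Airflow's code): ``params_for_run`` and its ``conf_overrides_params`` argument are hypothetical names standing in for the real config lookup.

```python
def params_for_run(defaults, trigger_conf, conf_overrides_params):
    """Return the params a run would see under the given flag setting (illustrative)."""
    params = dict(defaults)
    if conf_overrides_params:
        # Trigger-time values replace the matching defaults.
        params.update(trigger_conf)
    # With the flag off, trigger-time values are ignored, so the defaults
    # behave like constants.
    return params


defaults = {"x": 5}
with_flag = params_for_run(defaults, {"x": 9}, conf_overrides_params=True)
without_flag = params_for_run(defaults, {"x": 9}, conf_overrides_params=False)

print(with_flag)
print(without_flag)
```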