Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2023/01/02 09:49:31 UTC

[GitHub] [airflow] BasPH commented on a diff in pull request #28558: Make the policy functions pluggable

BasPH commented on code in PR #28558:
URL: https://github.com/apache/airflow/pull/28558#discussion_r1059922543


##########
airflow/policies.py:
##########
@@ -0,0 +1,242 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+If you want to check or mutate DAGs or Tasks on a cluster-wide level, then a Cluster Policy will let you do
+that. They have three main purposes:
+
+* Checking that DAGs/Tasks meet a certain standard
+* Setting default arguments on DAGs/Tasks
+* Performing custom routing logic

Review Comment:
   What exactly do you mean here?
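
One possible reading of "custom routing logic", offered only as an illustration: a `task_policy` that assigns tasks to a worker queue based on some attribute of the task. The sketch below assumes tasks expose `pool` and `queue` attributes (as `BaseOperator` does); the `QUEUE_BY_POOL` mapping and the pool and queue names are hypothetical and do not come from the PR.

```python
# Hypothetical mapping from a task's pool to a worker queue (assumption, not from the PR).
QUEUE_BY_POOL = {"gpu_pool": "gpu_workers"}


def task_policy(task):
    """Route a task to a queue chosen by its pool; leave other tasks untouched."""
    queue = QUEUE_BY_POOL.get(getattr(task, "pool", None))
    if queue is not None:
        task.queue = queue
```

Because the policy runs when the task is parsed, the chosen queue would apply to every future instance of that task.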



##########
airflow/policies.py:
##########
@@ -0,0 +1,242 @@
+"""
+If you want to check or mutate DAGs or Tasks on a cluster-wide level, then a Cluster Policy will let you do
+that. They have three main purposes:
+
+* Checking that DAGs/Tasks meet a certain standard
+* Setting default arguments on DAGs/Tasks
+* Performing custom routing logic
+
+There are three main types of cluster policy:
+
+* ``dag_policy``: Takes a :class:`~airflow.models.dag.DAG` parameter called ``dag``. Runs at load time of the
+  DAG from DagBag :class:`~airflow.models.dagbag.DagBag`.
+* ``task_policy``: Takes a :class:`~airflow.models.baseoperator.BaseOperator` parameter called ``task``. The
+  policy gets executed when the task is created during parsing of the task from DagBag at load time. This
+  means that the whole task definition can be altered in the task policy. It does not relate to a specific
+  task running in a DagRun. The ``task_policy`` defined is applied to all the task instances that will be
+  executed in the future.
+* ``task_instance_mutation_hook``: Takes a :class:`~airflow.models.taskinstance.TaskInstance` parameter called
+  ``task_instance``. The ``task_instance_mutation`` applies not to a task but to the instance of a task that
+  relates to a particular DagRun. It is executed in a "worker", not in the dag file processor, just before the
+  task instance is executed. The policy is only applied to the currently executed run (i.e. instance) of that
+  task.
+
+The DAG and Task cluster policies can raise the :class:`~airflow.exceptions.AirflowClusterPolicyViolation`
+exception to indicate that the dag/task they were passed is not compliant and should not be loaded.
+
+Any extra attributes set by a cluster policy take priority over those defined in your DAG file; for example,
+if you set an ``sla`` on your Task in the DAG file, and then your cluster policy also sets an ``sla``, the
+cluster policy's value will take precedence.
+
+To configure cluster policies, you should create an ``airflow_local_settings.py`` file in either the
+``config/`` folder under your $AIRFLOW_HOME, or place it on the $PYTHONPATH, and then add callables to the
+file matching one or more of the cluster policy names above (e.g. ``dag_policy``).

Review Comment:
   ```suggestion
   ``config/`` folder under your $AIRFLOW_HOME or place it on the $PYTHONPATH, and implement one or more of the cluster policies above (e.g. ``dag_policy``).
   ```
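
For context, a minimal sketch of what such an `airflow_local_settings.py` might contain. It implements only `dag_policy` and raises `AirflowClusterPolicyViolation`, as the docstring under review describes. The "every DAG must have a tag" rule is an arbitrary example, and the `ImportError` fallback is an assumption added so the snippet runs without Airflow installed.

```python
# Hypothetical contents of airflow_local_settings.py (a sketch, not the PR's code).
try:
    from airflow.exceptions import AirflowClusterPolicyViolation
except ImportError:
    # Stand-in so the sketch can run without Airflow installed.
    class AirflowClusterPolicyViolation(Exception):
        """Placeholder for airflow.exceptions.AirflowClusterPolicyViolation."""


def dag_policy(dag):
    """Reject any DAG without tags (an example rule chosen for illustration)."""
    if not getattr(dag, "tags", None):
        raise AirflowClusterPolicyViolation(
            f"DAG {dag.dag_id} has no tags; at least one tag is required."
        )
```

A compliant DAG passes through unchanged; a non-compliant one raises, which per the docstring means it should not be loaded.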



##########
airflow/policies.py:
##########
@@ -0,0 +1,242 @@
+"""
+If you want to check or mutate DAGs or Tasks on a cluster-wide level, then a Cluster Policy will let you do
+that. They have three main purposes:
+
+* Checking that DAGs/Tasks meet a certain standard
+* Setting default arguments on DAGs/Tasks
+* Performing custom routing logic
+
+There are three main types of cluster policy:

Review Comment:
   ```suggestion
   There are three types of cluster policies:
   ```
   
   "Main" suggests there are more than three types. I would remove it.



##########
airflow/policies.py:
##########
@@ -0,0 +1,242 @@
+"""
+If you want to check or mutate DAGs or Tasks on a cluster-wide level, then a Cluster Policy will let you do
+that. They have three main purposes:
+
+* Checking that DAGs/Tasks meet a certain standard
+* Setting default arguments on DAGs/Tasks
+* Performing custom routing logic
+
+There are three main types of cluster policy:
+
+* ``dag_policy``: Takes a :class:`~airflow.models.dag.DAG` parameter called ``dag``. Runs at load time of the
+  DAG from DagBag :class:`~airflow.models.dagbag.DagBag`.
+* ``task_policy``: Takes a :class:`~airflow.models.baseoperator.BaseOperator` parameter called ``task``. The
+  policy gets executed when the task is created during parsing of the task from DagBag at load time. This
+  means that the whole task definition can be altered in the task policy. It does not relate to a specific
+  task running in a DagRun. The ``task_policy`` defined is applied to all the task instances that will be
+  executed in the future.
+* ``task_instance_mutation_hook``: Takes a :class:`~airflow.models.taskinstance.TaskInstance` parameter called
+  ``task_instance``. The ``task_instance_mutation`` applies not to a task but to the instance of a task that

Review Comment:
   ```suggestion
     ``task_instance``. The ``task_instance_mutation_hook`` applies not to a task but to the instance of a task that
   ```
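
To make the task versus task-instance distinction concrete, here is a hedged sketch of a `task_instance_mutation_hook` that moves retries onto a separate queue. It assumes the task instance exposes `try_number` and `queue` attributes. The queue name and the threshold are hypothetical, and because the way `try_number` counts attempts may differ between Airflow versions, the threshold should be treated as an assumption.

```python
RETRY_QUEUE = "retry_queue"  # hypothetical queue name
RETRY_THRESHOLD = 2  # assumption: try_number is 1 on the first attempt


def task_instance_mutation_hook(task_instance):
    """Send retries to a separate queue; first attempts keep their queue."""
    if task_instance.try_number >= RETRY_THRESHOLD:
        task_instance.queue = RETRY_QUEUE
```

Unlike a `task_policy`, this changes only the instance that is about to run, not the task definition itself.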


