Posted to commits@airflow.apache.org by pi...@apache.org on 2023/01/12 00:23:28 UTC

[airflow] branch v2-5-test updated (e420172865 -> 2fa9474cc0)

This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


    from e420172865 Limit SQLAlchemy to below 2.0 (#28725)
     new a00f089f45 Clarify about docker compose (#28729)
     new 8202a4cc9f Update CSRF token to expire with session (#28730)
     new 228961c9b9 Clarify that versioned constraints are fixed at release time (#28762)
     new 2fa9474cc0 Only patch single label when adopting pod (#28776)

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../config_templates/default_webserver_config.py   |  1 +
 airflow/executors/kubernetes_executor.py           | 17 +++--
 docs/apache-airflow/howto/docker-compose/index.rst |  3 +
 .../installation/installing-from-pypi.rst          | 34 ++++++++-
 docs/docker-stack/index.rst                        | 35 ++++++++++
 tests/executors/test_kubernetes_executor.py        | 80 +++++++++++++++++++---
 6 files changed, 149 insertions(+), 21 deletions(-)


[airflow] 03/04: Clarify that versioned constraints are fixed at release time (#28762)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 228961c9b9fe209fd179f3122b2b93cdf455cd79
Author: Jarek Potiuk <ja...@potiuk.com>
AuthorDate: Fri Jan 6 10:29:28 2023 +0100

    Clarify that versioned constraints are fixed at release time (#28762)
    
    We received a number of requests to upgrade individual dependencies in
    the constraint files (mostly because those dependencies released versions
    with vulnerability fixes). This is not how our constraints work; their
    main purpose is to provide a "consistent installation" mechanism for
    anyone who installs Airflow from scratch. We are not going to keep the
    constraints of already-released versions up-to-date with versions of
    dependencies released after the release.
    
    This PR provides additional explanation about that, both for the
    constraint files and for the reference container images, which follow
    a similar pattern.
    
    (cherry picked from commit 8290ade26deba02ca6cf3d8254981b31cf89ee5b)
---
 .../installation/installing-from-pypi.rst          | 34 ++++++++++++++++++++-
 docs/docker-stack/index.rst                        | 35 ++++++++++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/docs/apache-airflow/installation/installing-from-pypi.rst b/docs/apache-airflow/installation/installing-from-pypi.rst
index 3ec492b0ed..e6cf162bc3 100644
--- a/docs/apache-airflow/installation/installing-from-pypi.rst
+++ b/docs/apache-airflow/installation/installing-from-pypi.rst
@@ -53,7 +53,7 @@ and both at the same time. We decided to keep our dependencies as open as possib
 version of libraries if needed. This means that from time to time plain ``pip install apache-airflow`` will
 not work or will produce an unusable Airflow installation.
 
-In order to have a repeatable installation, we also keep a set of "known-to-be-working" constraint files in the
+In order to have a repeatable installation (and only for that reason), we also keep a set of "known-to-be-working" constraint files in the
 ``constraints-main``, ``constraints-2-0``, ``constraints-2-1`` etc. orphan branches and then we create a tag
 for each released version e.g. :subst-code:`constraints-|version|`. This way, we keep a tested and working set of dependencies.
 
@@ -88,6 +88,38 @@ constraints always points to the "latest" released Airflow version constraints:
 
   https://raw.githubusercontent.com/apache/airflow/constraints-latest/constraints-3.7.txt
 
+
+Fixing constraint files at release time
+'''''''''''''''''''''''''''''''''''''''
+
+The released "versioned" constraints are mostly ``fixed`` when we release an Airflow version, and we only
+update them in exceptional circumstances - for example, when we find out that the released constraints might
+prevent Airflow from being installed consistently from scratch. In normal circumstances, the constraint files
+are not going to change when new versions of Airflow dependencies are released - not even when those
+versions contain critical security fixes. The Airflow release process is designed around upgrading
+dependencies automatically where applicable, but only when we release a new version of Airflow,
+not for already released versions.
+
+If you want to make sure that Airflow dependencies are upgraded to the latest released versions containing
+the latest security fixes, you should implement your own process to upgrade them yourself when
+you detect the need for it. Airflow usually does not upper-bound versions of its dependencies via
+requirements, so you should be able to upgrade them to the latest versions - usually without any problems.
+
+Obviously, since we have no control over what gets released in new versions of the dependencies, we
+cannot guarantee that the tests and functionality of those dependencies will be compatible with
+Airflow after you upgrade them - testing whether Airflow still works with them is in your hands,
+and in case of any problems you should raise an issue with the authors of the problematic dependencies.
+In such cases you can also search `Airflow issues <https://github.com/apache/airflow/issues>`_,
+`Airflow Pull Requests <https://github.com/apache/airflow/pulls>`_ and
+`Airflow Discussions <https://github.com/apache/airflow/discussions>`_ for similar
+problems, to see if there are any fixes or workarounds found in the ``main`` version of Airflow that you can apply
+to your deployment.
+
+The easiest way to keep up with the latest released dependencies is, however, to upgrade to the latest released
+Airflow version. Whenever we release a new version of Airflow, we upgrade all dependencies to the latest
+applicable versions and test them together, so if you want to benefit from those tests, staying up-to-date
+with the latest version of Airflow is the easiest way to update those dependencies.
+
 Installation and upgrade scenarios
 ''''''''''''''''''''''''''''''''''
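
For illustration, here is a minimal sketch of the pattern described above,
assuming Airflow 2.5.0 and that "example-dependency" is a hypothetical name
standing in for a real package you need to upgrade after installation:

    import subprocess
    import sys

    AIRFLOW_VERSION = "2.5.0"
    PYTHON_VERSION = f"{sys.version_info.major}.{sys.version_info.minor}"
    CONSTRAINTS_URL = (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{AIRFLOW_VERSION}/constraints-{PYTHON_VERSION}.txt"
    )

    # Reproducible base install, pinned to the constraints fixed at release time.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install",
         f"apache-airflow=={AIRFLOW_VERSION}", "--constraint", CONSTRAINTS_URL]
    )

    # Constraint files are not refreshed after release, so pulling in a
    # dependency's later security fix is your own process (hypothetical name).
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--upgrade", "example-dependency"]
    )

Upgrading without the constraint file, as in the second call, is what the docs
above mean by "implementing your own process": you trade reproducibility for
the newer dependency release and take on the compatibility testing yourself.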
 
diff --git a/docs/docker-stack/index.rst b/docs/docker-stack/index.rst
index be8877be06..d54c374b70 100644
--- a/docs/docker-stack/index.rst
+++ b/docs/docker-stack/index.rst
@@ -83,6 +83,41 @@ are also images published from branches but they are used mainly for development
 See `Airflow Git Branching <https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#airflow-git-branches>`_
 for details.
 
+Fixing images at release time
+=============================
+
+The released "versioned" reference images are mostly ``fixed`` when we release an Airflow version, and we only
+update them in exceptional circumstances - for example, when we find out that there are dependency errors
+that might prevent important Airflow or embedded provider functionality from working. In normal circumstances,
+the images are not going to change after release, even when new versions of Airflow dependencies are released -
+not even when those versions contain critical security fixes. The Airflow release process is designed
+around upgrading dependencies automatically where applicable, but only when we release a new version of Airflow,
+not for already released versions.
+
+If you want to make sure that the Airflow dependencies in the image you use are upgraded to the latest
+released versions containing the latest security fixes, you should implement your own process to upgrade
+them yourself when you build a custom image based on the Airflow reference one. Airflow usually does not
+upper-bound versions of its dependencies via requirements, so you should be able to upgrade them to the
+latest versions - usually without any problems. You can follow the process described in
+:ref:`Building the image <build:build_image>` to do it (even in an automated way).
+
+Obviously, since we have no control over what gets released in new versions of the dependencies, we
+cannot guarantee that the tests and functionality of those dependencies will be compatible with
+Airflow after you upgrade them - testing whether Airflow still works with them is in your hands,
+and in case of any problems you should raise an issue with the authors of the problematic dependencies.
+In such cases you can also search `Airflow issues <https://github.com/apache/airflow/issues>`_,
+`Airflow Pull Requests <https://github.com/apache/airflow/pulls>`_ and
+`Airflow Discussions <https://github.com/apache/airflow/discussions>`_ for similar
+problems, to see if there are any fixes or workarounds found in the ``main`` version of Airflow that you can apply
+to your custom image.
+
+The easiest way to keep up with the latest released dependencies is, however, to upgrade to the latest released
+Airflow version by switching to newly released images as the base for your images whenever a new version of
+Airflow is released. Whenever we release a new version of Airflow, we upgrade all dependencies to the latest
+applicable versions and test them together, so if you want to benefit from those tests, staying up-to-date
+with the latest version of Airflow is the easiest way to update those dependencies.
+
+
 Support
 =======
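
As a sketch of the custom-image approach described above (the base tag, target
tag and "example-dependency" package name are illustrative assumptions, and a
working docker installation is required):

    import os
    import subprocess
    import tempfile

    # Dockerfile: start from the fixed reference image and upgrade only the
    # dependency that shipped the fix you need (hypothetical package name).
    dockerfile = """\
    FROM apache/airflow:2.5.0
    RUN pip install --no-cache-dir --upgrade example-dependency
    """

    with tempfile.TemporaryDirectory() as ctx:
        with open(os.path.join(ctx, "Dockerfile"), "w") as f:
            f.write(dockerfile)
        # Rebuild and retag whenever you decide newer dependencies are needed.
        subprocess.check_call(
            ["docker", "build", "-t", "my-org/airflow:2.5.0-custom", ctx]
        )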
 


[airflow] 02/04: Update CSRF token to expire with session (#28730)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 8202a4cc9f873d496c5f84b94fbca2c38fffda4f
Author: Max Ho <ma...@gmail.com>
AuthorDate: Wed Jan 11 07:25:29 2023 +0800

    Update CSRF token to expire with session (#28730)
    
    (cherry picked from commit 543e9a592e6b9dc81467c55169725e192fe95e89)
---
 airflow/config_templates/default_webserver_config.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/airflow/config_templates/default_webserver_config.py b/airflow/config_templates/default_webserver_config.py
index ac999a0dea..aa22b125fa 100644
--- a/airflow/config_templates/default_webserver_config.py
+++ b/airflow/config_templates/default_webserver_config.py
@@ -32,6 +32,7 @@ basedir = os.path.abspath(os.path.dirname(__file__))
 
 # Flask-WTF flag for CSRF
 WTF_CSRF_ENABLED = True
+WTF_CSRF_TIME_LIMIT = None
 
 # ----------------------------------------------------
 # AUTHENTICATION CONFIG
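
For context, a minimal sketch of what this setting changes, shown in a bare
Flask app (assumes Flask and Flask-WTF are installed; the secret key is a
placeholder):

    from flask import Flask
    from flask_wtf import CSRFProtect

    app = Flask(__name__)
    app.config["SECRET_KEY"] = "change-me"  # placeholder, used to sign tokens
    # None ties the CSRF token lifetime to the user session instead of
    # Flask-WTF's default fixed expiry of 3600 seconds.
    app.config["WTF_CSRF_TIME_LIMIT"] = None
    CSRFProtect(app)

With the default time limit, a user who keeps a form open for more than an
hour gets a CSRF failure on submit even though their session is still valid;
tying the token to the session avoids that class of error.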


[airflow] 04/04: Only patch single label when adopting pod (#28776)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 2fa9474cc061152d7705a93ede5fe4bcfa49f1e1
Author: Jed Cunningham <66...@users.noreply.github.com>
AuthorDate: Mon Jan 9 18:07:56 2023 -0600

    Only patch single label when adopting pod (#28776)
    
    When adopting pods, the KubernetesExecutor used to patch each pod with
    the full pod object it had retrieved from the k8s API, even though it
    only needed to update a single label. Normally this works just fine, but
    there are cases where the pod you pull from the k8s API can't be used
    as-is when patching - it results in a 422 `Forbidden: pod updates may
    not change fields other than ...`.

    Instead, we now pass only the single label we need to update to the
    patch call, allowing us to avoid accidentally "updating" other fields.
    
    Closes #24015
    
    (cherry picked from commit 9922953bcd9e11a1412a3528aef938444d62f7fe)
---
 airflow/executors/kubernetes_executor.py    | 17 +++---
 tests/executors/test_kubernetes_executor.py | 80 +++++++++++++++++++++++++----
 2 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/airflow/executors/kubernetes_executor.py b/airflow/executors/kubernetes_executor.py
index 65e463a948..28f720f35e 100644
--- a/airflow/executors/kubernetes_executor.py
+++ b/airflow/executors/kubernetes_executor.py
@@ -636,7 +636,6 @@ class KubernetesExecutor(BaseExecutor):
                     )
                     self.fail(task[0], e)
                 except ApiException as e:
-
                     # These codes indicate something is wrong with pod definition; otherwise we assume pod
                     # definition is ok, and that retrying may work
                     if e.status in (400, 422):
@@ -748,27 +747,28 @@ class KubernetesExecutor(BaseExecutor):
             assert self.scheduler_job_id
 
         self.log.info("attempting to adopt pod %s", pod.metadata.name)
-        pod.metadata.labels["airflow-worker"] = pod_generator.make_safe_label_value(self.scheduler_job_id)
         pod_id = annotations_to_key(pod.metadata.annotations)
         if pod_id not in pod_ids:
             self.log.error("attempting to adopt taskinstance which was not specified by database: %s", pod_id)
             return
 
+        new_worker_id_label = pod_generator.make_safe_label_value(self.scheduler_job_id)
         try:
             kube_client.patch_namespaced_pod(
                 name=pod.metadata.name,
                 namespace=pod.metadata.namespace,
-                body=PodGenerator.serialize_pod(pod),
+                body={"metadata": {"labels": {"airflow-worker": new_worker_id_label}}},
             )
-            pod_ids.pop(pod_id)
-            self.running.add(pod_id)
         except ApiException as e:
             self.log.info("Failed to adopt pod %s. Reason: %s", pod.metadata.name, e)
+            return
+
+        del pod_ids[pod_id]
+        self.running.add(pod_id)
 
     def _adopt_completed_pods(self, kube_client: client.CoreV1Api) -> None:
         """
-
-        Patch completed pod so that the KubernetesJobWatcher can delete it.
+        Patch completed pods so that the KubernetesJobWatcher can delete them.
 
         :param kube_client: kubernetes client for speaking to kube API
         """
@@ -783,12 +783,11 @@ class KubernetesExecutor(BaseExecutor):
         pod_list = kube_client.list_namespaced_pod(namespace=self.kube_config.kube_namespace, **kwargs)
         for pod in pod_list.items:
             self.log.info("Attempting to adopt pod %s", pod.metadata.name)
-            pod.metadata.labels["airflow-worker"] = new_worker_id_label
             try:
                 kube_client.patch_namespaced_pod(
                     name=pod.metadata.name,
                     namespace=pod.metadata.namespace,
-                    body=PodGenerator.serialize_pod(pod),
+                    body={"metadata": {"labels": {"airflow-worker": new_worker_id_label}}},
                 )
             except ApiException as e:
                 self.log.info("Failed to adopt pod %s. Reason: %s", pod.metadata.name, e)
diff --git a/tests/executors/test_kubernetes_executor.py b/tests/executors/test_kubernetes_executor.py
index 367f1cb2c4..97619225e6 100644
--- a/tests/executors/test_kubernetes_executor.py
+++ b/tests/executors/test_kubernetes_executor.py
@@ -654,20 +654,78 @@ class TestKubernetesExecutor:
         pod_ids = {ti_key: {}}
 
         executor.adopt_launched_task(mock_kube_client, pod=pod, pod_ids=pod_ids)
-        assert mock_kube_client.patch_namespaced_pod.call_args[1] == {
-            "body": {
-                "metadata": {
-                    "labels": {"airflow-worker": "modified"},
-                    "annotations": annotations,
-                    "name": "foo",
-                }
-            },
-            "name": "foo",
-            "namespace": None,
-        }
+        mock_kube_client.patch_namespaced_pod.assert_called_once_with(
+            body={"metadata": {"labels": {"airflow-worker": "modified"}}},
+            name="foo",
+            namespace=None,
+        )
         assert pod_ids == {}
         assert executor.running == {ti_key}
 
+    @mock.patch("airflow.executors.kubernetes_executor.get_kube_client")
+    def test_adopt_launched_task_api_exception(self, mock_kube_client):
+        """We shouldn't think we are running the task if aren't able to patch the pod"""
+        executor = self.kubernetes_executor
+        executor.scheduler_job_id = "modified"
+        annotations = {
+            "dag_id": "dag",
+            "run_id": "run_id",
+            "task_id": "task",
+            "try_number": "1",
+        }
+        ti_key = annotations_to_key(annotations)
+        pod = k8s.V1Pod(metadata=k8s.V1ObjectMeta(name="foo", annotations=annotations))
+        pod_ids = {ti_key: {}}
+
+        mock_kube_client.patch_namespaced_pod.side_effect = ApiException(status=400)
+        executor.adopt_launched_task(mock_kube_client, pod=pod, pod_ids=pod_ids)
+        mock_kube_client.patch_namespaced_pod.assert_called_once_with(
+            body={"metadata": {"labels": {"airflow-worker": "modified"}}},
+            name="foo",
+            namespace=None,
+        )
+        assert pod_ids == {ti_key: {}}
+        assert executor.running == set()
+
+    @mock.patch("airflow.executors.kubernetes_executor.get_kube_client")
+    def test_adopt_completed_pods(self, mock_kube_client):
+        """We should adopt all completed pods from other schedulers"""
+        executor = self.kubernetes_executor
+        executor.scheduler_job_id = "modified"
+        executor.kube_client = mock_kube_client
+        executor.kube_config.kube_namespace = "somens"
+        pod_names = ["one", "two"]
+        mock_kube_client.list_namespaced_pod.return_value.items = [
+            k8s.V1Pod(
+                metadata=k8s.V1ObjectMeta(
+                    name=pod_name,
+                    labels={"airflow-worker": pod_name},
+                    annotations={"some_annotation": "hello"},
+                    namespace="somens",
+                )
+            )
+            for pod_name in pod_names
+        ]
+
+        executor._adopt_completed_pods(mock_kube_client)
+        mock_kube_client.list_namespaced_pod.assert_called_once_with(
+            namespace="somens",
+            field_selector="status.phase=Succeeded",
+            label_selector="kubernetes_executor=True,airflow-worker!=modified",
+        )
+        assert len(pod_names) == mock_kube_client.patch_namespaced_pod.call_count
+        mock_kube_client.patch_namespaced_pod.assert_has_calls(
+            [
+                mock.call(
+                    body={"metadata": {"labels": {"airflow-worker": "modified"}}},
+                    name=pod_name,
+                    namespace="somens",
+                )
+                for pod_name in pod_names
+            ],
+            any_order=True,
+        )
+
     @mock.patch("airflow.executors.kubernetes_executor.get_kube_client")
     def test_not_adopt_unassigned_task(self, mock_kube_client):
         """


[airflow] 01/04: Clarify about docker compose (#28729)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit a00f089f45e096c5b833af088063e543f3c19e9d
Author: eladkal <45...@users.noreply.github.com>
AuthorDate: Wed Jan 4 18:05:58 2023 +0200

    Clarify about docker compose (#28729)
    
    We got several requests to update the syntax (see
    https://github.com/apache/airflow/pull/28728,
    https://github.com/apache/airflow/pull/27792 and
    https://github.com/apache/airflow/pull/28194).
    Let's clarify that this is not a mistake.
    
    (cherry picked from commit df0e4c9ad447377073af1ed60fb0dfad731be059)
---
 docs/apache-airflow/howto/docker-compose/index.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/apache-airflow/howto/docker-compose/index.rst b/docs/apache-airflow/howto/docker-compose/index.rst
index 33e06e4b96..cc7a67e297 100644
--- a/docs/apache-airflow/howto/docker-compose/index.rst
+++ b/docs/apache-airflow/howto/docker-compose/index.rst
@@ -162,6 +162,9 @@ Now you can start all services:
 
     docker compose up
 
+.. note::
+  ``docker-compose`` is the old (v1) syntax; newer Docker installations use the ``docker compose`` (v2) plugin instead. See `Stack Overflow <https://stackoverflow.com/questions/66514436/difference-between-docker-compose-and-docker-compose>`__ for details.
+
 In a second terminal you can check the condition of the containers and make sure that no containers are in an unhealthy condition:
 
 .. code-block:: text