Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/10 21:17:42 UTC

[GitHub] [airflow] kaxil opened a new pull request #15314: Chart: Disable `git-sync` when DAG serialization is enabled

kaxil opened a new pull request #15314:
URL: https://github.com/apache/airflow/pull/15314


   closes https://github.com/apache/airflow/issues/11704
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#issuecomment-817290718


   [The Workflow run](https://github.com/apache/airflow/actions/runs/738069533) is cancelling this PR. Building images for the PR has failed. Follow the workflow link to check the reason.





[GitHub] [airflow] kaxil commented on a change in pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#discussion_r611171506



##########
File path: docs/helm-chart/manage-dags-files.rst
##########
@@ -91,10 +96,27 @@ This option will use a Persistent Volume Claim with an access mode of ``ReadWrit
       # by setting the  dags.persistence.* and dags.gitSync.* values
       # Please refer to values.yaml for details
 
+When using ``apache-airflow>=2.0.0``, :ref:`DAG Serialization <apache-airflow:dag-serialization>` is enabled by default,
+hence Webserver does not need access to DAG files, so you can turn off ``git-sync`` for Webserver by setting
+``dags.gitSync.excludeWebserver`` to ``true``.
+This is also recommended when enabling DAG Serialization for ``apache-airflow>=1.10.7,<2``.

Review comment:
       We had removed the default and have a fallback of `True` for `store_dag_code`
   
   https://github.com/apache/airflow/blob/5da831910c358ecbd7a5c33ee31fe0d909508bea/airflow/settings.py#L460-L462
   
   https://github.com/apache/airflow/blob/5da831910c358ecbd7a5c33ee31fe0d909508bea/airflow/config_templates/default_airflow.cfg#L205-L209
   
   Same for 1.10.11:
   
   https://github.com/apache/airflow/blob/1.10.11/airflow/config_templates/default_airflow.cfg#L237-L242
   
   where the value is the same as `store_serialized_dags`:
   
   https://github.com/apache/airflow/blob/1.10.11/airflow/settings.py#L431-L434







[GitHub] [airflow] github-actions[bot] commented on pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#issuecomment-817303695


   [The Workflow run](https://github.com/apache/airflow/actions/runs/738128839) is cancelling this PR. Building images for the PR has failed. Follow the workflow link to check the reason.





[GitHub] [airflow] ashb commented on pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#issuecomment-817902356


   Could you update this PR title and description please?





[GitHub] [airflow] mik-laj commented on a change in pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#discussion_r611145169



##########
File path: docs/helm-chart/manage-dags-files.rst
##########
@@ -91,10 +96,27 @@ This option will use a Persistent Volume Claim with an access mode of ``ReadWrit
       # by setting the  dags.persistence.* and dags.gitSync.* values
       # Please refer to values.yaml for details
 
+When using ``apache-airflow>=2.0.0``, :ref:`DAG Serialization <apache-airflow:dag-serialization>` is enabled by default,
+hence Webserver does not need access to DAG files, so you can turn off ``git-sync`` for Webserver by setting
+``dags.gitSync.excludeWebserver`` to ``true``.
+This is also recommended when enabling DAG Serialization for ``apache-airflow>=1.10.7,<2``.

Review comment:
       For Airflow 1.10.9 and earlier, DAG files are still needed for the DAG Code tab. For Airflow 1.10.10 and newer (including 2.0.0), `store_dag_code` must also be set to `True` (default value: `False`). DAG Serialization alone is not enough.
   http://airflow.apache.org/docs/apache-airflow/2.0.1/configurations-ref.html#store-dag-code







[GitHub] [airflow] kaxil merged pull request #15314: Chart: Allow disabling `git-sync` for Webserver

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #15314:
URL: https://github.com/apache/airflow/pull/15314


   





[GitHub] [airflow] kaxil commented on a change in pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#discussion_r611171506



##########
File path: docs/helm-chart/manage-dags-files.rst
##########
@@ -91,10 +96,27 @@ This option will use a Persistent Volume Claim with an access mode of ``ReadWrit
       # by setting the  dags.persistence.* and dags.gitSync.* values
       # Please refer to values.yaml for details
 
+When using ``apache-airflow>=2.0.0``, :ref:`DAG Serialization <apache-airflow:dag-serialization>` is enabled by default,
+hence Webserver does not need access to DAG files, so you can turn off ``git-sync`` for Webserver by setting
+``dags.gitSync.excludeWebserver`` to ``true``.
+This is also recommended when enabling DAG Serialization for ``apache-airflow>=1.10.7,<2``.

Review comment:
       We had removed the default and have a fallback of `True` for `store_dag_code` for Airflow >=1.10.11
   
   https://github.com/apache/airflow/blob/5da831910c358ecbd7a5c33ee31fe0d909508bea/airflow/settings.py#L460-L462
   
   https://github.com/apache/airflow/blob/5da831910c358ecbd7a5c33ee31fe0d909508bea/airflow/config_templates/default_airflow.cfg#L205-L209
   
   Same for 1.10.11:
   
   https://github.com/apache/airflow/blob/1.10.11/airflow/config_templates/default_airflow.cfg#L237-L242
   
   where the value is the same as `store_serialized_dags`:
   
   https://github.com/apache/airflow/blob/1.10.11/airflow/settings.py#L431-L434
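
The fallback behaviour described in this comment can be illustrated with a minimal, self-contained sketch. It uses the standard-library `configparser` rather than Airflow's own configuration object, and the helper name `resolve_store_dag_code` and its `legacy_fallback` flag are hypothetical, not part of Airflow. It also assumes that `store_serialized_dags` itself falls back to `False` when unset in 1.10.x.

```python
import configparser


def resolve_store_dag_code(cfg_text: str, legacy_fallback: bool) -> bool:
    """Return the effective ``store_dag_code`` value for a config snippet.

    legacy_fallback=True imitates the 1.10.11 behaviour described above:
    an unset ``store_dag_code`` takes the value of ``store_serialized_dags``.
    legacy_fallback=False imitates the newer behaviour: fall back to True.
    This is an illustration only, not Airflow's actual settings code.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)

    if legacy_fallback:
        # Assumption: store_serialized_dags is False when not configured.
        default = parser.getboolean(
            "core", "store_serialized_dags", fallback=False
        )
    else:
        default = True

    # An explicit store_dag_code entry always wins over the fallback.
    return parser.getboolean("core", "store_dag_code", fallback=default)


if __name__ == "__main__":
    # No explicit setting: the newer behaviour falls back to True.
    print(resolve_store_dag_code("[core]\n", legacy_fallback=False))
    # 1.10.11-style: the value follows store_serialized_dags.
    print(
        resolve_store_dag_code(
            "[core]\nstore_serialized_dags = True\n", legacy_fallback=True
        )
    )
```

Under these assumptions, an explicit `store_dag_code = False` is the only case where the Webserver could not read DAG code from the database once serialization is on.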







[GitHub] [airflow] kaxil commented on a change in pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#discussion_r611172036



##########
File path: docs/helm-chart/manage-dags-files.rst
##########
@@ -91,10 +96,27 @@ This option will use a Persistent Volume Claim with an access mode of ``ReadWrit
       # by setting the  dags.persistence.* and dags.gitSync.* values
       # Please refer to values.yaml for details
 
+When using ``apache-airflow>=2.0.0``, :ref:`DAG Serialization <apache-airflow:dag-serialization>` is enabled by default,
+hence Webserver does not need access to DAG files, so you can turn off ``git-sync`` for Webserver by setting
+``dags.gitSync.excludeWebserver`` to ``true``.
+This is also recommended when enabling DAG Serialization for ``apache-airflow>=1.10.7,<2``.

Review comment:
       I updated the description in https://github.com/apache/airflow/pull/15314/commits/fb10b097e4537e46ecdc46c69bba199162324d71







[GitHub] [airflow] mik-laj commented on a change in pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#discussion_r611215956



##########
File path: docs/helm-chart/manage-dags-files.rst
##########
@@ -91,10 +96,27 @@ This option will use a Persistent Volume Claim with an access mode of ``ReadWrit
       # by setting the  dags.persistence.* and dags.gitSync.* values
       # Please refer to values.yaml for details
 
+When using ``apache-airflow>=2.0.0``, :ref:`DAG Serialization <apache-airflow:dag-serialization>` is enabled by default,
+hence Webserver does not need access to DAG files, so you can turn off ``git-sync`` for Webserver by setting
+``dags.gitSync.excludeWebserver`` to ``true``.
+This is also recommended when enabling DAG Serialization for ``apache-airflow>=1.10.11,<2``.
+
+.. code-block:: bash
+
+    helm upgrade airflow . \
+      --set dags.persistence.enabled=true \
+      --set dags.gitSync.enabled=true \
+      --set dags.gitSync.excludeWebserver=true
+      # you can also override the other persistence or gitSync values
+      # by setting the  dags.persistence.* and dags.gitSync.* values
+      # Please refer to values.yaml for details
+
 Mounting DAGs using Git-Sync sidecar without Persistence
 --------------------------------------------------------
 
-This option will use an always running Git-Sync side car on every scheduler, webserver and worker pods. The Git-Sync side car containers will sync DAGs from a git repository every configured number of seconds. If you are using the KubernetesExecutor, Git-sync will run as an init container on your worker pods.
+This option will use an always running Git-Sync side car on every scheduler, webserver and worker pods.

Review comment:
       ```suggestion
   This option will use an always running Git-Sync sidecar on every scheduler, webserver and worker pods.
   ```

##########
File path: docs/helm-chart/manage-dags-files.rst
##########
@@ -91,10 +96,27 @@ This option will use a Persistent Volume Claim with an access mode of ``ReadWrit
       # by setting the  dags.persistence.* and dags.gitSync.* values
       # Please refer to values.yaml for details
 
+When using ``apache-airflow>=2.0.0``, :ref:`DAG Serialization <apache-airflow:dag-serialization>` is enabled by default,
+hence Webserver does not need access to DAG files, so you can turn off ``git-sync`` for Webserver by setting
+``dags.gitSync.excludeWebserver`` to ``true``.
+This is also recommended when enabling DAG Serialization for ``apache-airflow>=1.10.11,<2``.
+
+.. code-block:: bash
+
+    helm upgrade airflow . \
+      --set dags.persistence.enabled=true \
+      --set dags.gitSync.enabled=true \
+      --set dags.gitSync.excludeWebserver=true
+      # you can also override the other persistence or gitSync values
+      # by setting the  dags.persistence.* and dags.gitSync.* values
+      # Please refer to values.yaml for details
+
 Mounting DAGs using Git-Sync sidecar without Persistence
 --------------------------------------------------------
 
-This option will use an always running Git-Sync side car on every scheduler, webserver and worker pods. The Git-Sync side car containers will sync DAGs from a git repository every configured number of seconds. If you are using the KubernetesExecutor, Git-sync will run as an init container on your worker pods.
+This option will use an always running Git-Sync side car on every scheduler, webserver and worker pods.
+The Git-Sync side car containers will sync DAGs from a git repository every configured number of

Review comment:
       ```suggestion
   The Git-Sync sidecar containers will sync DAGs from a git repository every configured number of
   ```







[GitHub] [airflow] github-actions[bot] commented on pull request #15314: Chart: Allow disabling `git-sync` when DAG Serialization is enabled

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15314:
URL: https://github.com/apache/airflow/pull/15314#issuecomment-817337775


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

