Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/23 17:30:00 UTC

[GitHub] [airflow] bpleines opened a new issue #11789: KubernetesExecutor git-sync example

bpleines opened a new issue #11789:
URL: https://github.com/apache/airflow/issues/11789


   **Apache Airflow version**: 1.10.12
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): `1.18.8`
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release):
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   The included `git-sync` example has a few issues. Addressing them may aid adoption of this powerful config option.
   1. It does not properly share the Kubernetes volume containing the DAGs.
   2. It omits the `GIT_SYNC_ONE_TIME` option, which is necessary for the initContainer to exit after syncing.
   3. `GIT_SYNC_WAIT` is not applicable, because the initContainer should exit immediately after syncing.
   
   **What you expected to happen**:
   
   The `git-sync` initContainer syncs a DAG repository to the shared k8s volume and then exits. The shared k8s volume `airflow-dags` is then consumed by the airflow worker pod. Lastly, because the `git-sync` container always syncs a repo into a nested directory, the destination directory is forced to be named `dags` and the volume is mounted one directory level up on the airflow worker pod.
   
   ```
   apiVersion: v1
   kind: Pod
   metadata:
     name: dummy-name
   spec:
     initContainers:
       - name: git-sync
         image: "k8s.gcr.io/git-sync:v3.1.6"
         env:
           - name: GIT_SYNC_REV
             value: "HEAD"
           - name: GIT_SYNC_BRANCH
             value: "v1-10-stable"
           - name: GIT_SYNC_REPO
             value: "https://github.com/apache/airflow.git"
           - name: GIT_SYNC_DEPTH
             value: "1"
           - name: GIT_SYNC_ROOT
             value: "/git"
           - name: GIT_SYNC_DEST
             value: "dags"
           - name: GIT_SYNC_ADD_USER
             value: "true"
           - name: GIT_SYNC_ONE_TIME
             value: "true"
           - name: GIT_SYNC_MAX_SYNC_FAILURES
             value: "0"
         volumeMounts:
           - name: airflow-dags
             mountPath: /git
     containers:
       - args: []
         command: []
         env:
           - name: AIRFLOW__CORE__EXECUTOR
             value: LocalExecutor
           # Hard Coded Airflow Envs
           - name: AIRFLOW__CORE__FERNET_KEY
             valueFrom:
               secretKeyRef:
                 name: RELEASE-NAME-fernet-key
                 key: fernet-key
           - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
             valueFrom:
               secretKeyRef:
                 name: RELEASE-NAME-airflow-metadata
                 key: connection
           - name: AIRFLOW_CONN_AIRFLOW_DB
             valueFrom:
               secretKeyRef:
                 name: RELEASE-NAME-airflow-metadata
                 key: connection
         envFrom: []
         image: dummy_image
         imagePullPolicy: IfNotPresent
         name: base
         ports: []
         volumeMounts:
           - mountPath: "/opt/airflow/logs"
             name: airflow-logs
           - mountPath: "/opt/airflow"
             name: airflow-dags
             readOnly: false
     hostNetwork: false
     restartPolicy: Never
     securityContext:
       runAsUser: 50000
     nodeSelector:
       {}
     affinity:
       {}
     tolerations:
       []
     serviceAccountName: 'RELEASE-NAME-worker-serviceaccount'
     volumes:
       - name: airflow-dags
         emptyDir: {}
       - emptyDir: {}
         name: airflow-logs
       - configMap:
           name: RELEASE-NAME-airflow-config
         name: airflow-config
       - configMap:
           name: RELEASE-NAME-airflow-config
         name: airflow-local-settings
   ```
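   
   A minimal sketch of how the two mounts in the template above are meant to line up, assuming `GIT_SYNC_ROOT` is `/git` and the worker expects DAGs under `/opt/airflow/dags`. The comments describe the intended layout and have not been verified against a running cluster:
   
   ```
   # Fragment of the pod spec; only the fields relevant to the shared volume.
   initContainers:
     - name: git-sync
       env:
         - name: GIT_SYNC_ROOT
           value: "/git"            # git-sync writes under this directory
         - name: GIT_SYNC_DEST
           value: "dags"            # the checkout is published as /git/dags
       volumeMounts:
         - name: airflow-dags
           mountPath: /git          # volume root is the same path as GIT_SYNC_ROOT
   containers:
     - name: base
       volumeMounts:
         - name: airflow-dags
           mountPath: /opt/airflow  # same volume, so /git/dags appears as /opt/airflow/dags
   volumes:
     - name: airflow-dags
       emptyDir: {}
   ```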
   
   **How to reproduce it**:
   
   Try to use the existing [`git-sync` template](https://github.com/astronomer/airflow/blob/master/airflow/kubernetes/pod_template_file_examples/git_sync_template.yaml).
   
   **Anything else we need to know**:
   
   The latest git-sync container is now version v3.2.0 and can be pulled at `k8s.gcr.io/git-sync/git-sync:v3.2.0`.
   
   In my experience, only four `GIT_SYNC_*` environment variables are needed when pulling a DAGs repo from a public git repository. This assumes that the DAGs are in the top-level directory of the repo; otherwise, mounting them onto the worker pod requires a custom path.
   ```
   - name: GIT_SYNC_BRANCH
     value: "master"
   - name: GIT_SYNC_REPO
     value: "https://github.com/bpleines/airflow-dags"
   - name: GIT_SYNC_DEST
     value: "dags"
   - name: GIT_SYNC_ONE_TIME
     value: "true"
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jedcunningham commented on issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-842702431


   Thanks @bpleines, I somehow missed that this was the example, not the pod template used in the helm chart 🤦‍♂️. #15904 has been opened to fix the example.





[GitHub] [airflow] bpleines commented on issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
bpleines commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-842726050


   I did not see the correlation to the `area:helm-chart` GitHub label either. PR looks great, thanks!





[GitHub] [airflow] kaxil closed issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #11789:
URL: https://github.com/apache/airflow/issues/11789


   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-715475309


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] bpleines commented on issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
bpleines commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-841575921


   @jedcunningham, `master` still uses a [template](https://github.com/astronomer/airflow/blob/master/airflow/kubernetes/pod_template_file_examples/git_sync_template.yaml) with git-sync environment variables that make sense for a sidecar container as opposed to an initContainer. 
   
   `GIT_SYNC_WAIT` is the interval between syncs. `airflow-worker` pods should only sync DAGs once and exit, so `GIT_SYNC_ONE_TIME` should be used instead.
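   
   As a rough sketch of the difference (the repo URL, image tag, and the 60-second interval are illustrative assumptions, not values taken from the template), the two modes differ mainly in where the container is declared and in which of these two variables is set:
   
   ```
   # Option A: initContainer on a worker pod. Sync once, then exit so the worker can start.
   initContainers:
     - name: git-sync
       image: "k8s.gcr.io/git-sync/git-sync:v3.2.0"
       env:
         - name: GIT_SYNC_REPO
           value: "https://github.com/bpleines/airflow-dags"
         - name: GIT_SYNC_ONE_TIME
           value: "true"
   ---
   # Option B: sidecar next to a long-running component. Keep running and re-sync periodically.
   containers:
     - name: git-sync
       image: "k8s.gcr.io/git-sync/git-sync:v3.2.0"
       env:
         - name: GIT_SYNC_REPO
           value: "https://github.com/bpleines/airflow-dags"
         - name: GIT_SYNC_WAIT
           value: "60"   # seconds between syncs (assumed value)
   ```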





[GitHub] [airflow] scheung38 edited a comment on issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
scheung38 edited a comment on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-786779579


   Hello, I have Astronomer Airflow running on Rancher k8s on-prem. Is this upgrade sufficient to mount DAGs using the git-sync sidecar with persistence enabled, i.e. to automatically pick up DAG Python files from another repo each time a Python file is committed?
   
   How do we know which git repo to trigger from?
   
   Is `helm upgrade -f git_sync_template.yaml` all I need? I am getting the error `Error: UPGRADE FAILED: release failed, and has been rolled back due to atomic being set: timed out waiting for the condition`.
   
   I am not sure exactly what needs to change.





[GitHub] [airflow] jedcunningham commented on issue #11789: KubernetesExecutor git-sync example

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-832248025


   @bpleines, can you try with the latest chart in master? I believe it is now functional with KubernetesExecutor.
   
   @scheung38, you might try posting on slack if you are still having issues: https://apache-airflow-slack.herokuapp.com/

