Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/23 17:30:00 UTC
[GitHub] [airflow] bpleines opened a new issue #11789: KubernetesExecutor git-sync example
bpleines opened a new issue #11789:
URL: https://github.com/apache/airflow/issues/11789
**Apache Airflow version**: 1.10.12
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`): `1.18.8`
**Environment**:
- **Cloud provider or hardware configuration**:
- **OS** (e.g. from /etc/os-release):
- **Kernel** (e.g. `uname -a`):
- **Install tools**:
- **Others**:
**What happened**:
The included `git-sync` example has a few issues that, if addressed, may aid adoption of this powerful config option:
1. It does not properly share the Kubernetes volume containing the DAGs.
2. It omits the `GIT_SYNC_ONE_TIME` option, which is necessary for the initContainer to exit after syncing.
3. `GIT_SYNC_WAIT` is not applicable, because the initContainer should exit immediately after syncing.
**What you expected to happen**:
The `git-sync` initContainer syncs a DAG repository to the shared k8s volume and then exits. The shared k8s volume `airflow-dags` is then consumed by the airflow worker pod. Lastly, because the `git-sync` container always syncs a repo into a nested directory, the destination directory is forced to be named `dags` (via `GIT_SYNC_DEST`) and the volume is mounted one directory level up (at `/opt/airflow`) on the airflow worker pod.
```
apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
spec:
  initContainers:
    - name: git-sync
      image: "k8s.gcr.io/git-sync:v3.1.6"
      env:
        - name: GIT_SYNC_REV
          value: "HEAD"
        - name: GIT_SYNC_BRANCH
          value: "v1-10-stable"
        - name: GIT_SYNC_REPO
          value: "https://github.com/apache/airflow.git"
        - name: GIT_SYNC_DEPTH
          value: "1"
        - name: GIT_SYNC_ROOT
          value: "/git"
        - name: GIT_SYNC_DEST
          value: "dags"
        - name: GIT_SYNC_ADD_USER
          value: "true"
        # Env values must be strings, so the boolean is quoted.
        - name: GIT_SYNC_ONE_TIME
          value: "true"
        - name: GIT_SYNC_MAX_SYNC_FAILURES
          value: "0"
      volumeMounts:
        - name: airflow-dags
          mountPath: /git
  containers:
    - args: []
      command: []
      env:
        - name: AIRFLOW__CORE__EXECUTOR
          value: LocalExecutor
        # Hard Coded Airflow Envs
        - name: AIRFLOW__CORE__FERNET_KEY
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-fernet-key
              key: fernet-key
        - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-airflow-metadata
              key: connection
        - name: AIRFLOW_CONN_AIRFLOW_DB
          valueFrom:
            secretKeyRef:
              name: RELEASE-NAME-airflow-metadata
              key: connection
      envFrom: []
      image: dummy_image
      imagePullPolicy: IfNotPresent
      name: base
      ports: []
      volumeMounts:
        - mountPath: "/opt/airflow/logs"
          name: airflow-logs
        # Shared with the git-sync initContainer, mounted one level above dags/.
        - mountPath: "/opt/airflow"
          name: airflow-dags
          readOnly: false
  hostNetwork: false
  restartPolicy: Never
  securityContext:
    runAsUser: 50000
  nodeSelector:
    {}
  affinity:
    {}
  tolerations:
    []
  serviceAccountName: 'RELEASE-NAME-worker-serviceaccount'
  volumes:
    - name: airflow-dags
      emptyDir: {}
    - emptyDir: {}
      name: airflow-logs
    - configMap:
        name: RELEASE-NAME-airflow-config
      name: airflow-config
    - configMap:
        name: RELEASE-NAME-airflow-config
      name: airflow-local-settings
```
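The path mapping in the pod spec above can be sketched with only the shared-volume mounts shown. This is a reduced fragment, not a complete pod spec, and the comments assume that git-sync places the checkout at `$GIT_SYNC_ROOT/$GIT_SYNC_DEST`:

```yaml
# Reduced fragment of the pod spec above: only the shared-volume wiring.
initContainers:
  - name: git-sync
    env:
      - name: GIT_SYNC_ROOT
        value: "/git"
      - name: GIT_SYNC_DEST
        value: "dags"              # checkout is expected at /git/dags
    volumeMounts:
      - name: airflow-dags
        mountPath: /git            # the volume root holds the dags/ directory
containers:
  - name: base
    volumeMounts:
      - name: airflow-dags
        mountPath: "/opt/airflow"  # same volume, so dags/ should appear as /opt/airflow/dags
volumes:
  - name: airflow-dags
    emptyDir: {}
```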
**How to reproduce it**:
Try to use the existing [`git-sync` template](https://github.com/astronomer/airflow/blob/master/airflow/kubernetes/pod_template_file_examples/git_sync_template.yaml).
**Anything else we need to know**:
The latest git-sync container is now version v3.2.0 and can be pulled at `k8s.gcr.io/git-sync/git-sync:v3.2.0`.
In my experience, only four `GIT_SYNC_*` environment variables are needed when pulling a DAGs repo from a public git repository. We have to assume that the DAGs are present in the top-level directory of the DAGs repo; otherwise, mounting them onto the worker pod requires a custom path.
```
- name: GIT_SYNC_BRANCH
  value: "master"
- name: GIT_SYNC_REPO
  value: "https://github.com/bpleines/airflow-dags"
- name: GIT_SYNC_DEST
  value: "dags"
- name: GIT_SYNC_ONE_TIME
  value: "true"
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] jedcunningham commented on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-842702431
Thanks @bpleines, I somehow missed that this was the example, not the pod template used in the helm chart 🤦‍♂️. #15904 opened to fix the example.
--
[GitHub] [airflow] bpleines commented on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
bpleines commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-842726050
I did not see the correlation to the `area:helm-chart` GitHub label either. PR looks great, thanks!
--
[GitHub] [airflow] kaxil closed issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
kaxil closed issue #11789:
URL: https://github.com/apache/airflow/issues/11789
--
[GitHub] [airflow] scheung38 commented on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
scheung38 commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-786779579
Hello, I have Astronomer Airflow running on Rancher k8s on-prem. Is this upgrade sufficient to mount DAGs using the git-sync sidecar with persistence enabled, i.e. so that it automatically picks up your DAG Python files from another repo each time a Python file is committed?
Is `helm upgrade -f git_sync_template.yaml` all I need?
----------------------------------------------------------------
[GitHub] [airflow] scheung38 edited a comment on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
scheung38 edited a comment on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-786779579
Hello, I have Astronomer Airflow running on Rancher k8s on-prem. Is this upgrade sufficient to mount DAGs using the git-sync sidecar with persistence enabled, i.e. so that it automatically picks up your DAG Python files from another repo each time a Python file is committed?
How do we know which git repo to trigger from?
Is `helm upgrade -f git_sync_template.yaml` all I need?
----------------------------------------------------------------
[GitHub] [airflow] boring-cyborg[bot] commented on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-715475309
Thanks for opening your first issue here! Be sure to follow the issue template!
----------------------------------------------------------------
[GitHub] [airflow] bpleines commented on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
bpleines commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-841575921
@jedcunningham, `master` still uses a [template](https://github.com/astronomer/airflow/blob/master/airflow/kubernetes/pod_template_file_examples/git_sync_template.yaml) with git-sync environment variables that make sense for a sidecar container as opposed to an initContainer.
`GIT_SYNC_WAIT` is the interval between syncs. `airflow-worker` pods should only sync DAGs once and exit, so `GIT_SYNC_ONE_TIME` should be used instead.
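As a rough sketch of that difference, the two fragments below contrast the sidecar and initContainer styles. The image and repo are copied from the example in this issue; the `GIT_SYNC_WAIT` value of `"60"` is an illustrative assumption and is not taken from the template:

```yaml
# Sidecar style: runs next to a long-lived container and re-syncs on an interval.
containers:
  - name: git-sync
    image: "k8s.gcr.io/git-sync:v3.1.6"
    env:
      - name: GIT_SYNC_REPO
        value: "https://github.com/apache/airflow.git"
      - name: GIT_SYNC_WAIT        # interval between syncs; "60" is an assumed value
        value: "60"
---
# initContainer style: syncs once and exits so the worker container can start.
initContainers:
  - name: git-sync
    image: "k8s.gcr.io/git-sync:v3.1.6"
    env:
      - name: GIT_SYNC_REPO
        value: "https://github.com/apache/airflow.git"
      - name: GIT_SYNC_ONE_TIME    # exit after the first successful sync
        value: "true"
```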
--
[GitHub] [airflow] scheung38 edited a comment on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
scheung38 edited a comment on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-786779579
Hello, I have Astronomer Airflow running on Rancher k8s on-prem. Is this upgrade sufficient to mount DAGs using the git-sync sidecar with persistence enabled, i.e. so that it automatically picks up your DAG Python files from another repo each time a Python file is committed?
How do we know which git repo to trigger from?
Is `helm upgrade -f git_sync_template.yaml` all I need? I am getting the error `Error: UPGRADE FAILED: release failed, and has been rolled back due to atomic being set: timed out waiting for the condition`
and am not sure exactly what needs to change.
----------------------------------------------------------------
[GitHub] [airflow] jedcunningham commented on issue #11789: KubernetesExecutor git-sync example
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on issue #11789:
URL: https://github.com/apache/airflow/issues/11789#issuecomment-832248025
@bpleines, can you try with the latest chart in master? I believe it is now functional with KubernetesExecutor.
@scheung38, you might try posting on Slack if you are still having issues: https://apache-airflow-slack.herokuapp.com/
--