Posted to dev@airflow.apache.org by "jordan.zucker@gmail.com" <jo...@gmail.com> on 2018/01/12 23:07:53 UTC

Airflow + Kubernetes + Git Sync

I'm trying to use Airflow and Kubernetes and having trouble using git sync to pull DAGs into workers.

I use a git sync init container on the scheduler to pull in DAGs initially, and that works. But when worker pods are spawned, they terminate almost immediately because they cannot find the DAGs. And since the workers terminate so quickly, I can't even inspect the file structure to see where the DAGs ended up after the worker's git sync init container ran.

I noticed that the git sync init container for the workers is hard-coded to /tmp/dags, and there is a git_subpath config setting as well. But I can't understand how the git-synced DAGs ever end up in /root/airflow/dags.

I am successfully using a git sync init container for the scheduler, so I know my git credentials are valid. Any good way to debug this? Or an example of how to set this up correctly?

Re: Airflow + Kubernetes + Git Sync

Posted by Laura Lorenz <ll...@industrydive.com>.
This is a different tack entirely, but what we do is bake our DAGs etc.
into the images we use for our deploy, which we push to Google Container
Registry. We use Helm, a package manager on top of Kubernetes, to
interpolate the image name into the deployment files on release. We
basically have a `make deploy` type setup that 1. builds the containers
with the code from a certain commit baked into them, 2. pushes them to GCR
tagged with an environment and git hash, and 3. runs a helm upgrade with
the name of that image interpolated in the right spots. So, not helpful
regarding git syncing, but it's another deployment tack to try that's been
successful for us on GKE.
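For illustration, a hedged sketch of what such a `make deploy` flow might look like as a shell script. The project, registry path, and chart names here are invented placeholders, not Laura's actual setup; the build/push/release commands are left commented since they need a real registry and cluster:

```shell
# Sketch of a bake-DAGs-into-the-image deploy, under assumed names.
ENVIRONMENT="staging"
GIT_SHA="abc1234"                      # in practice: $(git rev-parse --short HEAD)
IMAGE="gcr.io/my-project/airflow:${ENVIRONMENT}-${GIT_SHA}"

# 1. build the container with the DAGs from this commit baked in
# docker build -t "$IMAGE" .
# 2. push it to GCR, tagged with environment and git hash
# docker push "$IMAGE"
# 3. release, interpolating the image name into the deployment via helm
# helm upgrade airflow ./chart --set image="$IMAGE"
echo "$IMAGE"
```

The tag scheme (environment plus git hash) is what makes each release traceable back to a commit.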

Laura


Re: Airflow + Kubernetes + Git Sync

Posted by "jordan.zucker@gmail.com" <jo...@gmail.com>.

On 2018-01-12 16:17, Anirudh Ramanathan <ra...@google.com.INVALID> wrote: 
> > Any good way to debug this?
> 
> One way might be reading the events from "kubectl get events". That should
> reveal some information about the pod removal event.
> This brings up another question - should errored pods be persisted for
> debugging?
Anirudh, in my case the pods are persisted because I set `delete_worker_pods` to False in the Airflow ConfigMap; however, I cannot exec into them because they terminated on an error and are no longer running.

Re: Airflow + Kubernetes + Git Sync

Posted by "jordan.zucker@gmail.com" <jo...@gmail.com>.

On 2018-01-13 08:12, Daniel Imberman <da...@gmail.com> wrote: 
> @jordan can you turn delete mode off and post the kubectl describe results
> for the workers?
Already had delete mode turned off. This was a really useful command. I can see the basic logs in the k8s dashboard:
+ airflow run jordan_dag_3 run_this_1 2018-01-15T10:00:00 --local -sd /root/airflow/dags/jordan3.py
[2018-01-15 19:40:52,978] {__init__.py:46} INFO - Using executor LocalExecutor
[2018-01-15 19:40:53,012] {models.py:187} INFO - Filling up the DagBag from /root/airflow/dags/jordan3.py
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 27, in <module>
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 350, in run
    dag = get_dag(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 128, in get_dag
    'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: jordan_dag_3. Either the dag did not exist or it failed to parse.

I know the DAG is there in the scheduler and the webserver. I have reason to believe the git-sync init container in the worker isn't checking out the files in a way that the worker can use. Here's the info you requested:

Name:         jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c
Namespace:    default
Node:         minikube/192.168.99.100
Start Time:   Mon, 15 Jan 2018 11:59:57 -0800
Labels:       airflow-slave=
              dag_id=jordan_dag_3
              execution_date=2018-01-15T19_59_54.838835
              task_id=run_this_1
Annotations:  pod.alpha.kubernetes.io/init-container-statuses=[{"name":"git-sync-clone","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2018-01-15T19:59:58Z","finishedAt":"2018-01-15T19:59:59Z...
              pod.alpha.kubernetes.io/init-containers=[{"name":"git-sync-clone","image":"gcr.io/google-containers/git-sync-amd64:v2.0.5","env":[{"name":"GIT_SYNC_REPO","value":"<our git repo>...
              pod.beta.kubernetes.io/init-container-statuses=[{"name":"git-sync-clone","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2018-01-15T19:59:58Z","finishedAt":"2018-01-15T19:59:59Z"...
              pod.beta.kubernetes.io/init-containers=[{"name":"git-sync-clone","image":"gcr.io/google-containers/git-sync-amd64:v2.0.5","env":[{"name":"GIT_SYNC_REPO","value":"https://github.com/pubnub/caravan.git"...
Status:       Failed
IP:           
Init Containers:
  git-sync-clone:
    Container ID:   docker://c3dcc435d18362271fe5ab8098275d082c01ab36fc451d695e6e0e54ad71132a
    Image:          gcr.io/google-containers/git-sync-amd64:v2.0.5
    Image ID:       docker-pullable://gcr.io/google-containers/git-sync-amd64@sha256:904833aedf3f14373e73296240ed44d54aecd4c02367b004452dfeca2465e5bf
    Port:           <none>
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 15 Jan 2018 11:59:58 -0800
      Finished:     Mon, 15 Jan 2018 11:59:59 -0800
    Ready:          True
    Restart Count:  0
    Environment:
      GIT_SYNC_REPO:      <dag repo>
      GIT_SYNC_BRANCH:    master
      GIT_SYNC_ROOT:      /tmp
      GIT_SYNC_DEST:      dags
      GIT_SYNC_ONE_TIME:  true
      GIT_SYNC_USERNAME:  jzucker2
      GIT_SYNC_PASSWORD:  <password>
    Mounts:
      /root/airflow/airflow.cfg from airflow-config (ro)
      /root/airflow/dags/ from airflow-dags (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-0bq1k (ro)
Containers:
  base:
    Container ID:  <container id>
    Image:         <image>
    Image ID:      <our image id>
    Port:          <none>
    Command:
      bash
      -cx
      --
    Args:
      airflow run jordan_dag_3 run_this_1 2018-01-15T19:59:54.838835 --local -sd /root/airflow/dags/jordan3.py
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 15 Jan 2018 12:00:00 -0800
      Finished:     Mon, 15 Jan 2018 12:00:01 -0800
    Ready:          False
    Restart Count:  0
    Environment:
      AIRFLOW__CORE__AIRFLOW_HOME:   /root/airflow
      AIRFLOW__CORE__EXECUTOR:       LocalExecutor
      AIRFLOW__CORE__DAGS_FOLDER:    /tmp/dags
      SQL_ALCHEMY_CONN:              <set to the key 'sql_alchemy_conn' in secret 'airflow-secrets'>      Optional: false
      GIT_SYNC_USERNAME:             <set to the key 'username' in secret 'gitsecret'>                    Optional: false
      GIT_SYNC_PASSWORD:             <set to the key 'password' in secret 'gitsecret'>                    Optional: false
      AIRFLOW_CONN_PORTAL_DB_URI:    <set to the key 'portal_mysql_conn' in secret 'portaldbsecret'>      Optional: false
      AIRFLOW_CONN_OVERMIND_DB_URI:  <set to the key 'overmind_mysql_conn' in secret 'overminddbsecret'>  Optional: false
    Mounts:
      /root/airflow/airflow.cfg from airflow-config (ro)
      /root/airflow/dags/ from airflow-dags (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-0bq1k (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  airflow-dags:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  airflow-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      airflow-configmap
    Optional:  false
  default-token-0bq1k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-0bq1k
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason                 Age   From               Message
  ----    ------                 ----  ----               -------
  Normal  Scheduled              3m    default-scheduler  Successfully assigned jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c to minikube
  Normal  SuccessfulMountVolume  3m    kubelet, minikube  MountVolume.SetUp succeeded for volume "airflow-dags"
  Normal  SuccessfulMountVolume  3m    kubelet, minikube  MountVolume.SetUp succeeded for volume "airflow-config"
  Normal  SuccessfulMountVolume  3m    kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-0bq1k"
  Normal  Pulled                 3m    kubelet, minikube  Container image "gcr.io/google-containers/git-sync-amd64:v2.0.5" already present on machine
  Normal  Created                3m    kubelet, minikube  Created container
  Normal  Started                3m    kubelet, minikube  Started container
  Normal  Pulled                 3m    kubelet, minikube  Container image "artifactnub1-docker-local.jfrog.io/pubnub/pnairflow:0.1.6" already present on machine
  Normal  Created                3m    kubelet, minikube  Created container
  Normal  Started                3m    kubelet, minikube  Started container
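One possible reading of the describe output above (a hedged interpretation, not confirmed in this thread): git-sync writes its checkout to $GIT_SYNC_ROOT/$GIT_SYNC_DEST, i.e. /tmp/dags, but the shared airflow-dags emptyDir volume is mounted at /root/airflow/dags/ in both containers, so the clone may never land inside the volume the worker actually reads from. A small shell sketch of the mismatch:

```shell
# Paths taken from the `kubectl describe` output above.
GIT_SYNC_ROOT=/tmp
GIT_SYNC_DEST=dags
CLONE_PATH="${GIT_SYNC_ROOT}/${GIT_SYNC_DEST}"   # where git-sync writes
VOLUME_MOUNT="/root/airflow/dags"                # where the shared emptyDir is mounted
if [ "$CLONE_PATH" = "$VOLUME_MOUNT" ]; then
  echo "clone lands inside the shared volume"
else
  echo "MISMATCH: $CLONE_PATH is not the shared volume $VOLUME_MOUNT"
fi
```

If that reading is right, the files cloned to /tmp/dags stay in the init container's own filesystem, which would explain why the worker's DagBag comes up empty even though the init container exits 0.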

Re: Airflow + Kubernetes + Git Sync

Posted by Daniel Imberman <da...@gmail.com>.
@jordan can you turn delete mode off and post the kubectl describe results
for the workers?


Re: Airflow + Kubernetes + Git Sync

Posted by Koen Mevissen <km...@travix.com>.
Are you using Kubernetes on Google Cloud Platform (GKE)?

You should be able to capture the logs from your nodes. If you run GKE
with logging automatically deployed, DaemonSets running fluentd will ship
logs from /var/log/containers on the node to Google Cloud Logging.
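As a hedged illustration, the kubelet typically exposes per-container logs under /var/log/containers as symlinks named `<pod>_<namespace>_<container>-<container-id>.log`. Filling that in with the pod and container names from the describe output earlier in the thread (the container ID here is a truncated placeholder):

```shell
# Assumed log symlink layout; names taken from the describe output, ID truncated.
POD="jordandag3runthis1-8df809f80c874d6ca50acb0d0480307c"
NAMESPACE="default"
CONTAINER="git-sync-clone"
CONTAINER_ID="c3dcc435d183"   # placeholder: real IDs are full 64-char hex digests
LOG_FILE="/var/log/containers/${POD}_${NAMESPACE}_${CONTAINER}-${CONTAINER_ID}.log"
echo "$LOG_FILE"
```

These files persist on the node for a while even after the container exits, which is what lets fluentd (or a manual ssh to the node) recover logs from pods you can no longer exec into.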

Koen

-- 
Kind regards,
Met vriendelijke groet,

*Koen Mevissen*
Principal BI Developer


*Travix Nederland B.V.*
Piet Heinkade 55
1019 GM Amsterdam
The Netherlands

T. +31 (0)20 203 3241
E: KMevissen@travix.com
www.travix.com

*Brands: * CheapTickets  |  Vliegwinkel  |  Vayama  |  BudgetAir  |
 Flugladen

Re: Airflow + Kubernetes + Git Sync

Posted by Anirudh Ramanathan <ra...@google.com.INVALID>.
> Any good way to debug this?

One way might be reading the events from "kubectl get events". That should
reveal some information about the pod removal event.
This brings up another question - should errored pods be persisted for
debugging?

-- 
Anirudh Ramanathan