Posted to dev@airflow.apache.org by Daniel Imberman <da...@gmail.com> on 2018/12/30 01:51:49 UTC

Storing a GKE key in our travis build

Hello all,

I wanted to ask about us moving away from minikube for our Airflow
Kubernetes builds.

1. It takes a long time to build

Building minikube from scratch every time takes the lion's share of our
Travis testing time. If we can just connect to GKE and launch a cluster in
a few seconds, we'll get much faster feedback on whether our changes are
working or not.

2. It doesn't really mimic real-world Kubernetes

We've run into a few bugs in the wild where our minikube-based testing
passed fine but issues only showed up when running on a multi-node
cluster. Many in the k8s community are even moving away from minikube
because of these differences in behavior.

3. It makes local testing a pain

Over the next few months I want to start figuring out ways to make
iterative local testing of Airflow easier. An ideal for k8s-based testing
would be to launch a GKE cluster and just run a script every time we want
to test our changes (a rough sketch of such a script follows below). This
would simplify onboarding new devs and reduce the time to approve PRs.
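
To make that concrete, here's roughly the shape of script I have in mind:
authenticate with a service-account key stored as an encrypted Travis
variable, then point kubectl at an existing cluster. This is only a
sketch; GCP_SA_KEY, the cluster/zone/project names, and the test-script
path are placeholders, not anything that exists today.

    import base64
    import os
    import subprocess
    import tempfile

    def run(*cmd):
        # Run a command and fail the build loudly if it fails.
        subprocess.run(cmd, check=True)

    def connect_to_gke():
        # Decode the service-account key stored as an encrypted Travis env
        # var (GCP_SA_KEY is a placeholder name).
        key_json = base64.b64decode(os.environ["GCP_SA_KEY"])
        with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as f:
            f.write(key_json)
            key_file = f.name
        run("gcloud", "auth", "activate-service-account",
            "--key-file", key_file)
        # Point kubectl at the existing cluster instead of building minikube.
        run("gcloud", "container", "clusters", "get-credentials",
            os.environ.get("AIRFLOW_GKE_CLUSTER", "airflow-ci"),
            "--zone", os.environ.get("AIRFLOW_GKE_ZONE", "us-central1-a"),
            "--project", os.environ.get("AIRFLOW_GKE_PROJECT", "airflow-ci"))

    if __name__ == "__main__":
        connect_to_gke()
        # From here the existing k8s integration tests just use the
        # kubeconfig; this script path is made up.
        run("./scripts/ci/run_kubernetes_tests.sh")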


*Proposal: Create GKE testing that committers can turn on*

When users submit PRs to Kubernetes itself, there seems to be a buildbot
setting that allows the committers to turn on integration testing on the
CloudNative clusters. I'm thinking that if all k8s testing is reformatted
to assume Kubernetes rather than minikube, then users can test as much as
they want on their own local/GKE clusters, and we can turn on GKE before
merging (rough sketch of the on/off switch below).
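
One possible shape for that switch, assuming the GKE key lives in
encrypted Travis variables (which Travis already withholds from fork
PRs); AIRFLOW_RUN_GKE_TESTS is just a made-up name:

    import os
    import sys

    def gke_tests_enabled():
        # Travis sets TRAVIS_SECURE_ENV_VARS to "false" on fork PRs, so the
        # GKE key simply isn't available there and we skip the suite.
        if os.environ.get("TRAVIS_SECURE_ENV_VARS") != "true":
            return False
        # Explicit opt-in a committer could flip for a build; placeholder name.
        return os.environ.get("AIRFLOW_RUN_GKE_TESTS") == "true"

    if __name__ == "__main__":
        if not gke_tests_enabled():
            print("GKE integration tests not enabled for this build; skipping.")
            sys.exit(0)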

Google spoke with us a while back about providing a GKE account for
Airflow integration testing, and I wanted to open this up to the
community in case anyone has strong feelings either way.

Re: Storing a GKE key in our travis build

Posted by Daniel Imberman <da...@gmail.com>.
I got pretty close with docker-in-docker when we were running into
docker-compose issues and wouldn't recommend it in its current form. The
actual code-base changes required for GKE/vanilla Kubernetes wouldn't be
hard. We just need to figure out how to turn it on/off so people can't
run arbitrary code in our GKE cluster (though technically they're already
running arbitrary code in our Travis builds, so 🤷‍♂️).
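
To illustrate why the code-base changes are small: the tests can just
talk to whatever cluster the current kubeconfig context points at,
whether that's minikube, kind, or GKE. A minimal sketch with the official
Kubernetes Python client (an illustration, not what the tests do today):

    from kubernetes import client, config

    def get_kube_client():
        # Honours KUBECONFIG / the current context, so minikube, kind and
        # GKE all go through exactly the same code path.
        config.load_kube_config()
        return client.CoreV1Api()

    if __name__ == "__main__":
        v1 = get_kube_client()
        # Smoke check: list nodes; on GKE this exercises the multi-node case.
        for node in v1.list_node().items:
            print(node.metadata.name)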

On Sun, Dec 30, 2018, 2:06 AM Eamon Keane <ea...@gmail.com> wrote:

> I think that's a good idea.
>
> There are a few kubernetes-in-docker minikube alternatives although I'm not
> sure if they address the issues of speed and ease of local testing and
> they're still alpha.
>
> https://github.com/kubernetes-sigs/kind (single node for now)
> https://github.com/kubernetes-sigs/kubeadm-dind-cluster (multi-node)
>
> For GKE launching a new cluster takes 150-300 seconds (
> https://kubedex.com/google-gke-vs-azure-aks-automation-and-reliability/).
> Is it the intention to connect to an existing cluster or make one from
> scratch each time?
>
> Using one cluster project-wide for testing should be feasible if it's set
> up correctly (if every commit is namespaced ($branch-$sha) and the first
> step of the script is to delete the namespace if it exists). It should be
> reasonably maintenance-free if set up with node auto-provisioning with a
> cap of say 60 cores depending on how many concurrent builds occur.
>
>
> https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning

Re: Storing a GKE key in our travis build

Posted by Eamon Keane <ea...@gmail.com>.
I think that's a good idea.

There are a few Kubernetes-in-Docker alternatives to minikube, although
I'm not sure they address the issues of speed and ease of local testing,
and they're still alpha.

https://github.com/kubernetes-sigs/kind (single node for now)
https://github.com/kubernetes-sigs/kubeadm-dind-cluster (multi-node)

For GKE, launching a new cluster takes 150-300 seconds (
https://kubedex.com/google-gke-vs-azure-aks-automation-and-reliability/).
Is it the intention to connect to an existing cluster or make one from
scratch each time?

Using one cluster project-wide for testing should be feasible if it's set
up correctly: namespace every commit (e.g. $branch-$sha) and make the
first step of the script delete that namespace if it already exists
(rough sketch after the link below). It should be reasonably
maintenance-free if set up with node auto-provisioning, with a cap of say
60 cores depending on how many concurrent builds occur.

https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning
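
Something like this for the namespace-per-commit cleanup, using Travis's
built-in TRAVIS_BRANCH/TRAVIS_COMMIT variables; the "airflow-ci" prefix
and the script itself are just illustrative:

    import os
    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    def fresh_namespace():
        branch = os.environ.get("TRAVIS_BRANCH", "local")
        sha = os.environ.get("TRAVIS_COMMIT", "dev")[:8]
        # Namespace names must be DNS-1123 labels, so normalise the branch.
        ns = "airflow-ci-{}-{}".format(branch, sha).lower().replace("/", "-")
        # First step of the script: delete any leftovers, then recreate.
        run("kubectl", "delete", "namespace", ns,
            "--ignore-not-found", "--wait=true")
        run("kubectl", "create", "namespace", ns)
        return ns

    if __name__ == "__main__":
        print(fresh_namespace())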
