Posted to dev@airflow.apache.org by Daniel Imberman <da...@gmail.com> on 2018/05/14 20:07:42 UTC

Airflow Docker Container

Hi everyone,

I've started looking into creating an official Airflow Docker container
so that users of the KubernetesExecutor can pull it automatically from Helm
charts, deployment YAMLs, etc. What does everyone think is the best way to
do this? Is there an official Apache Docker repo? Is there a preferred
Linux distro?

cc: @anirudh since this was something you had to deal with for spark-on-k8s.

Re: Airflow Docker Container

Posted by Scott Halgrim <sc...@zapier.com.INVALID>.
AstronomerIO has done quite a bit of work on this:

https://github.com/astronomerio/astronomer

https://open.astronomer.io/airflow/index.html

Re: Airflow Docker Container

Posted by Bas Harenslak <ba...@godatadriven.com>.
Great. Unfortunately I’m unable to work on it today due to personal matters, but I will check out the astronomerio Dockerfile ASAP. I think the Docker image should be a “basic” image and thus lightweight, i.e. just enough to get it started on K8s without any configuration.

I’ve forked the docker-library repos and got started on the docs. I’ll put them online tomorrow with a Dockerfile, so that we can get it checked by the docker library people.

Regards, Bas

Re: Airflow Docker Container

Posted by Andy Cooper <an...@astronomer.io>.
As Scott pointed out, we at Astronomer.io have taken the lightweight image
philosophy to heart with our Docker images. Our base image inherits from
Alpine (https://github.com/astronomerio/astronomer) and our Airflow image
(https://github.com/astronomerio/astronomer/blob/master/docker/platform/airflow/Dockerfile)
layers on that. We then have another layer
(https://github.com/astronomerio/astronomer/blob/master/docker/platform/airflow/onbuild/Dockerfile)
which can be used to install system-level packages and Python dependencies
from packages.txt and requirements.txt.
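
A rough sketch of what such an onbuild layer does, written in Python rather
than as a Dockerfile: read packages.txt and requirements.txt and turn them
into install commands. This is an illustration only, not the Astronomer
Dockerfile. The function names are hypothetical, and it assumes an Alpine
base (hence apk) and one entry per line in each file.

```python
from typing import List


def read_entries(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of a packages/requirements file."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            entries.append(stripped)
    return entries


def build_install_commands(packages_txt: str, requirements_txt: str) -> List[List[str]]:
    """Build the install commands an onbuild-style layer might run.

    Hypothetical helper: system packages go to apk (assuming Alpine),
    Python dependencies go to pip. Empty files produce no command.
    """
    commands = []
    system_packages = read_entries(packages_txt)
    if system_packages:
        commands.append(["apk", "add", "--no-cache", *system_packages])
    python_requirements = read_entries(requirements_txt)
    if python_requirements:
        commands.append(["pip", "install", "--no-cache-dir", *python_requirements])
    return commands
```

Each returned list could be passed to subprocess.run inside the image build;
keeping the two files separate means users only touch plain text lists and
never the Dockerfile itself.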

If, for some reason, these images don't check all your boxes, let us know
and we'll see what we can do to make them better for you.

-Andy Cooper

Re: Airflow Docker Container

Posted by Joe Napolitano <jo...@wework.com>.
You may consider this base image we put together at Blue Apron. My fork
fixes a build issue by pinning to pip < 10.

https://github.com/joenap/airflow-base

Joe Nap

Re: Airflow Docker Container

Posted by Daniel Imberman <da...@gmail.com>.
@Fokko

I definitely agree with that. I think that having a "super lightweight"
image for just running a basic Airflow instance makes sense. We could even
name the image something like airflow-k8s so people know it's ONLY meant
to work in a k8s cluster. I'm trying to figure out what methods besides
Helm we should be considering (Helm doesn't really have full saturation in
the k8s world, so I want to see if there are other deployment tools we
should consider).

@Scott Dang, "quite a bit" is definitely an understatement :). Would anyone on
your team have some cycles to work with @jzucker or @sedwards on the
helm/deployment stuff?

Re: Airflow Docker Container

Posted by "Driesprong, Fokko" <fo...@driesprong.frl>.
Hi Daniel,

My dear colleague from GoDataDriven, Bas Harenslak, has started building an
official Docker image on Docker Hub. I've put him in the CC. In the
end I strongly believe the image should end up in the official Docker
repository: https://github.com/docker-library/official-images

Right now, the excellent images provided by Puckel are widely used for
running Airflow in Docker. For the Kubernetes build we need to pull in some
additional dependencies. It may be a good idea to do this separately from
Puckel's image, to keep his images lightweight. Any thoughts?

Kind regards,
Fokko Driesprong


Re: Airflow Docker Container

Posted by Anirudh Ramanathan <ra...@google.com.INVALID>.
@Erik Erlandson <ej...@redhat.com> has had conversations with the ASF Legal
team about publishing Docker images.
Adding him to the thread.

-- 
Anirudh Ramanathan