Posted to dev@airflow.apache.org by Jarek Potiuk <Ja...@polidea.com> on 2019/01/16 21:59:55 UTC

Multi-layered official image for Airflow

Hello Everyone,

Following the discussion we had on a mono-layered vs. multi-layered official
image for Airflow in https://github.com/apache/airflow/pull/4483, I
prepared a proof-of-concept PR of a multi-layered image (based on the
mono-layered one). I also ran some calculations and wrote up my conclusions
in this proposal (I wanted some hard numbers to back the statement
that a multi-layered Dockerfile is better):

https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-10+Multi-layered+official+Airflow+image

The conclusions I reached:

   - The multi-layered image is even slightly smaller than the mono-layered
   one, so it is the better choice even when you download it only once.
   - Regular downloads are much cheaper with the multi-layered image: for a
   simulated user downloading the Airflow image twice a week, the total is
   5.7 GB (multi-layered) vs. 16.15 GB (mono-layered) over the course of
   8 weeks.
   - The multi-layered image is the better choice.
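
To make the repeated-download comparison concrete, here is a rough sketch of
how such a simulation could be modeled. The layer names, sizes, and change
frequencies are hypothetical placeholders, not the measurements from the
proposal; the only input taken from this message is the schedule (two
downloads a week for 8 weeks). The model also assumes the image changes
between every pair of downloads.

```python
# Rough model of repeated image pulls. All layer sizes and change
# frequencies are HYPOTHETICAL; only the schedule (2 pulls/week for
# 8 weeks) comes from the message above.

PULLS = 2 * 8  # twice a week over 8 weeks

# (layer name, size in MB, layer changes every N pulls), listed bottom-up.
# Illustrative values only.
LAYERS = [
    ("os-packages", 400, 16),
    ("pip-dependencies", 350, 8),
    ("airflow-sources", 50, 1),
]


def mono_layered_total(layers, pulls):
    """Single layer: assuming the image changes before every pull,
    each pull re-downloads the whole image."""
    image_size = sum(size for _, size, _ in layers)
    return image_size * pulls


def multi_layered_total(layers, pulls):
    """Separate layers: each pull fetches only the layers that changed.

    A changed layer also invalidates every layer above it, so once one
    layer is marked as changed, all later layers are downloaded too.
    """
    total = 0
    for pull in range(pulls):
        invalidated = False
        for _, size, every in layers:
            if pull % every == 0:
                invalidated = True
            if invalidated:
                total += size
    return total


if __name__ == "__main__":
    mono = mono_layered_total(LAYERS, PULLS)
    multi = multi_layered_total(LAYERS, PULLS)
    print(f"mono-layered:  {mono} MB")
    print(f"multi-layered: {multi} MB")
    print(f"saved: {100 * (1 - multi / mono):.1f}%")
```

The saving in this model depends entirely on how often the large, lower
layers change, which is why the real measurements in the proposal matter more
than these made-up numbers.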


I based those calculations on the PR I prepared:
https://github.com/apache/airflow/pull/4543 where I implemented a rather nice
multi-layered Dockerfile that can be easily maintained.

It's based on my experience with Airflow Breeze
<https://github.com/PolideaInternal/airflow-breeze> - the GCP development
environment we recently used to develop 30+ GCP-based operators.

I hope we can agree as a community that multi-layered is better and that
we can go in this direction :). I am happy to iterate on my PR to make it
even better.

J.


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
E: jarek.potiuk@polidea.com

Re: Multi-layered official image for Airflow

Posted by Jarek Potiuk <Ja...@polidea.com>.
I had a discussion with Gerardo last night and realized that it is
not obvious to everyone how the whole image building works now and how
it is supposed to work with the multi-layered images.

I think pictures might work best, so I quickly drew an architecture
diagram and a "life of an image" diagram. The images and editable
diagrams are now in AIP-10
<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-10+Multi-layered+and+multi-stage+official+Airflow+image>.
I hope they help with grasping the concept.

J.

Principal Software Engineer
Phone: +48660796129


Re: Multi-layered official image for Airflow

Posted by Jarek Potiuk <Ja...@polidea.com>.
After some initial discussion and a suggestion from Daniel, I split the
change into three separate PRs that can be reviewed and merged separately:


   - AIRFLOW-4115 JIRA <https://issues.apache.org/jira/browse/AIRFLOW-4115>,
   PR <https://github.com/apache/airflow/pull/4936> - Docker file for
   Main airflow image is multi-staging and has multiple layers

followed by

   - AIRFLOW-4116 JIRA <https://issues.apache.org/jira/browse/AIRFLOW-4116>,
   PR <https://github.com/apache/airflow/pull/4937> - Support for Main/CI
   images in single Dockerfile

followed by

   - AIRFLOW-4117 JIRA <https://issues.apache.org/jira/browse/AIRFLOW-4117>,
   PR <https://github.com/apache/airflow/pull/4938> - Travis CI uses
   multi-stage Docker image to run tests


J.




Re: Multi-layered official image for Airflow

Posted by Jarek Potiuk <Ja...@polidea.com>.
Hello everyone,

I believe I am now ready to involve more people from the community in the
multi-layered Docker AIP-10 that I have been working on for some time (with
comments and encouragement from Ash and Fokko, as explained in the AIP
thread).

Any comments, questions, critique, improvement proposals, or even help :)
are more than welcome.

The work is still WIP: https://github.com/apache/airflow/pull/4543

The AIP Confluence page (fairly detailed already) is at
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-10+Multi-layered+and+multi-stage+official+Airflow+image
- I think it is the best place for the discussion (as Bas suggested in the
AIP thread).

I am still working on making the tests on Travis green, but I am on the
right track. I'd appreciate any help with it, especially with the Kubernetes
tests, which will likely need some small fixes in the environment or maybe
even a switch to minikube's Docker image in docker-compose.

What works now (I think it addresses quite a lot of the concerns Fokko
mentioned):

   - Tox is removed and replaced with pure-Docker execution of tests (yay!)
   - The same Dockerfile is used for both the "slim" Airflow image and the
   Airflow CI image used for tests. Once we merge it, we will be able to
   deprecate the incubator-airflow-ci image.
   - Part of the PR is also related to "Simplified development
   environment - AIP-7" (aka Airflow Breeze). I have a nicely working Breeze
   environment as part of the change now. I will eventually split it off into
   a separate discussion/PR, but for now it makes it easier for me to run
   tests, so I am keeping it in.
   - The multi-staging/multi-layered Dockerfile should already improve CI
   build "purity". The way "layers" work now is that PIP dependencies are
   effectively frozen in between setup.py changes. Only when setup.py changes
   are the corresponding layers rebuilt and the dependencies re-installed.
   That should provide better "out-of-the-box" stability of CI builds even
   before we solve the dependency problem in a more "systematic" way (as Fokko
   mentioned, we should have a separate AIP for that). I am happy to discuss
   more, either now or in the future AIP. Fixing this eventually is quite
   close to my interests as well.
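
The "frozen in between setup.py changes" behaviour can be illustrated with a
small simulation of how a layer cache decides whether to rebuild. This is a
simplified sketch, not Docker's actual implementation: the cache key here is
just a hash of the parent layer's key plus the content of the files copied
into that layer, and the layer names and file contents are made up for
illustration.

```python
import hashlib


def layer_key(parent_key: str, inputs: bytes) -> str:
    """Cache key for a layer: hash of the parent key and this layer's inputs."""
    return hashlib.sha256(parent_key.encode() + inputs).hexdigest()


def build(cache: set, setup_py: bytes, sources: bytes) -> list:
    """Return the names of the layers that had to be rebuilt.

    Hypothetical layer order: base -> pip dependencies (depends only on
    setup.py) -> Airflow sources. Because each key includes the parent
    key, rebuilding a layer also forces a rebuild of every layer above it.
    """
    rebuilt = []
    key = "base"
    for name, inputs in (("pip-dependencies", setup_py),
                         ("airflow-sources", sources)):
        key = layer_key(key, inputs)
        if key not in cache:
            cache.add(key)
            rebuilt.append(name)
    return rebuilt


if __name__ == "__main__":
    cache = set()
    # First build: nothing is cached yet.
    print(build(cache, b"install_requires=['a==1']", b"sources v1"))
    # Only the sources changed: the dependency layer is reused.
    print(build(cache, b"install_requires=['a==1']", b"sources v2"))
    # setup.py changed: dependencies are re-installed, sources layer follows.
    print(build(cache, b"install_requires=['a==2']", b"sources v2"))
```

In this model, a source-only change never touches the dependency layer, which
is the property that keeps installed dependencies stable between setup.py
changes.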

I went through several iterations, and what I came up with is already quite
simple and straightforward compared to some of the initial approaches I took.

I added a fairly detailed description and motivation and a proposed design,
and even measured the impact of layering on build times (all on the AIP-10
Confluence page).

I will continue fixing tests and rebasing the changes for some time (even
a few weeks if needed) to test how it behaves with real changes coming in
regularly.

For now I have a separate DockerHub build and Travis CI instance where I
will keep running the tests (automatically):

   - DockerHub:
   https://cloud.docker.com/repository/docker/potiuk/airflow/timeline
   - Travis CI: https://travis-ci.org/potiuk/airflow/builds

J.






Re: Multi-layered official image for Airflow

Posted by Jarek Potiuk <Ja...@polidea.com>.
I've updated the calculations after removing some artifacts and rebuilding
the images from scratch. Here are the updated conclusions:


   - The multi-layered image is only slightly bigger than the mono-layered
   one (around *2% more* in total). Download time is also slightly longer,
   by 1 s (33.7 s vs. 32.7 s), which is *3% longer*.
   - Regular downloads are much cheaper with the multi-layered image: for a
   simulated user downloading the Airflow image twice a week, the total is
   *4950 MB* (multi-layered) vs. *13546 MB* (mono-layered) over the course
   of 8 weeks, i.e. *64% less data* to download.
   - The multi-layered image seems to be much better for users who regularly
   download the image.
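
The percentages can be re-derived from the quoted figures with a few lines of
arithmetic. This sketch uses only the numbers from this message; the helper
names are illustrative. With the rounded figures given here, the data saving
comes out at roughly 63.5%, in line with the approximately 64% quoted.

```python
# Figures quoted in the message above.
MULTI_MB, MONO_MB = 4950, 13546           # total downloaded over 8 weeks
MULTI_SECONDS, MONO_SECONDS = 33.7, 32.7  # time for a single download


def percent_less(smaller: float, larger: float) -> float:
    """How much less `smaller` is than `larger`, as a percentage of `larger`."""
    return 100 * (larger - smaller) / larger


def percent_more(larger: float, smaller: float) -> float:
    """How much more `larger` is than `smaller`, as a percentage of `smaller`."""
    return 100 * (larger - smaller) / smaller


if __name__ == "__main__":
    print(f"data saved over 8 weeks: {percent_less(MULTI_MB, MONO_MB):.1f}%")
    print(f"single download time: "
          f"{percent_more(MULTI_SECONDS, MONO_SECONDS):.1f}% longer")
```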



