Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/05/14 20:16:36 UTC

[GitHub] [airflow] potiuk opened a new issue #8872: Add dependencies on build image

potiuk opened a new issue #8872:
URL: https://github.com/apache/airflow/issues/8872


   **Description**
   
   Add dependencies using the "on-build" (`ONBUILD`) feature of Docker build.
   
   **Use case / motivation**
   
   Some dependencies can be added easily using the on-build feature of Docker build.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] landier edited a comment on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
landier edited a comment on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-713682454


   Hello,
   
   Chiming in on this issue since I'm facing the very same situation. I want to extend the Airflow Dockerfile (or, more precisely, reduce it by getting rid of some contribs) and I'm looking for a way to do it:
   - without git cloning the repository, since that complicates my build a lot (and increases the build time a lot) for only one file...
   - without duplicating the Dockerfile by hand, because that is not future-proof and is error-prone if not updated correctly
   
   Since I'm already using my own Dockerfile that extends the Airflow image, I'm super interested in having the official Airflow Dockerfile use ONBUILD actions for everything that is customizable via build args.
   
   Am I missing something here?
   
   Thank you.





[GitHub] [airflow] potiuk commented on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-714377595


   I created an issue for that https://github.com/apache/airflow/issues/11740 - so we can continue the discussion there. This one is already closed.





[GitHub] [airflow] potiuk closed issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #8872:
URL: https://github.com/apache/airflow/issues/8872


   





[GitHub] [airflow] potiuk commented on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-634664427


   Yeah, I see the point. I also think maybe we should not do ONBUILD if we can easily add build args to cover that.





[GitHub] [airflow] potiuk commented on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-714374727


   Just to justify my line of thought a bit more:
   
   I have successfully used subrepo (https://github.com/ingydotnet/git-subrepo) rather than submodule to sync the whole of Airflow into a customer project (and to easily contribute back upstream any changes we've made there). I can definitely recommend subrepo over submodule for this: it is much nicer to work with because every time you sync, you end up with a direct commit in your target repo, and you have local changes that you can decide to keep for yourself or contribute back.
   
   But I agree that if you need only the Dockerfile + scripts, there is no need to clone the whole of Airflow.
   
   However, having a separate repo with only what is needed for the Dockerfile (but one-way published from the main Airflow repo) is actually much better than developing the Dockerfile directly in that separate repo. Currently, we use the Dockerfile to run tests in Airflow, but we also run tests in Airflow to test the Dockerfile, so the two are very closely coupled and you often have to make commits that change both Airflow and the Dockerfile at the same time. That would make an independently developed Dockerfile repo quite nightmarish to maintain.
   
   That's why I think a separate repo containing only the Dockerfile + the scripts needed to build it is a better idea. We can easily establish a one-way push to that repo after changes to 'airflow' and make it read-only otherwise.
   
   





[GitHub] [airflow] zikun commented on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
zikun commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-634616072


   This could be a nice feature, but it looks like there was an effort to remove `ONBUILD` from official images - https://github.com/docker-library/official-images/issues/2076
   
   As the issue above mentions, different users might want different customizations. Initially I wanted to wait for this feature, but I realized `ONBUILD` does not solve my problem, e.g. installing packages like `docker-ce-cli` from third-party repositories. Therefore, I would rather build my own image `FROM apache/airflow`.
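
   A minimal sketch of such a child image might look like the following. It assumes the base image is Debian-based (so `apt-get` is available) and that it runs as an unprivileged `airflow` user; the package name is a placeholder, and the setup of a third-party repository is deliberately left out rather than guessed.

   ```
   # Hypothetical child image; the installed package is only a placeholder.
   FROM apache/airflow

   # Installing OS packages needs root.
   USER root

   # A third-party APT repository (for example the one that ships
   # docker-ce-cli) would need its signing key and sources entry added
   # before this step; those details are not shown here.
   RUN apt-get update \
       && apt-get install -y --no-install-recommends git \
       && apt-get clean \
       && rm -rf /var/lib/apt/lists/*

   # Drop back to the unprivileged user for runtime.
   USER airflow
   ```

   Everything here is ordinary image extension, so it works without any `ONBUILD` support in the base image.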








[GitHub] [airflow] potiuk commented on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-713729571


   ONBUILD does not solve your problem. I wonder how you imagine it would? We have no GCC/build-essentials in the image anyway, so at the very best you get exactly the same result as with extending the image (unless you have a better idea). So none of the DEV actions are possible, and everything that you can do with ONBUILD can be done by extending the current image.
   
   ONBUILD is a very controversial idea and it does not serve any purpose that extending the Docker image would not.
   
   I.e. @landier, how is any ONBUILD different from:
   
   ```
   FROM apache/airflow:image
   
   USER root
   
   # do whatever you'd like to do in ONBUILD
   
   USER airflow 
   ```
   ?
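
   For comparison, here is a rough sketch of what an `ONBUILD`-based base image looks like. The base image and the `requirements.txt` file name are assumptions chosen for illustration; nothing like this exists in the Airflow image.

   ```
   # Hypothetical base image that defers work to its children via ONBUILD.
   FROM python:3.8-slim

   # These instructions do NOT run when this base image is built. They are
   # recorded as triggers and run right after the FROM line of any
   # Dockerfile that uses this image as its base.
   ONBUILD COPY requirements.txt /tmp/requirements.txt
   ONBUILD RUN pip install --no-cache-dir -r /tmp/requirements.txt
   ```

   A child Dockerfile containing only a `FROM` line pointing at that base would execute both triggers. Writing the same `COPY` and `RUN` lines directly in the child gives the same result, which is the point made above: the triggers run in the child's build, on top of the finished base image, and cannot change how the base itself was built.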
   
   I am actually thinking about something entirely different. It is not just one file that you need: if you look closer, there are some other files used by the image (mainly scripts). What I thought might be a better way is to set up automated copying (using copybara or a dedicated GitHub Actions workflow) of only what is needed to build the image into a separate repository, which would contain those scripts + the Dockerfile and nothing more.
   
   This can happen automatically at every commit, and then you would only have to clone that other repository to build your custom image.
   
   WDYT?





[GitHub] [airflow] landier commented on issue #8872: Add dependencies on build image

Posted by GitBox <gi...@apache.org>.
landier commented on issue #8872:
URL: https://github.com/apache/airflow/issues/8872#issuecomment-714367018


   Hello,
   
   First, thanks for the response.
   
   We are currently extending the image exactly as in your example; however, we are thinking of getting rid of some packages that are already installed (contribs mainly). We could try to identify each and every package and make sure we uninstall them, but that sounds painful in the long run.
   
   That is why I thought of ONBUILD, in order to set AIRFLOW_EXTRAS to only what we need. But you're right: with the multi-stage build, this has to happen in the build stage and not in the final image stage.
   
   I get your point; it's actually quite similar to my idea of having airflow in a submodule in order to use the official Dockerfile with my build args. I'll think it over.
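
   To make the multi-stage point concrete, here is a heavily simplified two-stage sketch. It is not the official Airflow Dockerfile: the base image, the `EXTRAS` argument (standing in for `AIRFLOW_EXTRAS`) and the install paths are all assumptions made for illustration.

   ```
   # Simplified two-stage sketch; NOT the official Airflow Dockerfile.
   FROM python:3.8-slim AS build

   # Build-time knob, analogous in spirit to AIRFLOW_EXTRAS.
   ARG EXTRAS="postgres"

   # Compilers exist only in this stage.
   RUN apt-get update \
       && apt-get install -y --no-install-recommends build-essential \
       && rm -rf /var/lib/apt/lists/*

   # Install the selected extras into /root/.local.
   RUN pip install --user --no-cache-dir "apache-airflow[${EXTRAS}]"

   FROM python:3.8-slim

   # Only the already-installed packages are copied over; the final stage
   # has no compiler and never sees the EXTRAS argument.
   COPY --from=build /root/.local /root/.local
   ENV PATH="/root/.local/bin:${PATH}"
   ```

   In a layout like this, the set of extras is fixed by passing `--build-arg EXTRAS=...` when building this Dockerfile. A child image, with or without `ONBUILD`, starts from the final stage and therefore cannot re-run the build stage with a different value.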
   




