Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/02 11:46:55 UTC

[GitHub] [airflow] potiuk commented on pull request #21145: enter the shell breeze2 environment

potiuk commented on pull request #21145:
URL: https://github.com/apache/airflow/pull/21145#issuecomment-1027858573


   > @potiuk could you explain to me this part? I can understand that we check if the files in the list of `FILES_FOR_REBUILD_CHECK` are modified. we use md5sum to determine that. we persist the old md5sum calculated in build cache and if it varies from the recent file, we assume that file is modified and do build the local image. But why do we have to calculate them for only specific files in that list?
   
   Yeah. The thing is that we have a lot of files "mounted" inside the Docker container. All the Airflow sources are mounted in the place where they would normally be baked in when the Docker image is built. Mounting files when you run breeze is super quick (milliseconds). Rebuilding the image, on the other hand, takes 20-30 seconds even if you have almost nothing to build (just modified sources). That's why, when you modify a file locally, you can enter breeze and run tests in a matter of seconds. This allows for very quick iterations when you modify the code and want to test it. Imagine if, after every single change to a file, you had to wait 20-30 seconds just to be able to test it. That's not going to fly. And rebuilding the image is not really needed in this case, because mounting the files effectively does what rebuilding the image does.
   
   So instead - we only suggest rebuilding the image when we know that the "important" files changed and mounting is not "enough":
   
   * When Dockerfile.ci changes - you likely need to rebuild the image, because maybe someone added new tools/dependencies via apt
   * When setup.py/setup.cfg changes - you need to rebuild the image to reinstall `pip` dependencies, because someone likely added a new dependency (or maybe you added a new dependency yourself and want it installed persistently inside the image)
   * The ./docker scripts - similarly to Dockerfile.ci, they might perform some new installation, so if one of them changes, it influences the docker image internals
   * www/ui package/yarn/webpack files - they likely require new "node_modules" to be installed, and those are also installed as a separate step in the Dockerfile
   * .dockerignore - a change might mean that some files have been excluded from or added to the "Docker context" used when the image is built, so a docker build is likely needed to reflect those changes.
   
   Unlike Python source code - it's not enough to "mount" those files inside the container to replace the files in the image. You need to run an extra "action" afterwards if you want changes in those files to be reflected in the image: "pip install", "yarn install", or just running the scripts that install mysql or other tools.
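
To make the check concrete, here is a rough sketch of how such an md5-based comparison could look. It is not the actual breeze2 code: only the name `FILES_FOR_REBUILD_CHECK` comes from the discussion above, while the list contents, the cache layout (one `.md5sum` file per tracked file) and the helper names `md5_of`, `files_needing_rebuild` and `update_cache` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Assumed, illustrative subset; the real list in the PR may differ.
FILES_FOR_REBUILD_CHECK = ["Dockerfile.ci", "setup.py", "setup.cfg", ".dockerignore"]


def md5_of(path):
    """Return the hex md5 digest of the file contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def cache_file_for(cache_dir, name):
    """Hypothetical cache layout: one '<name>.md5sum' file per tracked file."""
    return cache_dir / (name.replace("/", "_") + ".md5sum")


def files_needing_rebuild(source_dir, cache_dir, files=FILES_FOR_REBUILD_CHECK):
    """List tracked files whose current md5 differs from the cached one."""
    changed = []
    for name in files:
        source = source_dir / name
        if not source.exists():
            continue  # nothing to compare for files that are absent
        cached = cache_file_for(cache_dir, name)
        # A missing cache entry counts as "changed": we have never built with it.
        if not cached.exists() or cached.read_text().strip() != md5_of(source):
            changed.append(name)
    return changed


def update_cache(source_dir, cache_dir, files=FILES_FOR_REBUILD_CHECK):
    """Persist current md5 sums, e.g. right after a successful image build."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    for name in files:
        source = source_dir / name
        if source.exists():
            cache_file_for(cache_dir, name).write_text(md5_of(source))
```

With something like this, entering the shell would only suggest a rebuild when `files_needing_rebuild(...)` returns a non-empty list. Edits to ordinary Python sources never show up there, because they are not tracked and are picked up through the mount anyway.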
   
   So this is purely an optimization of iteration speed.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org