You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/16 14:16:13 UTC

[GitHub] [airflow] potiuk opened a new pull request #17635: Forces rebuilding the image for cache pushing

potiuk opened a new pull request #17635:
URL: https://github.com/apache/airflow/pull/17635


   Fixes bug in pushing latest image to cache on "push/schedule".
   
   When the build is successful and passes all tests vi either
   `push' or 'schedule' events, we attempt to rebuild the image
   with latest constraints just pushed and push it as a fresh
   cache for Github Registry. This keeps the time to build image
   small without manually refreshing the cache, it also automatically
   checks if there is a new "python" base image available so that
   we can use it in the new cache.
   
   There was a bug that the image has not been FORCE_PULLED and
   rebuilt in this case - just latest images were used.
   
   This had so far no negative effects because due to test
   instability, latest main images pretty much never succeeded in
   all tests, so the images in `main` were refreshed manually
   periodically anyway. However for v2-1-test the scope of tests
   run is far smaller now (no Helm tests, no Provider tests)
   and they succeed mostly when they should.
   
   This PR fixes it so that the images are built properly and pushed.
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900164110


   > Wait -- we rebuilt the images, meaning that the images we pushed _aren't_ the ones we just ran tests against? That doesn't seem like desirable behaviour to me.
   
   It's all deliberate. We only do because the original images in push/schedule perform "eager upgrade" of constraints. If we push those images, then the cache for those images will be invalidated at the step where the eager upgrade is performed. The result of the eager-upgrade in previous steps is that we have a new set of constraints that get pushed automatically. And then we use those constraints to build the new images (this time without eager upgrade so that they can be used as cache for regular PR builds that do not perform "eager upgrade".


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900166514


   > Ah gotcha, we test with eager upgrade, build constraints based on those images, and the rebuild using the pinned constraints, right?
   
   Precisely.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900155076


   Wait -- we rebuilt the images, meaning that the images we pushed _aren't_ the ones we just ran tests against? That doesn't seem like desirable behaviour to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900164110


   > Wait -- we rebuilt the images, meaning that the images we pushed _aren't_ the ones we just ran tests against? That doesn't seem like desirable behaviour to me.
   
   It's all deliberate. We only do because the original images in push/schedule perform "eager upgrade" of constraints. If we push those images used for tests, then the cache for those images will be invalidated at the step where the eager upgrade is performed. The result of the eager-upgrade in previous steps is that we have a new set of constraints that get pushed automatically. And then we use those constraints to build the new images (this time without eager upgrade so that they can be used as cache for regular PR builds that do not perform "eager upgrade".


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-899559601


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900289705


   Ok. The `v2-1-test` build with this one cherry-picked is green :): 
   https://github.com/apache/airflow/actions/runs/1138535672
   
   Merging!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900077034


   Ok. One more fix needed but we are close


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900164110


   > Wait -- we rebuilt the images, meaning that the images we pushed _aren't_ the ones we just ran tests against? That doesn't seem like desirable behaviour to me.
   
   It's all deliberate. We do it because the original images in push/schedule perform "eager upgrade" of constraints. If we push those images used for tests, then the cache for those images will be invalidated at the step where the eager upgrade is performed. 
   
   The result of the eager-upgrade in previous steps is that we have a new set of constraints that get pushed automatically. And then we use those constraints to build the new images (this time without eager upgrade so that they can be used as cache for regular PR builds that do not perform "eager upgrade".
   
   This is why I recently reverse the steps. First constraints are pushed and only after that the "cache images" are rebuilt using those new constraints.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900166337


   Ah gotcha, we test with eager upgrade, build constraints based on those images, and the rebuild using the pinned constraints, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #17635:
URL: https://github.com/apache/airflow/pull/17635


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900175666


   The nice side-effect of this approach is that now we can do "eager-upgrade" in both - "schedule" and "push" runs. This means that constraints might get upgraded even if there is not a single PR merged - by scheduled build. The images for "schedule" builds are always build from scratch (because then  we pull latest Python base image and test if the Dockerflles are still working "from scratch". 
   
   Previously we could not really push those images as cache (because they were built from scratch). Now - even in scheduled build - we are rebuilding the images from cache in this "push cache images" step (using the generated constraints) so we will not loose cache.
   
   But what we also do at those rebuilds (and this is also smarter than before) we check if there were newer python base images released in the meantime and we pull them when we rebuild the 'cache image' if they did - which is the "smoothest" way I could figure our we can automatically updated to the latest base images automatically. Previously when we upgraded to latest python base image, there was potentially long-ish period of time where base python image and latest cache image were not synchronized (Resulting in way longer build-image step > 10 minutes rather than ~1 minute). That made  Breeze sometime use different python base image and not rebuilding as quickly as possible in development environment as it could.
   
   With the current approach we will only ever push the latest base python image and cache image together when the latest python base image changed. 
   
   There are still a few edge cases where this might cause longer rebuilds than needed but I have some ideas how to get rid of those as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #17635: Forces rebuilding the image for cache pushing

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #17635:
URL: https://github.com/apache/airflow/pull/17635#issuecomment-900164110


   > Wait -- we rebuilt the images, meaning that the images we pushed _aren't_ the ones we just ran tests against? That doesn't seem like desirable behaviour to me.
   
   It's all deliberate. We do it because the original images in push/schedule perform "eager upgrade" of constraints. If we push those images used for tests, then the cache for those images will be invalidated at the step where the eager upgrade is performed (because regular pull requests do not perform eager upgrade).
   
   The result of the eager-upgrade in previous steps is that we have a new set of constraints that get pushed automatically. And then we use those constraints to build the new images (this time without eager upgrade so that they can be used as cache for regular PR builds that do not perform "eager upgrade".
   
   This is why I recently reverse the steps. First constraints are pushed and only after that the "cache images" are rebuilt using those new constraints.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org