You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/04/20 01:30:40 UTC
[GitHub] [airflow] potiuk opened a new pull request, #23104: Use new Breese for building, pulling and verifying the images.
potiuk opened a new pull request, #23104:
URL: https://github.com/apache/airflow/pull/23104
We have the new Python-based breeze and we want to replace all
the functionality used from the old Breeze with the new one.
The most important is image management. It is used in many
places and this PR replaces all the places and removes image
builidng, pushing, pulling to use the new Breeze everywhere.
That includes:
* Building and pushing CI image on CI
* Building and pushing PROD image on CI which includes:
* building Airflow packages
* building Provider packages
* waiting for images in parallel (both CI and and PROD)
* veifying the images (both CI and PROD)
All those commands have been moved to breeze.py and that required
some modification in the Breeze parameter handling - mainly related
to adding spaces when help was displayed because long list of
packages looked very bad in help output.
It's been easier to implement it in one big PR as using
image building and pulling was deeply embedded in many scripts.
With this change all the scripts in CI use directly breeze
and combine using command line parameters with evn variables
directly in the job that execute the breeze commands which
makes it much easier to understand what is going on and
repeat it locally using Breeze.
Fixes: #22825
Fixes: #23077
Fixes: #23076
Fixes: #22829
Fixes: #22828
Fixes: #22826
Fixes: #20961
Fixes: #23102
<!--
Thank you for contributing! Please make sure that your code changes
are covered with tests. And in case of new features or big changes
remember to adjust the documentation.
Feel free to ping committers for the review!
In case of existing issue, reference it using one of the following:
closes: #ISSUE
related: #ISSUE
How to write a good git commit message:
http://chris.beams.io/posts/git-commit/
-->
---
**^ Add meaningful description above**
Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #23104:
URL: https://github.com/apache/airflow/pull/23104#issuecomment-1104334261
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #23104:
URL: https://github.com/apache/airflow/pull/23104#issuecomment-1104273051
cc: @Bowrna @edithturn -> getting much closer to replace all of it with the new Breeze :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #23104:
URL: https://github.com/apache/airflow/pull/23104#issuecomment-1107406260
All right. All green. Merging.
I tested:
* PR coming from fork here
* main build here: https://github.com/potiuk/airflow/actions/runs/2210724864
* PR coming from airflow here: https://github.com/apache/airflow/pull/23150
That should cover all scenarios - there is slight chance of problems with cache refreshing but we can handle it if we see a problem.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk merged pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk merged PR #23104:
URL: https://github.com/apache/airflow/pull/23104
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on a diff in pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #23104:
URL: https://github.com/apache/airflow/pull/23104#discussion_r854433222
##########
tests/system/providers/google/bigquery/example_bigquery_sensors.py:
##########
@@ -23,6 +23,7 @@
from datetime import datetime
from airflow import models
+from airflow.models.baseoperator import BaseOperator
Review Comment:
this came as part of the semi-rndom "mypy" failures. Mypy detected slightly different errors when there was a different set of files - so I had some 20 more mypy failures with this change.
I thought it has been merged separately but I will definitely check it (I will myself do a thorough review of it before I merge :).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #23104:
URL: https://github.com/apache/airflow/pull/23104#issuecomment-1104267648
This is the currernt set of commands supported now:
![image](https://user-images.githubusercontent.com/595491/164295871-ff7e1718-b0c2-446c-8ecb-eab5027f8a57.png)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] eladkal commented on a diff in pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #23104:
URL: https://github.com/apache/airflow/pull/23104#discussion_r854426385
##########
tests/system/providers/google/bigquery/example_bigquery_sensors.py:
##########
@@ -23,6 +23,7 @@
from datetime import datetime
from airflow import models
+from airflow.models.baseoperator import BaseOperator
Review Comment:
Why changing system tests is part of this PR?
##########
dev/REFRESHING_CI_CACHE.md:
##########
@@ -91,18 +91,57 @@ docker run --privileged --rm tonistiigi/binfmt --install all
More information can be found [here](https://docs.docker.com/engine/reference/commandline/buildx_create/)
-The images can be rebuilt and refreshed after the constraints are pushed. Refreshing image for particular
-python version is a simple as running the [refresh_images.sh](refresh_images.sh) script with python version
-as parameter:
+However, emulation is very slow - more than 10x slower than hardware-backed builds.
+
+## Setting up cache refreshing with hardware ARM/AMD support
+
+If you plan to build a number of images, probably better solution is to set up a hardware remote builder
Review Comment:
```suggestion
If you plan to build a number of images, probably better solution is to set up a hardware remote builder
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #23104:
URL: https://github.com/apache/airflow/pull/23104#issuecomment-1104266008
Apologies for such a HUGE PR, but I could not find a better way of replacing old with new Breeze's build/pull/push image.
I could keep on replacing piece by piece but keeping the two in parallel in CI was really burdensome. That's why I replaced all the related build/pull/push tasks in one go. I am still testing it (I will do some tests with "main" push to my own repo). But it should be generally working.
It removes another 1000 lines of bash code (YAY!) and paves the way to remove nearly all of them in subsequent PRs.
Most of all It unifies the way how CI and local development works - rather than having scripts on CI and local breeze commands, everythong is done via python-based breeze commands (including running stuff in parallel on CI which does not use Gnu Parallel but Python Multiprocessing)
It's way neater, nicere and much easier to reproduce locally. Example - most complex "building PROD image" is this:
````yaml
- name: "Build & Push PROD image ${{ matrix.python-version }}:${{ env.IMAGE_TAG_FOR_THE_BUILD }}"
run: |
rm -f "./dist/"*.{whl,tar.gz}
rm -f "./docker-context-files/"*.{whl,tar.gz}
breeze prepare-provider-packages \
--package-list-file scripts/ci/installed_providers.txt" \
--package-format wheel
breeze prepare-airflow-package --package-format wheel
mv "./dist/"*.{whl,tar.gz} "./docker-context-files"
breeze build-prod-image --push-image --install-from-docker-context-files
env:
PYTHON_MAJOR_MINOR_VERSION: ${{ matrix.python-version }}
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgradeToNewerDependencies }}
DOCKER_CACHE: ${{ needs.build-info.outputs.cacheDirective }}
IMAGE_TAG: ${{ env.IMAGE_TAG_FOR_THE_BUILD }}
````
Which you can very easily reproduce locally with the new breeze. It combines (for clarity of the commands) both evn-vars and flags, but all the relevant env vars used are now added directly in each step rather than in job or in workflow, which makes it much more obvious what's going on.
@mik-laj - I also removed the "special" preparation of virtualenv for docker tests - the docker tests now can be executed via `breeze` command and they will automatically use breeze's virtualenv (which already containts pytest, pytest-xdistts and requests):
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #23104: Use new Breeze for building, pulling and verifying the images.
Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #23104:
URL: https://github.com/apache/airflow/pull/23104#issuecomment-1104290139
> This is massive! tada
Oh yeah. Unfortunately. Now You know why your PRs were neglected a bit recently ;). The nice thing about it that it really implements all the remaining bits of the "breeze" infrastructure (for example running stuff in parallel is now embedded in some Breeze's commands which make the maintenance stuff much nicer and easier to perform by anyone without too much of a learning curve. After we finish - pretty much everything what happens in CI will be just the right sequence of breeze commands.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org