Posted to dev@airflow.apache.org by Jarek Potiuk <ja...@potiuk.com> on 2024/02/19 20:53:13 UTC

[DISCUSS] Considering trying out uv for our CI workflows

Hey everyone,

A few days ago the ruff creators released a new tool, uv - an
extremely fast (written in Rust), fully featured tool that is generally
compatible with `pip`.

Blog post here: https://astral.sh/blog/uv

It looks like it has a number of things that would make our CI cases and
tooling quite a bit faster and better, including a few things for which I
have implemented workarounds and some that I have not
implemented because `pip` had no good solution.

I looked at the docs and it solves some problems that are currently
difficult or impossible to handle with `pip`:

* ability to use overrides (which are constraints on steroids - allowing us to
override limits specified by the packages). This will be very useful to
better handle our cases with "chicken-egg" providers (for example like we
had in FAB) where we have pre-release packages depending on each other

* different resolution strategies including --resolution=lowest, which will
finally allow us to see whether airflow's lower bounds are still holding
(i.e. - will our tests still pass if we use the lowest supported version of
our dependencies?). This is something I wanted to do for quite some time and
recorded an issue for that - https://github.com/apache/airflow/issues/35549 -
but lack of tooling support made it a wish. With `--resolution=lowest` it
seems like a super-easy thing to do.

* It is said to be many, many times faster - with better caching and
resolution speeds (as with ruff, they claim orders of magnitude
speedups in a number of cases). We can likely make very good use of it and
speed up some parts of our CI workflow significantly.
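
To make the `--resolution=lowest` idea above a bit more concrete, here is a
minimal sketch of what "lowest" versus the default "highest" selection means
for a single package. It is a toy illustration only, not how uv or pip are
implemented: the helper names (`parse`, `satisfies`, `pick`), the release list
and the specifier are all hypothetical, and it only understands plain dotted
numeric versions rather than full PEP 440.

```python
# Toy illustration only: real resolvers (pip, uv) handle full PEP 440
# versions and whole dependency graphs; this handles one package with
# plain dotted numeric versions.
import operator

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def parse(version):
    """Turn '2.10' into (2, 10, 0) so versions compare numerically."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version, specifier):
    """Check a version against a comma-separated specifier like '>=2.0,<3.0'."""
    for clause in specifier.split(","):
        clause = clause.strip()
        # Try two-character operators first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", "!=", ">", "<"):
            if clause.startswith(symbol):
                if not OPS[symbol](parse(version), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


def pick(available, specifier, strategy="highest"):
    """Pick the highest or lowest available version allowed by the specifier."""
    candidates = sorted((v for v in available if satisfies(v, specifier)), key=parse)
    if not candidates:
        raise LookupError("no version satisfies " + specifier)
    return candidates[0] if strategy == "lowest" else candidates[-1]


# Hypothetical release list and bounds, purely for illustration.
releases = ["1.9.0", "2.0.0", "2.4.1", "2.10.0", "3.0.0"]
print(pick(releases, ">=2.0,<3.0"))            # newest version the bounds allow
print(pick(releases, ">=2.0,<3.0", "lowest"))  # oldest version the bounds allow
```

Running tests against the second kind of selection is what would tell us
whether a declared lower bound still actually works.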

I will likely do some experimenting with uv in our toolchain, but wanted
to make sure we are all aware of it - and ask if someone has something
against it (and maybe someone would like to do some work there trying it
out - I will be happy to guide others with the dev/tooling mindset and
inclination to make some changes there, review PRs and cooperate on testing
those things).

It's not a user-facing change, and I do not think we want to get rid of
`pip` as an installation tool in general (in our images and user facing
side) - it's mostly an internal CI tooling improvement I am thinking of.
Maybe at some point in time we can recommend it also for development
workflows, and maybe someday it will gain enough popularity to think about
recommending it to our users, but definitely not now, nor even in the
mid-term future.

Let me know what you think.

Repo here: https://github.com/astral-sh/uv

J.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Of course the PR is not yet complete - there are a few things to fix there
for special cases we have.

On Sun, Feb 25, 2024 at 12:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hello here.
>
> I have a PR https://github.com/apache/airflow/pull/37683 that implements:
>
> * ability to choose either uv or pip when building our images
> * CI images are built with uv by default (but you can use `--no-use-uv` as
> a flag and switch back to `pip`)
> * PROD images are built with pip by default (but you can use `--use-uv` as
> a flag and switch to uv)
>
> The preliminary tests indeed show that uv not only has a much faster
> baseline, but also that its use of caching fits extremely well into our
> strategy of building images, and we will get huge improvements in our CI
> build timing when using uv.
>
> Just for context - our CI images, when built, use a caching
> strategy to optimise for:
>
> 1) fast building when there are no changes (around 1 minute to build with
> pip),
> 2) slower building when someone adds or modifies a non-conflicting
> dependency (around 8 minutes to build, out of which ~ 6m is pip
> resolution and installation)
> 3) much longer build time when there are conflicting dependencies or when
> we change Dockerfile or scripts or when Python base image changes (around
> 27 minutes build out of which pip resolving is ~ 20m).
>
> Those are all `pip` numbers. Currently `pip` does not use resolution
> caching between the steps. A comparison of some basic installation steps
> from initial tests shows that uv is way faster:
>
> * Resolving and Installing airflow with [devel-ci] (610 dependencies): pip
> ~ 6m, uv ~ 1m 30 s
> * Re-resolving and reinstalling [devel-ci] using local pyproject.toml: pip
> ~ 4m (cache is not used), uv ~ 4s (!!!!) - because cache is used in this
> case.
>
> I have not yet properly tested the --eager upgrade of dependencies (but I
> will once such upgrades happen). With pip it very much depends, but it's
> often in the range of 10 minutes - I expect it not to take more than 2-3
> minutes with uv.
>
> So overall it looks like we are looking at those improvements:
>
> 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
> (because we are using docker layer caching and pip resolution and
> installation is not used at all)
> 2) Updating dependencies: 8m with pip will probably go down with uv to ~
> 3m 30s => 60% improvement and in many cases ~ 2.5m when there are no remote
> changes and cache is used (70% improvement)
> 3) Re-resolving and reinstalling everything: 27m will probably go down
> with uv to ~ 9m => 67% improvement.
>
> If those numbers hold and the resolution quality is comparable to
> `pip` - then well, it's definitely worth it - and the numbers are very
> close to what the `uv` authors claimed.
>
> I am impressed :)
>
> J.
>
>
>
> On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <am...@gmail.com>
> wrote:
>
>> I agree with Niko here.
>>
>> If someone is willing to give it a try, we should enable it experimentally
>> and give it a stint for a couple of weeks. If we see significant results,
>> we can adopt it.
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>> On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
>> <on...@amazon.com.invalid>
>> wrote:
>>
>> > The Astral folks also seem very focused on it being a drop-in/compliant
>> > replacement for pip. So I think it's definitely worth dropping it in and
>> > seeing if we get the expected performance improvements. If tests still
>> > pass and user-facing constraints and install instructions remain
>> > unchanged, I don't see why not, if someone is willing to spend the time
>> > on it. Never mind the extra features it would give us (I, like others,
>> > am also very excited about the --resolution=lowest ability).
>> >
>> > ________________________________
>> > From: Andrey Anshin <an...@taragol.is>
>> > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> > To: dev@airflow.apache.org
>> > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying
>> > out uv for our CI workflows
>> >
>> > > I share Andrey's skepticism. It's just yet another tool which has an
>> > > unclear development strategy.
>> >
>> > My point was more about a matter of presentation. If someone told you
>> > "this is a new tool, like a killer of previous tools" then you might
>> > think "Yeah...yeah...yeah.. yet another replacement for tool X... not
>> > really interesting". On the other hand, if someone told you which cases
>> > it might solve for you, then that might change your mind.
>> >
>> > Especially the promising `--resolution=lowest` option. We always want to
>> > test something with minimal dependencies because we are not sure whether
>> > it works with pretty old dependencies, and recently I've started to work
>> > on a POC to collect minimal versions for Airflow and Providers. And at
>> > the moment when I had almost finished it, uv was released. Well, sometimes
>> > it is better to wait a bit and maybe someone will invent the same
>> > solution 😁 and you don't have to spend your personal time.
>> >
>> > So as a POC I'm on it. We still need `pip`, and need to validate some
>> > stuff with pip because it is the only officially supported way to install
>> > Airflow, but if something could be improved in the CI then I'm on it. In
>> > most cases it would be hidden behind Breeze and many of the contributors
>> > might not even notice that something changed.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com> wrote:
>> >
>> > > Actually - if you read that blog post, the strategy is clear - they aim
>> > > to create comprehensive packaging tooling, and the improvements are
>> > > measured (80-100 times, they claim, when using caching - they (unlike
>> > > pip) use a lot of local caching, including for resolving dependencies).
>> > >
>> > > So I think both arguments are not valid if you ask me.
>> > >
>> > > On Tue, 20 Feb 2024, 02:37 Alexander Shorin <kxepal@apache.org> wrote:
>> > >
>> > > > I share Andrey's skepticism. It's just yet another tool which has an
>> > > > unclear development strategy. Should we serve as a free testing suite
>> > > > for it? What would the project receive in exchange? A lot of words
>> > > > about being faster, but how much? Are these milliseconds worth
>> > > > replacing a stable tool with a new one? And will it notably improve
>> > > > something?
>> > > >
>> > > > I think it's worth trying it just for fun and providing feedback, but
>> > > > it has a long road ahead before it becomes as stable as pip.
>> > > >
>> > > > --
>> > > > ,,,^..^,,,
>> > > >
>> > > >
>> > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> > > >
>> > > > > My opinion:
>> > > > >
>> > > > > I think there is a place for a number of such tools. For a long time
>> > > > > the packaging team and `pip` team have been working not only on the
>> > > > > `pip` implementation but also (and most importantly) on making sure
>> > > > > that what `pip` does is the beacon of standardisation of packaging
>> > > > > APIs and PEPs. It will never IMHO have a lot of the fancy features
>> > > > > that other tools might provide (like the ones I mentioned). It will
>> > > > > always be there to provide the robust and solid CLI to run all
>> > > > > packaging things, but there are plenty of opportunities to provide
>> > > > > improved or modified, or more (or less) opinionated ways of doing
>> > > > > things that address some cases that the `pip` team simply will not
>> > > > > be able or willing to handle, preferring a "pure" standard approach
>> > > > > vs. implementing all the optional things. For example, the way
>> > > > > pre-releases are handled can be improved to be more selective. The
>> > > > > PEP describing it gives the tools an option to add more fancy
>> > > > > behaviours (some of which we could find useful in our CI tooling).
>> > > > > Should `pip` implement those? I don't think so. It would distract
>> > > > > maintainers from other more important things. It is quite ok to use
>> > > > > other tooling in places like our CI, where they do some parts of the
>> > > > > installation better.
>> > > > >
>> > > > > For me `pip` is going more in the direction of a `usable reference
>> > > > > implementation of a package installer` - any standard/PEP will not
>> > > > > matter if `pip` does not implement it. But others might go in
>> > > > > different directions and implement some less popular features and do
>> > > > > it better, faster, with greater flexibility. IMHO it's a win-win.
>> > > > >
>> > > > > J.
>> > > > >
>> > > > >
>> > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin
>> > > > > <andrey.anshin@taragol.is> wrote:
>> > > > >
>> > > > > > Yesterday my friend shared that tool with me and I told him that
>> > > > > > most probably it would be a niche tool. I said "who needs yet
>> > > > > > another installer which claims to resolve all your problems".
>> > > > > > I guess I was wrong?
>> > > > > >

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Follow up from today:

1) I optimised image building for CI a little bit more (but that's a
single-digit percentage)
2) I also found a way to make uv work for PROD images. It was not working
so far because of our `--user` way of installing packages. For a long
time I wanted to get rid of it as it caused many problems, and I believe I
finally found a good way to do it - with 100% backwards compatibility with
some of the cases of PythonVirtualenv. Previously I could not use
virtualenv to install Airflow in our PROD image, but it seems that a small
trick with the right location of the venv in our image does the job with
100% compatibility

This one is a bit tricky - because we do not want (for a long time) to
switch our users from `pip` to `uv`, so while in CI most of the PROD images
(to save time) will be built with `--use-uv`, there is a separate build and
set of tests that will run for `--no-use-uv`. Regular users will have to
use `--build-arg AIRFLOW_USE_UV` to switch to using uv to build the image.
Bonus point: even in `pip`-built images users will be able to use `uv` for
their installations (this is something our users are already asking for:
https://github.com/apache/airflow/issues/37785 - seems like uv is,
like ruff, spreading like wildfire).

The PR here: https://github.com/apache/airflow/pull/37796. Overall it's
~55% faster to build a PROD image from scratch with uv than with pip on my
machine (2m vs 4m45s) - a pretty consistent percentage gain, as with the CI
image. The end results are pretty much identical images (in size and, it
looks like, content - and they pass our PROD image tests - and airflow works as usual
in them)
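
As a quick sanity check of the build-time figures above, the arithmetic can
be sketched as below. The helper names are made up for this illustration;
the only inputs taken from this message are the two timings (4m45s with pip,
2m with uv). The result comes out a little above the rough "~55%" quoted
above, which is consistent with it being an estimate.

```python
def seconds(minutes, secs=0):
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + secs


def time_saved_percent(before_s, after_s):
    """Share of the original build time that was saved, in percent."""
    return 100.0 * (before_s - after_s) / before_s


def speedup_factor(before_s, after_s):
    """How many times faster the new build is."""
    return before_s / after_s


pip_build = seconds(4, 45)  # PROD image from scratch with pip
uv_build = seconds(2)       # PROD image from scratch with uv

print(f"time saved: {time_saved_percent(pip_build, uv_build):.0f}%")
print(f"speedup:    {speedup_factor(pip_build, uv_build):.2f}x")
```

Note that "percent less time" and "times faster" are different measures of
the same change, so it helps to state which one a number refers to.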

J.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
One more update - I am still looking at it and fine-tuning stuff, and will
have a few more things coming.

I found out that we were still using `pip` for `pip constraints generation`
(those are the constraints that our users use).
I switched that one to `uv` and it's now 30 seconds instead of more than 5
minutes - which is a more than 10x improvement.

Plus - we get all-canonical `pypi` names back, because I also switched to
`uv pip freeze`, and uv nicely canonicalizes all the constraints
generated. I am also switching now, with
https://github.com/apache/airflow/pull/37754, to a new 0.1.11 version that
has some bug-fixes and new features. This PR also adds an upgrade check that
will tell us when new versions of `pip` and `uv` are available (by
failing canary build job).
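
For anyone wondering what "canonical" names look like: the usual
normalization rule for Python project names is the one described in PEP 503
(lowercase, with runs of `-`, `_` and `.` collapsed to a single `-`). The
sketch below shows that rule only. Whether `uv pip freeze` applies exactly
this rule is an assumption on my side, and the function name and sample
names are purely illustrative.

```python
import re


def canonicalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Sample names chosen only to show the effect of the rule.
for raw in ["Flask_AppBuilder", "ruamel.yaml", "apache-airflow"]:
    print(raw, "->", canonicalize(raw))
```

Having every line of a constraints file in one consistent form makes diffs
between generated constraint files much easier to read.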

J.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by "Oliveira, Niko" <on...@amazon.com.INVALID>.
Fantastic results!

> It also means that if you've been using breeze and were sometimes afraid to
> hit "y" to rebuild the image, being afraid that it will take 20 minutes or
> so - not any more. It should be WAY faster now.

I'm very excited about this speed up as well as our CI :)

________________________________
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Tuesday, February 27, 2024 2:44:14 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying out uv for our CI workflows

Summarising where we are:

After ~24 hrs of operations, it looks really cool and fulfills (and
actually exceeds) all my expectations.

* Multiple PRs succeeded, we got quite a few constraints updated
automatically after successful canary runs:
https://github.com/apache/airflow/commits/constraints-main/ (and they look
perfectly fine - pretty much what I'd expect)
* I looked through a number of image builds in "canary" runs and the
regular 10-12 minutes build-image jobs are down to 3-4 minutes
* I just did an experiment: on my machine I ran a complete from-scratch
CI image build with new dependencies using breeze (with `breeze ci
image build --python 3.9 --docker-cache disabled
--upgrade-to-newer-dependencies`) and compared it with the v2-8-test branch,
where we do not have the change applied yet

Results (on my desktop machine: 16 cores, 1Gb network download and a very
fast disk):

* v2-8-test: 730 s -> *12 minutes*
* main: 227 s -> less than *4 minutes (!)*

That's 70% (!) faster. This is a complete full rebuild of the image,
including installing all dependencies from scratch and attempting to
upgrade them to the latest compatible versions. That is the WORST case.
Of course it will vary - depending on the network speed you have and the number
of CPUs (unlike `pip`, for now, `uv` heavily uses parallelism - both for
downloads and installation and that is one of the reasons why the
difference is so huge). I'd love to hear the results of such comparisons
from others with different machines/networking/disks - to get a bit more
scientific data points.

It also means that if you've been using breeze and were sometimes afraid to
hit "y" to rebuild the image, being afraid that it will take 20 minutes or
so - not any more. It should be WAY faster now.

I will also proceed to attempt to use the `--resolution lowest` soon and
try to see if we can have a nice automation in place to bump our
min-versions to the "actually working" versions - for all our extras. That
would be a major win for our users - as there will never be a case in the
future that they upgrade airflow to a newer version and some old dependency
remains and is not compatible. It does not happen often,

Seeing the speed difference - I am actually going now to regularly use `uv
pip` for any local installation as well - it should save a LOT of time -
especially that if you have multiple environments, it keeps a single cache
for all your installed packages (and their metadata) - this means that if
you have several virtualenvs installed and switch between them, the
installation and reinstallation of packages between those packages should
be lightning fast (like single seconds rather than 10s of seconds for
smallest installation). I'd heartily recommend it to anyone.

Let's see about the stability. I know there are few edge-cases that are not
handled well - Damian helpfully pointed out to the "apache-airflow[all]"
case that currently is problematic, so I will keep an eye on new versions
and fixes (In CI of ours we are currently pinned to 0.1.10 - so we are
shielded from any potential stability problems and we will need to manually
upgrade to newer versions when they appear).

J.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Summarising where we are:

After ~24 hrs of operations, it looks really cool and fulfills (and
actually exceeds) all my expectations.

* Multiple PRs succeeded, we got quite a few constraints updated
automatically after successful canary runs:
https://github.com/apache/airflow/commits/constraints-main/ (and they look
perfectly fine - pretty much what I'd expect)
* I looked through a number of image builds in "canary" runs and the
regular 10-12 minutes build-image jobs are down to 3-4 minutes
* I just ran an experiment on my machine: a complete from-scratch build of
the breeze CI image with upgraded dependencies (with `breeze ci image build
--python 3.9 --docker-cache disabled --upgrade-to-newer-dependencies`),
compared against the v2-8-test branch, where the change is not applied yet

Results (on my desktop machine: 16 cores, 1Gb network download and a very
fast disk):

* v2-8-test: 730 s -> *12 minutes*
* main: 227 s -> less than *4 minutes (!)*

That's about 70% (!) less build time (roughly 3.2x faster). This is a
complete rebuild of the image, including installing all dependencies from
scratch and attempting to upgrade them to the latest compatible versions.
That is the WORST case. Of course it will vary, depending on your network
speed and number of CPUs (unlike `pip`, for now, `uv` heavily uses
parallelism, both for downloads and for installation, and that is one of
the reasons why the difference is so huge). I'd love to hear the results of
such comparisons from others with different machines/networking/disks, to
get a few more scientific data points.
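
For anyone comparing their own timings, the arithmetic behind the figure
above can be sketched in a few lines of Python. This is only an
illustration: the helper name `time_reduction` is made up for this example,
and the two constants are simply the timings reported above.

```python
def time_reduction(before_s: float, after_s: float) -> tuple[float, float]:
    """Return (percent reduction in wall time, speedup factor)."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("timings must be positive")
    reduction_pct = (before_s - after_s) / before_s * 100
    speedup = before_s / after_s
    return reduction_pct, speedup


# Timings reported above for a full, uncached CI image rebuild.
PIP_SECONDS = 730  # v2-8-test branch (pip)
UV_SECONDS = 227   # main branch (uv)

reduction, speedup = time_reduction(PIP_SECONDS, UV_SECONDS)
print(f"{reduction:.1f}% less wall time, {speedup:.2f}x faster")
```

Plugging in timings from other machines gives numbers that can be compared
directly with the ones above.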

It also means that if you've been using breeze and were sometimes afraid to
hit "y" to rebuild the image because it would take 20 minutes or so: not
any more. It should be WAY faster now.

I will also try out `--resolution lowest` soon and see if we can have a
nice automation in place to bump our min-versions to the "actually working"
versions, for all our extras. That would be a major win for our users, as
they should no longer hit the case where they upgrade airflow to a newer
version and some old dependency remains that is not compatible. It does not
happen often.

Seeing the speed difference, I am now going to regularly use `uv pip` for
any local installation as well. It should save a LOT of time, especially
if you have multiple environments: it keeps a single cache for all your
installed packages (and their metadata), which means that if you have
several virtualenvs and switch between them, installing and reinstalling
packages across those virtualenvs should be lightning fast (like single
seconds rather than tens of seconds for the smallest installation). I'd
heartily recommend it to anyone.

Let's see about the stability. I know there are a few edge cases that are
not handled well. Damian helpfully pointed out the "apache-airflow[all]"
case that is currently problematic, so I will keep an eye on new versions
and fixes (in our CI we are currently pinned to 0.1.10, so we are shielded
from any potential stability problems, and we will need to manually upgrade
to newer versions when they appear).

J.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
> Btw, I've been testing how well the resolver works for uv based on many
> real world examples I have built up over the years improving the pip
> resolver, it still can't resolve apache-airflow[all]:
> https://github.com/astral-sh/uv/issues/1560. I have recommended heuristics
> that uv could implement which I think would help here.

Strange with [all], because we are now using uv nicely to install
`[devel-ci]`, which is way bigger than [all]. Then again, [all] has one
more level of indirection via released providers: [all] depends on the
providers, and only then do the various versions of those providers bring
in their own dependencies, whereas [devel-ci] directly lists all the
dependencies of the latest providers, which is quite a bit simpler to
resolve. Curious to see the progress on it.

> But one interesting idea that airflow could look at is periodically
> running test cases against either "--resolution=lowest" or
> "--resolution=lowest-direct" to try and flush out bad old dependencies?

Look at the beginning of the discussion :) . This is definitely one of the
features that we are going to use (and soon). And we have an issue already
planned to do so: https://github.com/apache/airflow/issues/35549

The (little) difficulty here is that we won't be able to weed all the weeds
out this way. With 90 optional providers/extras we have a lot of
overlapping dependencies, so just "lowest" does not mean that our
lower-bound criteria for a particular provider are "good enough": the same
transitive dependency from one provider might be limited by another. So in
order to **really** find out the right lower bounds we will literally have
to resolve <lowest> for a single provider and run the tests for that
provider alone. So this is going to be a little harder than just running
`--resolution=lowest` on everything. That might be a good start but is not
nearly enough. But we have all the right tooling in breeze, including test
parallelism, that we can employ for that.
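
A minimal sketch of that per-provider approach, under stated assumptions:
it only assembles commands and runs nothing. The extras `amazon` and
`google` are examples, the `tests/providers/<extra>` path is an assumed
layout, and `lowest_resolution_plan` is a hypothetical helper, not part of
breeze. The only uv option relied on is `--resolution`, with the `lowest`
and `lowest-direct` values mentioned above.

```python
import shlex


def lowest_resolution_plan(extras, resolution="lowest-direct"):
    """Build one (install, test) command pair per provider extra.

    Nothing is executed here; the commands are only assembled so they
    can be reviewed or handed to a job runner.
    """
    plan = []
    for extra in extras:
        install = [
            "uv", "pip", "install",
            "--resolution", resolution,
            f"apache-airflow[{extra}]",
        ]
        # Assumed test layout: tests/providers/<extra>; adjust to the real tree.
        test = ["pytest", f"tests/providers/{extra}"]
        plan.append((install, test))
    return plan


# Illustrative extras only, not the full provider list.
for install, test in lowest_resolution_plan(["amazon", "google"]):
    print(shlex.join(install), "&&", shlex.join(test))
```

Each pair would need its own fresh environment, so that one provider's
lower bounds are not masked by limits coming from another provider.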

J.


On Mon, Feb 26, 2024 at 6:22 PM Damian Shaw <ds...@striketechnologies.com>
wrote:

> Btw, I've been testing how well the resolver works for uv based on many
> real world examples I have built up over the years improving the pip
> resolver, it still can't resolve apache-airflow[all]:
> https://github.com/astral-sh/uv/issues/1560. I have recommended
> heuristics that uv could implement which I think would help here.
>
> But one interesting idea that airflow could look at is periodically
> running test cases against either "--resolution=lowest" or
> "--resolution=lowest-direct" to try and flush out bad old dependencies?
>
> Damian
>
> -----Original Message-----
> From: Jarek Potiuk <ja...@potiuk.com>
> Sent: Monday, February 26, 2024 7:45 AM
> To: dev@airflow.apache.org
> Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows
>
> And merged. I will keep an eye on it for the next few days.
>
> On Mon, Feb 26, 2024 at 11:47 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Yes. The difference was because of caching. I forgot to mention that.
> > This is due to the way our CI image docker optimisation works
> >
> > The way the image is constructed is that it first installs
> > dependencies from https://github.com/apache/airflow to save on any
> > future re-installs, using the docker caching mechanism (it works the
> > same for `pip` and `uv pip`).
> >
> > It works roughly like this:
> >
> > LAYER 1: pip install "apache-airflow[devel-ci] @
> >          https://github.com/apache/airflow/archive/main.tar.gz"
> >          <- here we install all airflow dependencies in main AT THE
> >          MOMENT THIS LAYER WAS BUILT
> > LAYER 2: COPY pyproject.toml
> > LAYER 3: pip install ".[devel-ci]" --constraints https://.....
> >
> > It is a very nice way of caching and speeding up adding new
> > dependencies that I introduced years ago. It works very nicely for us
> > for remote builds (and local breeze builds make use of it as well):
> > ALL dependencies of airflow will be pre-installed as a cached layer in
> > the image, regardless of modifications in pyproject.toml. So whenever
> > someone modifies pyproject.toml and adds new dependencies or modifies
> > the existing ones, LAYER 1 will NOT be invalidated, but LAYER 2 (and
> > LAYER 3) will, which means that the LAYER 3 pip install will only
> > install new dependencies (and it will use the latest constraints for
> > that).
> >
> > LAYER 1 is only modified when:
> >
> > a) Python base image changes (every few weeks)
> > b) Docker scripts change (those scripts that are COPIED before that
> > layer, so for example airflow installation scripts).
> > c) DEPENDENCY_EPOCH changes -> we can manually bump it to force the
> > reinstallation after we remove some dependency, to make sure it is
> > regenerated
> >
> > This has the side effect that when you add or modify a dependency, it
> > is very fast: instead of reinstalling all 600+ dependencies, they are
> > already installed and you only get the diff. Another side effect is
> > that the image (between Python base image updates or epoch/docker
> > script changes) gets ever bigger: every time constraints are updated
> > and the cache is rebuilt, the image gets updated with new
> > dependencies in main, incrementally adding changed (and only those
> > changed) dependencies in LAYER 3. So I was comparing an image where
> > LAYER 1 was created some time ago with pip (and LAYER 3 got bigger)
> > with a pretty much "from scratch" image where LAYER 1 was "latest
> > deps" and LAYER 3 has almost no updated/new dependencies.
> >
> > That explains the difference. The new image will also get slightly
> > bigger in the next few days or weeks, until a new Python base image is
> > released or we will update the scripts.
> >
> > Also, all tests pass and that's most important. The CI image is
> > exclusively used to run tests, it's not used in production. The
> > production image is still using `pip` (I had some problems with PROD
> > image building with uv, because it expects a virtualenv rather than
> > our --user installation). We might want to fix it some time in the
> > future (and uv might add it as a feature in the meantime). Let's give
> > `uv` some time to settle :).
> >
> > So - it has no impact on the user-facing side (at all).
> >
> > Re: dependencies. After the `uv` build succeeded, they were updated in
> > our constraints 2 hrs ago:
> > https://github.com/apache/airflow/commit/fd64235a481adb4aaff1b2f432eaceb9d0b5c53c
> >
> > As you can see, the changes are "as expected": there are a few
> > dependencies bumped since yesterday (correctly picked up by uv's
> > "highest" resolution mechanism). The `uv pip freeze` command for now
> > prints the original, non-canonical names of packages (underscores
> > instead of dashes), but that's perfectly fine, as those names
> > normalise to the same canonical names. This will likely get changed in
> > the future.
> >
> > J.
> >
> >
> >
> >
> >
> >
> > On Mon, Feb 26, 2024 at 10:18 AM Scheffler Jens (XC-AS/EAE-ADA-T)
> > <Je...@de.bosch.com.invalid> wrote:
> >
> >> @Jarek, had no time to review PR.
> >> If the Docker image is ~400MB smaller, I fear there is a diff. Were
> >> you able to dump a file list to inspect the diff?
> >> If not I would propose to do it in the PR to understand "why". If
> >> there are (only) cache files, then in general it would make sense to
> >> think about whether "cache/garbage" is left behind by pip/uv which we
> >> should clean to shrink images.
> >>
> >> Mit freundlichen Grüßen / Best regards
> >>
> >> Jens Scheffler
> >>
> >> -----Original Message-----
> >> From: Jarek Potiuk <ja...@potiuk.com>
> >> Sent: Monday, 26 February 2024 08:54
> >> To: Amogh Desai <am...@gmail.com>
> >> Cc: dev@airflow.apache.org
> >> Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows
> >>
> >> Yep. It all looks good now and I re-ran last intermittently failing job:
> >> Final effect of it:
> >>
> >> * CI image (uncompressed) with uv is slightly smaller (3.5 GB vs.
> >> 3.9 GB)
> >> * regular code-only PRs: same time to incrementally build image, ~1m
> >> * adding/modifying a dependency in the PR: 12m -> 6m: 50% improvement
> >> * removing a dependency/rebuilding things from scratch: 27m -> 12m:
> >> 55% improvement
> >>
> >> Depending on the speed of your network, also locally rebuilding your
> >> image should be generally much faster in all cases once we merge it
> >> and update cache.
> >>
> >> Also the flaky test turned out to be really just "sometimes running
> >> much slower than expected" case - I increased the number of retries
> >> and gave the test a bit more time and added better message, so
> >> hopefully the flaky test will stop happening now.
> >>
> >> I think it's a no-brainer :).
> >> https://github.com/apache/airflow/pull/37692 waiting for reviews
> >>
> >> J.
> >>
> >>
> >>
> >> On Mon, Feb 26, 2024 at 4:50 AM Amogh Desai
> >> <am...@gmail.com>
> >> wrote:
> >>
> >> > Thanks for the superb investigation and effort @Jarek Potiuk
> >> > <ja...@potiuk.com>!
> >> >
> >> > I quite like the performance improvement numbers uv brings in
> >> > compared to pip.
> >> > I see no reason not to switch to UV in prod images as well.
> >> >
> >> > I will take a look at the pull request soon.
> >> >
> >> > Thanks & Regards,
> >> > Amogh Desai
> >> >
> >> > On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> >> >
> >> >> I think I will get it green finally:
> >> >> https://github.com/apache/airflow/pull/37692.
> >> >>
> >> >> I know where the test flakiness was from. Generally speaking it
> >> >> turned out that there is no free lunch and, of course, the cache
> >> >> from uv increased our CI image size significantly (by around 1.5G),
> >> >> and it caused much slower test execution (and tests became more
> >> >> flaky because of that). So after looking at that I decided to
> >> >> disable the cache: it's definitely not worth it to increase the
> >> >> size of our images that much. We still have significant
> >> >> improvements (50% - 60%, not the 60% - 70% we had with the cache),
> >> >> and that's still significant enough. Without the cache the
> >> >> "upgrade" scenario is ~40s (so no 4s any more) instead of 7m with
> >> >> pip, so this is still a huge improvement (and the image size is
> >> >> even smaller than the one with `pip`).
> >> >>
> >> >>
> >> >> J,
> >> >>
> >> >>
> >> >>
> >> >> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> >> >>
> >> >> > Some more findings.
> >> >> >
> >> >> > Overall, I can confirm that with `uv` we will get a significant
> >> >> > (60 - 70%) improvement in image build times. This will impact
> >> >> > both CI and `breeze` local rebuilds.
> >> >> >
> >> >> > I am getting closer to a mergeable state. I switched to
> >> >> > https://github.com/apache/airflow/pull/37692 to test "upgrade to
> >> >> > latest dependencies" workflow and canary build impact.
> >> >> >
> >> >> > The PR is getting greener and greener. I have a few last things
> >> >> > to address.
> >> >> >
> >> >> > An interesting story is that a flaky test in CLI
> >> >> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
> >> >> > that we had is suddenly significantly more flaky, so I will have
> >> >> > to take a look at how to finally remove the flakiness from it.
> >> >> > This is a good thing because this test had been flaky for quite a
> >> >> > while but it was very difficult to reproduce, and it seems that
> >> >> > for some reason it is now much easier to reproduce (which also
> >> >> > means we will know when we fix it).
> >> >> >
> >> >> > Looking at stats it seems that a lot (but not all) of the speed
> >> >> > improvement might come from parallel downloading of dependencies,
> >> >> > which is in the works also for pip
> >> >> > (https://github.com/pypa/pip/pull/12388), though it's not clear
> >> >> > how much it will help as the Batch Downloader in pip is involved
> >> >> > only after resolution. We will see after it is implemented if it
> >> >> > changes things.
> >> >> >
> >> >> > I am also now switching PROD builds to use uv to see how much we
> >> >> > can save, but I leave `pip` as the default for releases and
> >> >> > users; the only difference is CI. I've added a separate step for
> >> >> > the `pip` PROD build to compare and to make sure it's running
> >> >> > fine in CI.
> >> >> >
> >> >> > The numbers:
> >> >> >
> >> >> > * for the "upgrade to newer dependencies" scenario, uv is WAY
> >> >> > faster, as I thought. In the "current" state of main it is:
> >> >> > ~7m pip, 5 s (!) uv. Here caching of uv makes a huge difference,
> >> >> > and while there is some work in `pip` and resolvelib (looking at
> >> >> > PRs/issues), it's going to be quite some time before we get
> >> >> > similar results from pip. "Upgrade" builds will eventually go
> >> >> > down from 12m to 5m, which is a major improvement, especially
> >> >> > for the elapsed time of CI builds.
> >> >> >
> >> >> > * from what I see package installation is super-fast in uv.
> >> >> > Installing 614 packages takes (wait for it) 1s (!) where I saw it
> >> >> > taking way over a minute with `pip`. This will be hard to beat, I
> >> >> > think, with Python vs. rust.
> >> >> >
> >> >> > Some notes about differences I saw:
> >> >> >
> >> >> > PIP and UV lead to slightly different resolutions when upgrading.
> >> >> > This is not a surprise because different heuristics are involved.
> >> >> > The resolution problem is NP-complete
> >> >> > (https://research.swtch.com/version-sat) and it's very
> >> >> > inefficient to run the full resolution, so pip and uv each take a
> >> >> > slightly different approach to shortcuts and to limiting the
> >> >> > possible space of solutions. I've done a few PRs limiting
> >> >> > (lower-bounding) some dependencies to bring them closer, but in
> >> >> > the end what we get is "correct" in both cases: I continue
> >> >> > running `pip check` to make sure that whatever UV finds is also
> >> >> > correct according to `pip`. Nothing really major there. There
> >> >> > were literally a few cases that required some manual adjustments.
> >> >> > Nothing unmanageable in the future either; I was doing similar
> >> >> > tweaks with `pip` as well to help with the resolution.
> >> >> >
> >> >> > Example of differences (left/first is pip, right/second is uv):
> >> >> >
> >> >> > < importlib-resources==5.13.0
> >> >> > ---
> >> >> > > importlib-resources==6.1.1
> >> >> >
> >> >> > vs.
> >> >> >
> >> >> > < pycountry==23.12.11
> >> >> > ---
> >> >> > > pycountry==22.3.5
> >> >> >
> >> >> > It means that with `uv` we have a newer version of
> >> >> > importlib_resources but an older version of pycountry.
> >> >> >
> >> >> > This one I will handle by bumping pycountry for the facebook
> >> >> > provider to > 23.12, as the old version is 1.5 years old.
> >> >> >
> >> >> > J.
> >> >> >
> >> >> >
> >> >> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala
> >> >> > <hu...@awala.fr>
> >> >> wrote:
> >> >> >
> >> >> >> That's impressive! I love this tool, not only for reducing CI
> >> >> >> time but also for saving the environment.
> >> >> >> Some of the previous improvements were to further parallelize
> >> >> >> CI jobs to complete the CI faster, but this tool will help
> >> >> >> reduce the overall time.
> >> >> >>
> >> >> >> Big +1
> >> >> >>
> >> >> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk
> >> >> >> <ja...@potiuk.com>
> >> >> wrote:
> >> >> >>
> >> >> >> > Hello here.
> >> >> >> >
> >> >> >> > I have a PR https://github.com/apache/airflow/pull/37683 that
> >> >> >> > implements:
> >> >> >> >
> >> >> >> > * ability to choose either uv or PIP when building our images
> >> >> >> > * CI images are built with uv by default (but you can use
> >> >> >> > `--no-use-uv` as a flag and switch back to `pip`)
> >> >> >> > * PROD images are built with pip by default (but you can use
> >> >> >> > `--use-uv` as a flag and switch to uv)
> >> >> >> >
> >> >> >> > The preliminary tests show indeed that uv not only has a much
> >> >> >> > faster baseline, but also that their use of caching fits
> >> >> >> > extremely well into our strategy of building images, and we
> >> >> >> > will get huge improvements of our CI build timing when using
> >> >> >> > uv.
> >> >> >> >
> >> >> >> > Just for the context, our CI images when built are using a
> >> >> >> > caching strategy to optimise for:
> >> >> >> >
> >> >> >> > 1) fast building when there are no changes (around 1 minute
> >> >> >> > to build with pip),
> >> >> >> > 2) slower building when someone adds or modifies a
> >> >> >> > non-conflicting dependency (around 8 minutes to build, out of
> >> >> >> > which ~6m is pip resolution and installation)
> >> >> >> > 3) much longer build time when there are conflicting
> >> >> >> > dependencies, or when we change the Dockerfile or scripts, or
> >> >> >> > when the Python base image changes (around 27 minutes to
> >> >> >> > build, out of which pip resolving is ~20m).
> >> >> >> >
> >> >> >> > Those are all `pip` numbers. Currently `pip` does not use
> >> >> >> > resolution caching between the steps. Comparison of some
> >> >> >> > basic installation steps from initial tests shows that UV is
> >> >> >> > way faster:
> >> >> >> >
> >> >> >> > * Resolving and installing airflow with [devel-ci] (610
> >> >> >> > dependencies): pip ~6m, uv ~1m 30s
> >> >> >> > * Re-resolving and reinstalling [devel-ci] using local
> >> >> >> > pyproject.toml: pip ~4m (cache is not used), uv ~4s (!!!!),
> >> >> >> > because cache is used in this case.
> >> >> >> >
> >> >> >> > I have not yet tested well (but I will once they happen)
> >> >> >> > --eager upgrades of dependencies (pip: very much depends, but
> >> >> >> > it's often in the range of 10 minutes). I expect it not to
> >> >> >> > take more than 2-3 minutes with uv.
> >> >> >> >
> >> >> >> > So overall it looks like we are looking at these improvements:
> >> >> >> >
> >> >> >> > 1) Regular builds with no dependency changes: pip ~1m, uv ~1m
> >> >> >> > (because we are using docker layer caching and pip resolution
> >> >> >> > and installation is not used at all)
> >> >> >> > 2) Updating dependencies: 8m with pip will probably go down
> >> >> >> > with uv to ~3m 30s => 60% improvement, and in many cases
> >> >> >> > ~2.5m when there are no remote changes and cache is used (70%
> >> >> >> > improvement)
> >> >> >> > 3) Re-resolving and reinstalling everything: 27m will
> >> >> >> > probably go down with uv to ~9m => 67% improvement.
> >> >> >> >
> >> >> >> > If those numbers hold and the resolution quality is
> >> >> >> > comparable to `pip`, then well, it's definitely worth it, and
> >> >> >> > the numbers are very close to what the `uv` authors claimed.
> >> >> >> >
> >> >> >> > I am impressed :)
> >> >> >> >
> >> >> >> > J.
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai
> >> >> >> > <amoghdesai.oss@gmail.com> wrote:
> >> >> >> >
> >> >> >> > > I agree with Niko here.
> >> >> >> > >
> >> >> >> > > If someone is willing to give it a try, we should enable it
> >> >> >> > > experimentally and give it a stint for a couple of weeks.
> >> >> >> > > If we see significant results, we can adopt it.
> >> >> >> > >
> >> >> >> > > Thanks & Regards,
> >> >> >> > > Amogh Desai
> >> >> >> > >
> >> >> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
> >> >> >> > > <onikolas@amazon.com.invalid> wrote:
> >> >> >> > >
> >> >> >> > > > The Astral folks also seem very focused on it being a
> >> >> >> > > > drop-in/compliant replacement for pip. So I think it's
> >> >> >> > > > definitely worth dropping it in and seeing if we get the
> >> >> >> > > > expected performance improvements. If tests still pass and
> >> >> >> > > > user-facing constraints and install instructions remain
> >> >> >> > > > unchanged I don't see why not, if someone is willing to
> >> >> >> > > > spend the time on it. Never mind the extra features it
> >> >> >> > > > would give us (I, like others, am also very excited about
> >> >> >> > > > the --resolution=lowest ability).
> >> >> >> > > >
> >> >> >> > > > ________________________________
> >> >> >> > > > From: Andrey Anshin <an...@taragol.is>
> >> >> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
> >> >> >> > > > To: dev@airflow.apache.org
> >> >> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS]
> >> >> >> > > > Considering
> >> >> >> trying
> >> >> >> > > > out uv for our CI workflows
> >> >> >> > > >
> >> >> >> > > > > I share Andrey's skepticism. It's just yet another tool
> >> >> >> > > > > which has an unclear development strategy.
> >> >> >> > > >
> >> >> >> > > > My point was more about a matter of presentation. If
> >> >> >> > > > someone told you "this is a new tool, like a killer of
> >> >> >> > > > previous tools" then you might think "Yeah...yeah...yeah..
> >> >> >> > > > yet another replacement to tool X... not really
> >> >> >> > > > interesting". On the other hand, if someone told you which
> >> >> >> > > > cases it might solve for you, then this might be a mind
> >> >> >> > > > changer.
> >> >> >> > > >
> >> >> >> > > > Especially the promising `--resolution=lowest` option. We
> >> >> >> > > > always want to test something with minimal dependencies
> >> >> >> > > > because we are not sure that it will work with pretty old
> >> >> >> > > > dependencies, and recently I've started to work on a POC
> >> >> >> > > > to collect minimal versions of the Airflow and Providers.
> >> >> >> > > > And at the moment when I had almost finished it, uv was
> >> >> >> > > > released. Well, sometimes it is better to wait a bit and
> >> >> >> > > > maybe someone will invent the same solution 😁 and you
> >> >> >> > > > don't have to spend your personal time.
> >> >> >> > > >
> >> >> >> > > > So as a POC I'm on it. We still need `pip` and need to
> >> >> >> > > > validate some stuff with pip because it is the only
> >> >> >> > > > officially supported way to install Airflow, but if
> >> >> >> > > > something could be improved in the CI then I'm on it. In
> >> >> >> > > > most cases it would be hidden behind Breeze and many of
> >> >> >> > > > the contributors might not even notice that something
> >> >> >> > > > changed.
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk
> >> >> >> > > > <ja...@potiuk.com> wrote:
> >> >> >> > > >
> >> >> >> > > > > Actually, if you read that blog post, the strategy is
> >> >> >> > > > > clear: they aim to create comprehensive packaging
> >> >> >> > > > > tooling, and the improvements are measured (80-100 times
> >> >> >> > > > > they claim, when using caching; they (unlike pip) use a
> >> >> >> > > > > lot of local caching, including for resolving
> >> >> >> > > > > dependencies).
> >> >> >> > > > >
> >> >> >> > > > > So I think both arguments are not valid if you ask me.
> >> >> >> > > > >
> >> >> >> > > > > wt., 20 lut 2024, 02:37 użytkownik Alexander Shorin <
> >> >> >> > kxepal@apache.org
> >> >> >> > > >
> >> >> >> > > > > napisał:
> >> >> >> > > > >
> >> >> >> > > > > > I share Andrey's skepticism. It's just yet another
> >> >> >> > > > > > tool which has an unclear development strategy. Should
> >> >> >> > > > > > you make it a free testing suite? What would the
> >> >> >> > > > > > project receive in exchange? A lot of words about
> >> >> >> > > > > > being faster, but how much? Are these milliseconds
> >> >> >> > > > > > worth changing the stable tool for a new one? And will
> >> >> >> > > > > > it notably improve something?
> >> >> >> > > > > >
> >> >> >> > > > > > I think it's worth trying it just for fun and
> >> >> >> > > > > > providing feedback, but it'll have to go a long way to
> >> >> >> > > > > > become as stable as pip.
> >> >> >> > > > > >
> >> >> >> > > > > > --
> >> >> >> > > > > > ,,,^..^,,,
> >> >> >> > > > > >
> >> >> >> > > > > >
> >> >> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <jarek@potiuk.com> wrote:
> >> >> >> > > > > >
> >> >> >> > > > > > > My opinion:
> >> >> >> > > > > > >
> >> >> >> > > > > > > I think there is a place for a number of such tools.
> >> >> >> > > > > > > For a long time the packaging team and the `pip` team
> >> >> >> > > > > > > have been working not only on the `pip`
> >> >> >> > > > > > > implementation but also (and most importantly) on
> >> >> >> > > > > > > making sure that what `pip` does is the beacon of
> >> >> >> > > > > > > standardisation of packaging APIs and PEPs. It will
> >> >> >> > > > > > > never, IMHO, have a lot of the fancy features that
> >> >> >> > > > > > > other tools might provide (like the ones I
> >> >> >> > > > > > > mentioned). It will always be there to provide the
> >> >> >> > > > > > > robust and solid CLI to run all packaging things, but
> >> >> >> > > > > > > there are plenty of opportunities to provide improved
> >> >> >> > > > > > > or modified, or more (or less) opinionated ways of
> >> >> >> > > > > > > doing things that address some cases that the `pip`
> >> >> >> > > > > > > team simply will not be able or willing to handle,
> >> >> >> > > > > > > preferring a "pure" standard approach over
> >> >> >> > > > > > > implementing all the optional things. For example,
> >> >> >> > > > > > > the way pre-releases are handled can be improved to
> >> >> >> > > > > > > be more selective. The PEP describing it gives the
> >> >> >> > > > > > > tools an option to add more fancy behaviours (some of
> >> >> >> > > > > > > which we could find useful in our CI tooling). Should
> >> >> >> > > > > > > `pip` implement those? I don't think so. It would
> >> >> >> > > > > > > distract maintainers from other, more important
> >> >> >> > > > > > > things. It is quite ok to use other tooling in places
> >> >> >> > > > > > > like our CI, where it does some parts of the
> >> >> >> > > > > > > installation better.
> >> >> >> > > > > > >
> >> >> >> > > > > > > For me `pip` is going more in the direction of a
> >> >> >> > > > > > > `usable reference implementation of a package
> >> >> >> > > > > > > installer` - any standard/PEP will not matter if
> >> >> >> > > > > > > `pip` does not implement it. But others might go in
> >> >> >> > > > > > > different directions and implement some less popular
> >> >> >> > > > > > > features and do it better, faster, with greater
> >> >> >> > > > > > > flexibility. IMHO it's a win-win.
> >> >> >> > > > > > >
> >> >> >> > > > > > > J.
> >> >> >> > > > > > >
> >> >> >> > > > > > >
> >> >> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <andrey.anshin@taragol.is> wrote:
> >> >> >> > > > > > >
> >> >> >> > > > > > > > Yesterday my friend shared that tool with me, and
> >> >> >> > > > > > > > I've been told that presumably it would be a niche
> >> >> >> > > > > > > > tool. I've been told "who needs yet another
> >> >> >> > > > > > > > installer which claims to resolve all your
> >> >> >> > > > > > > > problems".
> >> >> >> > > > > > > > I guess I was wrong?
> >> >> >> > > > > > > >
> >> >> >> > > > > > > >
> >> >> >> > > > > > >
> >> >> >> > > > > >
> >> >> >> > > > >
> >> >> >> > > >
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@airflow.apache.org
> >> For additional commands, e-mail: dev-help@airflow.apache.org
> >>
> >
>

RE: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Damian Shaw <ds...@striketechnologies.com>.
Btw, I've been testing how well the uv resolver works against many real-world examples I have built up over the years while improving the pip resolver. It still can't resolve apache-airflow[all]: https://github.com/astral-sh/uv/issues/1560. I have recommended heuristics that uv could implement, which I think would help here.

But one interesting idea that airflow could look at is periodically running test cases against either "--resolution=lowest" or "--resolution=lowest-direct" to try and flush out bad old dependencies?
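
A minimal sketch of how such a periodic job could assemble its install commands. It only builds the command lines and does not run uv; the helper names and the `apache-airflow[devel-ci]` requirement are illustrative assumptions, while the `--resolution` values are the ones discussed in this thread.

```python
import shlex

# Resolution strategies discussed in this thread for `uv pip install`.
UV_RESOLUTIONS = ("highest", "lowest", "lowest-direct")


def uv_install_command(requirement: str, resolution: str = "highest") -> list[str]:
    """Build (but do not run) a `uv pip install` command line."""
    if resolution not in UV_RESOLUTIONS:
        raise ValueError(f"unknown resolution strategy: {resolution!r}")
    return ["uv", "pip", "install", f"--resolution={resolution}", requirement]


def periodic_lower_bound_jobs(requirement: str) -> list[str]:
    """Shell-quoted commands a scheduled job could run before the test suite.

    Hypothetical helper: the idea of one job per strategy is an assumption.
    """
    return [
        shlex.join(uv_install_command(requirement, strategy))
        for strategy in ("lowest-direct", "lowest")
    ]


if __name__ == "__main__":
    for line in periodic_lower_bound_jobs("apache-airflow[devel-ci]"):
        print(line)
```

Running the tests once per strategy would separate failures caused by Airflow's own lower bounds (`lowest-direct`) from those caused by old transitive dependencies (`lowest`).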

Damian

-----Original Message-----
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Monday, February 26, 2024 7:45 AM
To: dev@airflow.apache.org
Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows

And merged. I will keep an eye on it for the next few days.

On Mon, Feb 26, 2024 at 11:47 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Yes. The difference was because of caching. I forgot to mention that.
> This is due to the way our CI image Docker optimisation works.
>
> The image is constructed so that it first installs dependencies from
> https://github.com/apache/airflow to save on any future re-installs,
> using the Docker caching mechanism (it works the same for `pip` and
> `uv pip`).
>
> It works roughly like this:
>
> LAYER 1: pip install "apache-airflow[devel-ci] @
> https://github.com/apache/airflow/archive/main.tar.gz" <- here we
> install all airflow dependencies in the main AT THE MOMENT THIS LAYER
> WAS BUILT
> LAYER 2: COPY pyproject.toml
> LAYER 3: pip install ".[devel-ci]" --constraint https://.....
>
> It is a very nice way of caching and speeding up adding new
> dependencies. I introduced it years ago and it works very nicely for
> us for remote builds (and local breeze builds make use of it as well).
> It means that ALL dependencies of airflow will be pre-installed as a
> cached layer in the image, regardless of modifications in
> pyproject.toml. So whenever someone modifies pyproject.toml and adds
> new dependencies or modifies the existing ones, LAYER 1 will NOT be
> invalidated, but LAYER 2 (and LAYER 3) will, which means that the
> LAYER 3 pip install will only install new dependencies (and it will
> use the latest constraints for that).
>
> LAYER 1 is only modified when:
>
> a) Python base image changes (every few weeks)
> b) Docker scripts change (those scripts that are COPIED before that
> layer - so for example airflow installation scripts)
> c) DEPENDENCY_EPOCH changes -> we can manually bump it to force the
> reinstallation after we remove some dependency, to make sure it is
> regenerated
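
The invalidation rules above can be modelled as chained cache keys. This is only an illustrative sketch and not how Docker computes its cache: the `BuildInputs` fields and the hashing scheme are assumptions chosen to mirror the three layers described in the message.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildInputs:
    # Hypothetical field names; each stands for one input named in the thread.
    python_base_image: str
    docker_scripts: str
    dependency_epoch: int
    pyproject_toml: str


def _digest(*parts: str) -> str:
    """Short stable digest of the given parts."""
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()[:12]


def layer_keys(inputs: BuildInputs) -> dict[str, str]:
    """Cache key per layer; each key also depends on the layer below it."""
    layer1 = _digest(
        inputs.python_base_image, inputs.docker_scripts, str(inputs.dependency_epoch)
    )
    layer2 = _digest(layer1, inputs.pyproject_toml)  # COPY pyproject.toml
    layer3 = _digest(layer2, "install .[devel-ci] with constraints")
    return {"LAYER 1": layer1, "LAYER 2": layer2, "LAYER 3": layer3}


def invalidated_layers(before: BuildInputs, after: BuildInputs) -> list[str]:
    """Names of the layers whose cache key changed between two builds."""
    old, new = layer_keys(before), layer_keys(after)
    return [name for name in old if old[name] != new[name]]


if __name__ == "__main__":
    base = BuildInputs("base-A", "scripts-1", 1, "deps-1")
    edited = BuildInputs("base-A", "scripts-1", 1, "deps-2")
    print(invalidated_layers(base, edited))
```

Editing only pyproject.toml leaves LAYER 1 cached, while bumping the epoch, the scripts or the base image invalidates every layer on top of it.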
>
> This has the side effect that when you add or modify a dependency, it
> is very fast - instead of reinstalling all 600+ dependencies, they are
> already installed and you only get the diff. Another side effect is
> that the image (between Python base image updates or epoch/docker
> script changes) gets ever so slightly bigger - every time constraints
> are updated and the cache is rebuilt, the image gets updated with new
> dependencies in main, incrementally adding the changed (and only the
> changed) dependencies in LAYER 3. So I was comparing an image where
> LAYER 1 was created some time ago with pip (and LAYER 3 got bigger)
> with a pretty much "from scratch" image where LAYER 1 was "latest
> deps" and LAYER 3 has almost no updated/new dependencies.
>
> That explains the difference. The new image will also get slightly
> bigger in the next few days or weeks, until a new Python base image is
> released or we update the scripts.
>
> Also - all tests pass and that's most important. The CI image is
> exclusively used to run tests, it's not used in production. The
> production image is still using `pip` (I had some problems with PROD
> image building with uv, because it expects a virtualenv rather than
> our --user installation). We might want to fix it some time in the
> future (and uv might add it as a feature in the meantime) - let's
> give `uv` some time to settle :).
>
> So - it has no impact on the user-facing side (at all).
>
> Re: dependencies - after the `uv` build was successful, they were
> updated in our constraints 2 hrs ago:
> https://github.com/apache/airflow/commit/fd64235a481adb4aaff1b2f432eaceb9d0b5c53c
>
> As you can see - the changes are "as expected" - there are a few
> dependencies bumped since yesterday (correctly picked up by uv's
> `highest` resolution strategy). The `uv pip freeze` command for now
> uses the original, non-canonical names of packages (underscores
> instead of dashes) - but that's perfectly fine, those names map to
> the same canonical package names. This will likely get changed in the
> future.
>
> J.
>
>
>
>
>
>
> On Mon, Feb 26, 2024 at 10:18 AM Scheffler Jens (XC-AS/EAE-ADA-T)
> <Je...@de.bosch.com.invalid> wrote:
>
>> @Jarek, had no time to review PR.
>> If the Docker image is ~400MB smaller, I fear there is a diff. Were
>> you able to dump a file list to inspect the diff?
>> If not, I would propose to do it in the PR to understand "why". If
>> there are (only) cache files, then in general it would make sense to
>> think about whether "cache/garbage" is left behind by pip/uv anyway,
>> which we should clean to shrink images.
>>
>> Best regards
>>
>> Jens Scheffler
>>
>> Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T) Robert Bosch GmbH |
>> Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY |
>> http://www.bosch.com/ Tel. +49 711 811-91508 | Mobil +49 160 90417410
>> | Jens.Scheffler@de.bosch.com
>>
>> Registered office: Stuttgart, Registration court: Amtsgericht
>> Stuttgart, HRB 14000; Chairman of the Supervisory Board: Prof. Dr.
>> Stefan Asenkerschbaumer; Managing Directors: Dr. Stefan Hartung, Dr.
>> Christian Fischer, Dr. Markus Forschner, Stefan Grosch, Dr. Markus
>> Heyn, Dr. Frank Meyer, Dr. Tanja Rückert
>>
>> -----Original Message-----
>> From: Jarek Potiuk <ja...@potiuk.com>
>> Sent: Monday, 26 February 2024 08:54
>> To: Amogh Desai <am...@gmail.com>
>> Cc: dev@airflow.apache.org
>> Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows
>>
>> Yep. It all looks good now and I re-ran the last intermittently
>> failing job. Final effect of it:
>>
>> * CI image (uncompressed) with uv is slightly smaller (3.5 GB vs. 3.9 GB)
>> * regular code-only PRs: same time to incrementally build the image, ~1m
>> * adding/modifying a dependency in the PR: 12m -> 6m: 50% improvement
>> * removing a dependency/rebuilding things from scratch: 27m -> 12m: 55% improvement
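
The percentages in the list above follow from the quoted timings. A small sketch of that arithmetic, using only the numbers given in the message (the function and scenario names are illustrative):

```python
def improvement_percent(before_minutes: float, after_minutes: float) -> float:
    """Relative reduction in build time, in percent."""
    if before_minutes <= 0:
        raise ValueError("before_minutes must be positive")
    return (before_minutes - after_minutes) / before_minutes * 100


# Timings quoted in the message above, in minutes: (pip, uv).
SCENARIOS = {
    "add/modify dependency": (12, 6),
    "rebuild from scratch": (27, 12),
}

if __name__ == "__main__":
    for name, (before, after) in SCENARIOS.items():
        saved = improvement_percent(before, after)
        print(f"{name}: {before}m -> {after}m, {saved:.1f}% less time")
```

With these inputs the two scenarios come out at 50.0% and 55.6%, which matches the rounded figures in the list.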
>>
>> Depending on the speed of your network, also locally rebuilding your
>> image should be generally much faster in all cases once we merge it
>> and update cache.
>>
>> Also the flaky test turned out to be really just a "sometimes running
>> much slower than expected" case - I increased the number of retries,
>> gave the test a bit more time and added a better message, so
>> hopefully the flaky test will stop happening now.
>>
>> I think it's a no-brainer :).
>> https://github.com/apache/airflow/pull/37692 waiting for reviews
>>
>> J.
>>
>>
>>
>> On Mon, Feb 26, 2024 at 4:50 AM Amogh Desai
>> <am...@gmail.com>
>> wrote:
>>
>> > Thanks for the superb investigation and effort @Jarek Potiuk
>> > <ja...@potiuk.com>!
>> >
>> > I quite like the performance improvement numbers uv brings in
>> > compared to pip.
>> > I see no reason not to switch to UV in prod images as well.
>> >
>> > I will take a look at the pull request soon.
>> >
>> > Thanks & Regards,
>> > Amogh Desai
>> >
>> > On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >
>> >> I think I will get it green finally:
>> >> https://github.com/apache/airflow/pull/37692.
>> >>
>> >> I know where the test flakiness came from. Generally speaking it
>> >> turned out that there is no free lunch and - of course - the cache
>> >> from uv increased our CI image size significantly (by around 1.5G)
>> >> - and it caused much slower test execution (and the test became
>> >> more flaky because of that). So after looking at that I decided to
>> >> disable the cache - it's definitely not worth it to increase the
>> >> size of our images that much. We still have significant
>> >> improvements (50% - 60%, not the 60% - 70% we had with the cache),
>> >> and that is still significant enough. Without the cache the
>> >> "upgrade" scenario is ~40s (so not 4s any more) instead of 7m with
>> >> pip - so this is still a huge improvement (the image size is even
>> >> smaller than the one with `pip`).
>> >>
>> >>
>> >> J,
>> >>
>> >>
>> >>
>> >> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >>
>> >> > Some more findings.
>> >> >
>> >> > Overall, I can confirm that with `uv` we will get significant
>> >> > (60 - 70%) improvements in image build times. This will impact
>> >> > both CI and local `breeze` rebuilds.
>> >> >
>> >> > I am getting closer to a mergeable state. I switched to
>> >> > https://github.com/apache/airflow/pull/37692 to test the
>> >> > "upgrade to latest dependencies" workflow and canary build
>> >> > impact.
>> >> >
>> >> > The PR is getting greener and greener. I have a few last things
>> >> > to address.
>> >> >
>> >> > An interesting story is that a flaky test in CLI
>> >> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
>> >> > that we had is suddenly significantly more flaky, so I will have
>> >> > to take a look at how to finally remove the flakiness from it.
>> >> > This is a good thing because this test had been flaky for quite
>> >> > a while but it was very difficult to reproduce, and it seems that
>> >> > for some reason it is now much easier to reproduce (which also
>> >> > means we will know when we fix it).
>> >> >
>> >> > Looking at stats, it seems that a lot (but not all) of the speed
>> >> > improvement might come from parallel downloading of dependencies
>> >> > - which is in the works also for pip
>> >> > (https://github.com/pypa/pip/pull/12388) - though it's not clear
>> >> > how much it will help, as the Batch Downloader in pip is involved
>> >> > only after resolution. We will see after it is implemented if it
>> >> > changes things.
>> >> >
>> >> > I am also now switching PROD builds to use uv to see how much we
>> >> > can save, but I leave `pip` as the default for releases and users;
>> >> > the only difference is CI - I've added a separate step for the
>> >> > `pip` PROD build to compare and to make sure it's running fine in
>> >> > CI.
>> >> >
>> >> > The numbers:
>> >> >
>> >> > * for the "upgrade to newer dependencies" scenario - uv is WAY
>> >> > faster - as I thought. In the "current" state of main it is:
>> >> > ~7m pip, 5 s (!) uv. Here the caching of uv makes a huge
>> >> > difference, and while there is some work in `pip` and resolvelib
>> >> > (looking at PRs/issues), it's going to be quite some time before
>> >> > we get similar results from pip. "Upgrade" builds will eventually
>> >> > go down from 12m to 5m - which is a major improvement -
>> >> > especially for the elapsed time of CI builds.
>> >> >
>> >> > * from what I see package installation is super-fast in uv.
>> >> > Installing 614 packages takes (wait for it) 1s (!) where I saw it
>> >> > taking way over a minute with `pip`. This will be hard to beat, I
>> >> > think, with Python vs. Rust.
>> >> >
>> >> > Some notes about differences I saw:
>> >> >
>> >> > PIP and UV lead to slightly different resolutions when upgrading.
>> >> > This is not a surprise because different heuristics are involved
>> >> > (the resolution problem is NP-complete
>> >> > (https://research.swtch.com/version-sat) and it's very
>> >> > inefficient to run the full resolution, so pip and uv take
>> >> > slightly different approaches to shortcuts and to limiting the
>> >> > possible space of solutions). I've done a few PRs limiting
>> >> > (lower-bounding) some dependencies to bring them closer - but in
>> >> > the end what we get is "correct" in both cases - I continue
>> >> > running `pip check` to make sure that whatever UV finds is also
>> >> > correct according to `pip`. Nothing really major there. There
>> >> > were literally a few cases that required some manual adjustments.
>> >> > Nothing unmanageable in the future either; I was doing similar
>> >> > tweaks with `pip` as well to help with the resolution.
>> >> >
>> >> > Example of differences (left/first is pip, right/second is uv):
>> >> >
>> >> > < importlib-resources==5.13.0
>> >> > ---
>> >> > > importlib-resources==6.1.1
>> >> >
>> >> > vs.
>> >> >
>> >> > < pycountry==23.12.11
>> >> > ---
>> >> > > pycountry==22.3.5
>> >> >
>> >> > It means that with `uv` we have a newer version of
>> >> > importlib_resources but an older version of pycountry.
>> >> >
>> >> > This one I will handle by bumping pycountry for the facebook
>> >> > provider to > 23.12, as the old version is 1.5 years old.
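
A comparison like the one above can be scripted. The sketch below is an assumption about how one might do it, not the tooling used in the PR: it parses two `name==version` listings, normalizes names in the PEP 503 style (so `importlib_resources` and `importlib-resources` compare equal) and reports the pins that differ.

```python
import re


def canonical_name(name: str) -> str:
    """Normalize a project name in the PEP 503 style."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(freeze_output: str) -> dict:
    """Map canonical package name -> pinned version for `name==version` lines."""
    pins = {}
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments and non-pinned entries
        name, version = line.split("==", 1)
        pins[canonical_name(name)] = version
    return pins


def pin_differences(pip_freeze: str, uv_freeze: str) -> dict:
    """Return {name: (pip_version, uv_version)} for pins that differ."""
    pip_pins, uv_pins = parse_pins(pip_freeze), parse_pins(uv_freeze)
    return {
        name: (pip_pins.get(name), uv_pins.get(name))
        for name in sorted(pip_pins.keys() | uv_pins.keys())
        if pip_pins.get(name) != uv_pins.get(name)
    }


if __name__ == "__main__":
    # The two differing pins come from the message; "unchanged-package" is made up.
    pip_out = "importlib-resources==5.13.0\npycountry==23.12.11\nunchanged-package==1.0"
    uv_out = "importlib_resources==6.1.1\npycountry==22.3.5\nunchanged-package==1.0"
    for name, (pip_v, uv_v) in pin_differences(pip_out, uv_out).items():
        print(f"{name}: pip={pip_v} uv={uv_v}")
```

Normalizing names first matters here because, as noted elsewhere in the thread, `uv pip freeze` currently prints underscores where pip prints dashes.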
>> >> >
>> >> > J.
>> >> >
>> >> >
>> >> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala
>> >> > <hu...@awala.fr>
>> >> wrote:
>> >> >
>> >> >> That's impressive! I love this tool, not only for reducing CI
>> >> >> time but also for saving the environment.
>> >> >> Some of the previous improvements were to further parallelize
>> >> >> CI jobs to complete the CI faster, but this tool will help
>> >> >> reduce the overall time.
>> >> >>
>> >> >> Big +1
>> >> >>
>> >> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk
>> >> >> <ja...@potiuk.com>
>> >> wrote:
>> >> >>
>> >> >> > Hello here.
>> >> >> >
>> >> >> > I have a PR
>> >> >> >
>> >> >> > https://github.com/apache/airflow/pull/37683 that
>> >> >> implements:
>> >> >> >
>> >> >> > * ability to choose either uv or pip when building our images
>> >> >> > * CI images are built with uv by default (but you can use
>> >> >> > `--no-use-uv` as a flag to switch back to `pip`)
>> >> >> > * PROD images are built with pip by default (but you can use
>> >> >> > `--use-uv` as a flag to switch to uv)
>> >> >> >
>> >> >> > The preliminary tests show indeed that uv not only has a much
>> >> >> > faster baseline, but  also their use of caching fits
>> >> >> > extremely well into our strategy of building images and we
>> >> >> > will get huge improvements of our
>> >> CI
>> >> >> > build timing when using uv.
>> >> >> >
>> >> >> > Just for the context - our CI images when built are using a
>> >> >> > caching strategy to optimise for:
>> >> >> >
>> >> >> > 1) fast building when there are no changes (around 1 minute to
>> >> >> > build with pip),
>> >> >> > 2) slower building when someone adds or modifies a non-conflicting
>> >> >> > dependency (around 8 minutes to build, out of which ~ 6m is pip
>> >> >> > resolution and installation),
>> >> >> > 3) much longer build time when there are conflicting dependencies,
>> >> >> > or when we change the Dockerfile or scripts, or when the Python
>> >> >> > base image changes (around 27 minutes to build, out of which pip
>> >> >> > resolving is ~ 20m).
>> >> >> >
>> >> >> > Those are all `pip` numbers. Currently `pip` does not use
>> >> >> > resolution caching between the steps. Comparison of some basic
>> >> >> > installation steps from initial tests shows that uv is way faster:
>> >> >> >
>> >> >> > * Resolving and installing airflow with [devel-ci] (610
>> >> >> > dependencies): pip ~ 6m, uv ~ 1m 30s
>> >> >> > * Re-resolving and reinstalling [devel-ci] using local
>> >> >> > pyproject.toml: pip ~ 4m (cache is not used), uv ~ 4s (!!!!) -
>> >> >> > because cache is used in this case.
>> >> >> >
>> >> >> > I have not yet properly tested the --eager upgrade of
>> >> >> > dependencies (but I will once such upgrades happen). With pip it
>> >> >> > very much depends, but it is often in the range of 10 minutes; I
>> >> >> > expect it not to take more than 2-3 minutes with uv.
>> >> >> >
>> >> >> > So overall it looks like we are looking at those improvements:
>> >> >> >
>> >> >> > 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
>> >> >> > (because we are using docker layer caching and pip resolution and
>> >> >> > installation is not used at all)
>> >> >> > 2) Updating dependencies: 8m with pip will probably go down with uv
>> >> >> > to ~ 3m 30s => 60% improvement, and in many cases ~ 2.5m when there
>> >> >> > are no remote changes and cache is used (70% improvement)
>> >> >> > 3) Re-resolving and reinstalling everything: 27m will probably go
>> >> >> > down with uv to ~ 9m => 67% improvement.
>> >> >> >
>> >> >> > If those numbers hold and the resolution quality will be
>> >> >> > comparable
>> >> to
>> >> >> > `pip` - then well, it's definitely worth it - and the numbers
>> >> >> > are
>> >> very
>> >> >> > close to what the `uv` authors claimed.
>> >> >> >
>> >> >> > I am impressed :)
>> >> >> >
>> >> >> > J.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <
>> >> amoghdesai.oss@gmail.com>
>> >> >> > wrote:
>> >> >> >
>> >> >> > > I agree with Niko here.
>> >> >> > >
>> >> >> > > If someone is willing to give it a try, we should enable it
>> >> >> > experimentally
>> >> >> > > and give it a stint for a couple of weeks. If we see
>> >> >> > > significant
>> >> >> results,
>> >> >> > > we can adopt it.
>> >> >> > >
>> >> >> > > Thanks & Regards,
>> >> >> > > Amogh Desai
>> >> >> > >
>> >> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
>> >> >> > <onikolas@amazon.com.invalid
>> >> >> > > >
>> >> >> > > wrote:
>> >> >> > >
>> >> >> > > > The Astral folks also seem very focused on it being a
>> >> >> drop-in/compliant
>> >> >> > > > replacement for pip. So I think it's definitely worth
>> >> >> > > > dropping
>> >> it in
>> >> >> > and
>> >> >> > > > seeing if we get the expected performance improvements.
>> >> >> > > > If tests
>> >> >> still
>> >> >> > > pass
>> >> >> > > > and user facing constraints and install instructions
>> >> >> > > > remain
>> >> >> unchanged I
>> >> >> > > > don't see why not, if someone is willing to spend the
>> >> >> > > > time on
>> it.
>> >> >> Never
>> >> >> > > > mind the extra features it would give us (I, like others,
>> >> >> > > > am also
>> >> >> very
>> >> >> > > > excited about --resolution=lowest, ability).
>> >> >> > > >
>> >> >> > > > ________________________________
>> >> >> > > > From: Andrey Anshin <an...@taragol.is>
>> >> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> >> >> > > > To: dev@airflow.apache.org
>> >> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS]
>> >> >> > > > Considering
>> >> >> trying
>> >> >> > > > out uv for our CI workflows
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > > I share Andrey's skepticism. It's just yet another tool
>> >> >> > > > > which
>> >> has
>> >> >> an
>> >> >> > > > unclear
>> >> >> > > > development strategy.
>> >> >> > > >
>> >> >> > > > My point was more about a matter of presentation. If
>> >> >> > > > someone told
>> >> >> you
>> >> >> > > "this
>> >> >> > > > is a new tool, like a killer of previous tools" then you
>> >> >> > > > might
>> >> think
>> >> >> > > > "Yeah...yeah...yeah.. yet another replacement to tool X...
>> >> >> > > > not
>> >> >> really
>> >> >> > > > interesting". On the other hand if someone told you what
>> >> >> > > > in cases
>> >> >> you
>> >> >> > > might
>> >> >> > > > solve, then this might be a mind changer.
>> >> >> > > >
>> >> >> > > > Especially the promising `--resolution=lowest` option. We
>> >> >> > > > always
>> >> >> want
>> >> >> > to
>> >> >> > > > test something with minimal dependencies because we are
>> >> >> > > > not sure
>> >> >> that
>> >> >> > it
>> >> >> > > > might work with pretty old dependencies, and recently
>> >> >> > > > I've
>> >> started
>> >> >> to
>> >> >> > > work
>> >> >> > > > on POC to collect minimal versions of the Airflow and
>> Providers.
>> >> >> And at
>> >> >> > > the
>> >> >> > > > moment when I almost finished it the uv was released.
>> >> >> > > > Well
>> >> >> sometimes it
>> >> >> > > is
>> >> >> > > > better to wait a bit and maybe someone would invent the
>> >> >> > > > same solution 😁 and you don't have to spend a personal time.
>> >> >> > > >
>> >> >> > > > So as POC I'm on it, we still need a `pip` and validate
>> >> >> > > > some
>> >> stuff
>> >> >> by a
>> >> >> > > pip
>> >> >> > > > because it is only one officially supported way to
>> >> >> > > > install
>> >> Airflow
>> >> >> but
>> >> >> > if
>> >> >> > > > something could be improved in the CI then I'm on it, in
>> >> >> > > > most
>> >> cases
>> >> >> it
>> >> >> > > > would be behind of Breeze and many of the contributors
>> >> >> > > > might be
>> >> even
>> >> >> > not
>> >> >> > > > noticed that something changed.
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk
>> >> >> > > > <ja...@potiuk.com>
>> >> >> wrote:
>> >> >> > > >
>> >> >> > > > > Actually - if you read that blog post, the strategy is
>> >> >> > > > > clear - they aim to create comprehensive packaging tooling,
>> >> >> > > > > and the improvements are measured (80-100 times, they claim,
>> >> >> > > > > if using caching - they (unlike pip) use a lot of local
>> >> >> > > > > caching, including for resolving dependencies).
>> >> >> > > > >
>> >> >> > > > > So I think both arguments are not valid if you ask me.
>> >> >> > > > >
>> >> >> > > > > On Tue, 20 Feb 2024 at 02:37, Alexander Shorin
>> >> >> > > > > <kxepal@apache.org> wrote:
>> >> >> > > > >
>> >> >> > > > > > I share Andrey's skepticism. It's just yet another
>> >> >> > > > > > tool which
>> >> >> has
>> >> >> > an
>> >> >> > > > > > unclear development strategy. Should you make it a
>> >> >> > > > > > free
>> >> testing
>> >> >> > > suite?
>> >> >> > > > > What
>> >> >> > > > > > project would receive in exchange? A lot of words
>> >> >> > > > > > about being
>> >> >> > faster,
>> >> >> > > > but
>> >> >> > > > > > how much? Are these milliseconds worth to change the
>> >> >> > > > > > stable
>> >> tool
>> >> >> > > with a
>> >> >> > > > > new
>> >> >> > > > > > one? And will it notably improve something?
>> >> >> > > > > >
>> >> >> > > > > > I think it's worth to try it just for fun and provide
>> >> feedback,
>> >> >> but
>> >> >> > > > it'll
>> >> >> > > > > > have to pass a long road to become such stable as pip.
>> >> >> > > > > >
>> >> >> > > > > > --
>> >> >> > > > > > ,,,^..^,,,
>> >> >> > > > > >
>> >> >> > > > > >
>> >> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <
>> >> jarek@potiuk.com>
>> >> >> > > wrote:
>> >> >> > > > > >
>> >> >> > > > > > > My opinion:
>> >> >> > > > > > >
>> >> >> > > > > > > I think there is a place for a number of such tools.
>> >> >> > > > > > > For a
>> >> >> long
>> >> >> > > time
>> >> >> > > > > the
>> >> >> > > > > > > packaging team and `pip` team have been working not
>> >> >> > > > > > > only on
>> >> >> `pip`
>> >> >> > > > > > > implementation but also (and most importantly) to
>> >> >> > > > > > > make sure
>> >> >> that
>> >> >> > > what
>> >> >> > > > > > `pip`
>> >> >> > > > > > > does is to be the beacon of standardisation of
>> >> >> > > > > > > packaging
>> >> APIs
>> >> >> and
>> >> >> > > > PEPs.
>> >> >> > > > > > It
>> >> >> > > > > > > will never IMHO have a lot of the fancy features
>> >> >> > > > > > > that other
>> >> >> tools
>> >> >> > > > might
>> >> >> > > > > > > provide (like the ones I mentioned). It will always
>> >> >> > > > > > > be
>> >> there
>> >> >> to
>> >> >> > > > provide
>> >> >> > > > > > the
>> >> >> > > > > > > robust and solid CLI to run all packaging things,
>> >> >> > > > > > > but there
>> >> >> are
>> >> >> > > > plenty
>> >> >> > > > > of
>> >> >> > > > > > > opportunities to provide improved or modified, or
>> >> >> > > > > > > more (or
>> >> >> less)
>> >> >> > > > > > > opinionated ways of doing things that are
>> >> >> > > > > > > addressing some
>> >> >> cases
>> >> >> > > that
>> >> >> > > > > > `pip`
>> >> >> > > > > > > team simply will not be able or willing to handle,
>> >> preferring
>> >> >> > > "pure"
>> >> >> > > > > > > standard approach vs. implement all the optional things.
>> >> For
>> >> >> > > example
>> >> >> > > > > the
>> >> >> > > > > > > way how pre-releases are handled can be improved to
>> >> >> > > > > > > be more
>> >> >> > > > selective.
>> >> >> > > > > > The
>> >> >> > > > > > > PEP describing it gives the tools an option to add
>> >> >> > > > > > > more
>> >> fancy
>> >> >> > > > > behaviours
>> >> >> > > > > > > (some of which we could find useful in our CI tooling).
>> >> Should
>> >> >> > > `pip`
>> >> >> > > > > > > implement those - I don't think so. It would
>> >> >> > > > > > > distract
>> >> >> maintainers
>> >> >> > > > from
>> >> >> > > > > > > other more important things. It is quite ok to use
>> >> >> > > > > > > other
>> >> >> tooling
>> >> >> > in
>> >> >> > > > > > places
>> >> >> > > > > > > like our CI, where they do some parts of the
>> >> >> > > > > > > installation
>> >> >> better.
>> >> >> > > > > > >
>> >> >> > > > > > > For me `pip` is going more into the direction of
>> >> >> > > > > > > `usable
>> >> >> > reference
>> >> >> > > > > > > implementation of package installed` - any
>> >> >> > > > > > > standard/ PEP
>> >> will
>> >> >> not
>> >> >> > > > > matter
>> >> >> > > > > > if
>> >> >> > > > > > > `pip` does not implement it. But others might go in
>> >> different
>> >> >> > > > > directions
>> >> >> > > > > > > and implement some less popular features and do it
>> >> >> > > > > > > better,
>> >> >> > faster,
>> >> >> > > > with
>> >> >> > > > > > > greater flexibility. IMHO it's a win-win.
>> >> >> > > > > > >
>> >> >> > > > > > > J.
>> >> >> > > > > > >
>> >> >> > > > > > >
>> >> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
>> >> >> > > > > andrey.anshin@taragol.is
>> >> >> > > > > > >
>> >> >> > > > > > > wrote:
>> >> >> > > > > > >
>> >> >> > > > > > > > Yesterday my friend shared with me that tool and
>> >> >> > > > > > > > I've
>> >> been
>> >> >> told
>> >> >> > > > that
>> >> >> > > > > > more
>> >> >> > > > > > > > presumably it would be a niche tool. I've been
>> >> >> > > > > > > > told "who
>> >> >> needs
>> >> >> > > yet
>> >> >> > > > > > > another
>> >> >> > > > > > > > installer which stands to resolve all your problems'
>> '.
>> >> >> > > > > > > > I guess I was wrong?
>> >> >> > > > > > > >
>> >> >> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <
>> >> >> jarek@potiuk.com>
>> >> >> > > > wrote:
>> >> >> > > > > > > >
>> >> >> > > > > > > > > Hey everyone,
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Few days ago the ruff creators have released a
>> >> >> > > > > > > > > new tool
>> >> >> uv -
>> >> >> > > > which
>> >> >> > > > > is
>> >> >> > > > > > > an
>> >> >> > > > > > > > > extremely fast (written in rust) and fully
>> >> >> > > > > > > > > featured
>> >> tool
>> >> >> > > > generally
>> >> >> > > > > > > fully
>> >> >> > > > > > > > > compatible with `pip`.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Blog post here:
>> >> >> > > > > > > > > https://astral.sh/blog/uv
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > It looks like It has a number of things that
>> >> >> > > > > > > > > would make
>> >> >> our
>> >> >> > CI
>> >> >> > > > > cases
>> >> >> > > > > > > and
>> >> >> > > > > > > > > tooling quite a bit faster and better including
>> >> >> > > > > > > > > a few
>> >> >> things
>> >> >> > > > that I
>> >> >> > > > > > > have
>> >> >> > > > > > > > > implemented some workarounds for and some that
>> >> >> > > > > > > > > I have
>> >> not
>> >> >> > > > > > > > > implemented because `pip` had no good solution.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > I looked at the docs and it solves some
>> >> >> > > > > > > > > problems that
>> >> are
>> >> >> > > > currently
>> >> >> > > > > > > > > difficult or impossible to handle with `pip`:
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > * ability to use overrides (which are
>> >> >> > > > > > > > > constraints on
>> >> >> > steroids -
>> >> >> > > > > > > allowing
>> >> >> > > > > > > > to
>> >> >> > > > > > > > > override limits specified by the packages -
>> >> >> > > > > > > > > this will
>> >> be
>> >> >> very
>> >> >> > > > > useful
>> >> >> > > > > > to
>> >> >> > > > > > > > > better handle our cases with "chicken-egg"
>> >> >> > > > > > > > > providers
>> >> (for
>> >> >> > > example
>> >> >> > > > > > like
>> >> >> > > > > > > we
>> >> >> > > > > > > > > had in FAB) where we have pre-release packages
>> >> depending
>> >> >> on
>> >> >> > > each
>> >> >> > > > > > other
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > * different resolution strategies including
>> >> >> > --resolution=lowest
>> >> >> > > > > which
>> >> >> > > > > > > > will
>> >> >> > > > > > > > > finally allow us to see whether airflow's lower
>> >> >> > > > > > > > > bounds
>> >> are
>> >> >> > > still
>> >> >> > > > > > > holding
>> >> >> > > > > > > > > (i.e. - will our test still pass if we use the
>> >> >> > > > > > > > > lowest
>> >> >> > supported
>> >> >> > > > > > version
>> >> >> > > > > > > > of
>> >> >> > > > > > > > > our dependencies?  this is something i wanted
>> >> >> > > > > > > > > to do for
>> >> >> quite
>> >> >> > > > some
>> >> >> > > > > > time
>> >> >> > > > > > > > and
>> >> >> > > > > > > > > recorded an issue for that -
>> >> >> > > > > > > > > https://github.com/apache/airflow/issues/35549 but
>> >> >> > > > > > > > > lack of tooling support made it a wish, with
>> >> >> > > > > > > > > `--resolution=lowest` it seems like super-easy thing
>> >> >> > > > > > > > > to do.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > * It is said to be many, many times faster -
>> >> >> > > > > > > > > with
>> >> better
>> >> >> > > caching
>> >> >> > > > > and
>> >> >> > > > > > > > > resolution speeds (similarly like with ruff
>> >> >> > > > > > > > > they claim
>> >> >> orders
>> >> >> > > of
>> >> >> > > > > > > > magnitude
>> >> >> > > > > > > > > speedups in a number of cases). We can likely
>> >> >> > > > > > > > > make very
>> >> >> good
>> >> >> > > use
>> >> >> > > > of
>> >> >> > > > > > it
>> >> >> > > > > > > > and
>> >> >> > > > > > > > > speed up some parts of our CI workflow
>> significantly.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > I might likely do some experimenting with uv in
>> >> >> > > > > > > > > our
>> >> >> > toolchain,
>> >> >> > > > but
>> >> >> > > > > > > wanted
>> >> >> > > > > > > > > to make sure we are all aware of it - and ask
>> >> >> > > > > > > > > if
>> >> someone
>> >> >> has
>> >> >> > > > > > something
>> >> >> > > > > > > > > against it (and maybe someone would like to do
>> >> >> > > > > > > > > some
>> >> work
>> >> >> > there
>> >> >> > > > > trying
>> >> >> > > > > > > it
>> >> >> > > > > > > > > out - I will be happy to guide others with the
>> >> dev/tooling
>> >> >> > > > mindset
>> >> >> > > > > > and
>> >> >> > > > > > > > > incline to do some changes there/review PRs and
>> >> cooperate
>> >> >> on
>> >> >> > > > > testing
>> >> >> > > > > > > > those
>> >> >> > > > > > > > > things.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > It's not a user-facing change, and I do not
>> >> >> > > > > > > > > think we
>> >> want
>> >> >> to
>> >> >> > > get
>> >> >> > > > > rid
>> >> >> > > > > > of
>> >> >> > > > > > > > > `pip` as an installation tool in general (in
>> >> >> > > > > > > > > our images
>> >> >> and
>> >> >> > > user
>> >> >> > > > > > facing
>> >> >> > > > > > > > > side) - it's mostly an internal CI tooling
>> >> >> > > > > > > > > improvement
>> >> I
>> >> >> am
>> >> >> > > > > thinking
>> >> >> > > > > > > of.
>> >> >> > > > > > > > > Maybe at some point in time we can recommend it
>> >> >> > > > > > > > > also
>> >> for
>> >> >> > > > > development
>> >> >> > > > > > > > > workflows, and maybe someday it will gain
>> >> >> > > > > > > > > enough
>> >> >> popularity
>> >> >> > to
>> >> >> > > > > think
>> >> >> > > > > > > > about
>> >> >> > > > > > > > > recommending it to our users, but definitely
>> >> >> > > > > > > > > not now
>> >> nor
>> >> >> in
>> >> >> > > even
>> >> >> > > > > > > mid-term
>> >> >> > > > > > > > > future.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Let me know what you think.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Repo here:
>> >> >> > > > > > > > > https://github.com/astral-sh/uv
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > J.
>> >> >> > > > > > > > >
>> >> >> > > > > > > >
>> >> >> > > > > > >
>> >> >> > > > > >
>> >> >> > > > >
>> >> >> > > >
>> >> >> > >
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@airflow.apache.org
>> For additional commands, e-mail: dev-help@airflow.apache.org
>>
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
And merged. I will keep an eye on it for the next few days.

On Mon, Feb 26, 2024 at 11:47 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Yes. The difference was because of caching. I forgot to mention that. This
> is due to the way our CI image docker optimisation works
>
> The way the image is constructed is that it first installs
> dependencies from https://github.com/apache/airflow to save on any future
> re-installs - using the docker caching mechanism (it works the same for
> `pip` and `uv pip`).
>
> It works roughly like this:
>
> LAYER 1: pip install "apache-airflow[devel-ci] @
> https://github.com/apache/airflow/archive/main.tar.gz" <- here we install
> all airflow dependencies in the main AT THE MOMENT THIS LAYER WAS BUILT
> LAYER 2: COPY pyproject.toml
> LAYER 3: pip install ".[devel-ci]" --constraint https://.....
>
> It is a very nice way of caching and speeding up adding new dependencies I
> introduced years ago that works very nicely for us for remote builds (so
> local breeze builds are making use of it as well) - this means that ALL
> dependencies of airflow will be pre-installed as a cached layer in the
> image, regardless of modification in pyproject.toml. So whenever
> someone modifies pyproject.toml and adds new dependencies or modifies the
> existing ones - LAYER 1 will NOT be invalidated, but LAYER 2 (and LAYER 3)
> will - which means that the LAYER 3 pip install will only install new
> dependencies (and it will use latest constraints for that).
>
> LAYER 1 is only modified when:
>
> a) Python base image changes (every few weeks)
> b) Docker scripts change (those scripts that are COPIED before that layer
> - so for example airflow installation scripts).
> c) DEPENDENCY_EPOCH change -> we can manually bump it to force the
> reinstallation after we remove some dependency, to make sure it is
> regenerated
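
A rough sketch of the invalidation rule described above, assuming the simplified model that a layer is rebuilt when one of its own inputs changes or when any earlier layer was rebuilt. The input labels (`python_base_image`, `docker_scripts`, `dependency_epoch`) are illustrative names for conditions a) to c), not identifiers from the actual Dockerfile:

```python
# Each layer is listed with the inputs that invalidate it directly.
# This models Docker's cache behaviour only roughly: once one layer is
# rebuilt, every later layer is rebuilt as well.
LAYERS = [
    ("LAYER 1", {"python_base_image", "docker_scripts", "dependency_epoch"}),
    ("LAYER 2", {"pyproject.toml"}),
    ("LAYER 3", set()),  # rebuilt only as a consequence of LAYER 2
]


def rebuilt_layers(changed_inputs):
    """Return the names of the layers that must be rebuilt, in build order."""
    rebuilt = []
    invalidated = False
    for name, inputs in LAYERS:
        if invalidated or inputs & set(changed_inputs):
            invalidated = True
            rebuilt.append(name)
    return rebuilt


# Someone adds a dependency: LAYER 1 (all pre-installed deps) stays cached.
print(rebuilt_layers({"pyproject.toml"}))
# Bumping the epoch forces a full reinstall.
print(rebuilt_layers({"dependency_epoch"}))
```

The point of the layout is visible in the first call: a change to pyproject.toml never touches the expensive first layer.
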
>
> This has the side effect that when you add or modify a dependency, it is
> very fast - instead of reinstalling all 600+ dependencies, they are already
> installed and you only get the diff.
>
> Another side effect is that the image (between Python base image updates
> or epoch/docker script changes) gets ever bigger - every time the
> constraints are updated and the cache is rebuilt, the image gets
> updated with new dependencies in main, incrementally adding the changed
> (and only the changed) dependencies in LAYER 3. So I was comparing an image
> where LAYER 1 was created some time ago with pip (and LAYER 3 got bigger)
> with a pretty much "from scratch" image where LAYER 1 was "latest deps" and
> LAYER 3 has almost no updated/new dependencies.
>
> That explains the difference. The new image will also get slightly bigger
> in the next few days or weeks, until a new Python base image is released or
> we will update the scripts.
>
> Also - all tests pass and that's most important. The CI image is
> exclusively used to run tests, it's not used in production. The production
> image is still using `pip` (I had some problems with PROD image building
> with uv - because it expects a virtualenv rather than our --user
> installation). We might want to fix it some time in the future (and uv might
> add it as a feature in the meantime) - let's give `uv` some time to settle :).
>
> So - it has no impact on the user-facing side (at all).
>
> Re: dependencies - after `uv` was successful, they were updated in our
> constraints 2 hrs ago here:
> https://github.com/apache/airflow/commit/fd64235a481adb4aaff1b2f432eaceb9d0b5c53c
>
> As you can see - the changes are "as expected" - there are a few
> dependencies bumped since yesterday (correctly picked up by uv's
> highest-version resolution mechanism). The `uv pip freeze` command for now
> prints the original, non-canonical names of packages (underscores instead of
> dashes) - but that's perfectly fine, as those names are treated as their
> canonical equivalents. This will likely get changed in the future.
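
For reference, the name normalization rule from PEP 503 is small enough to sketch here. The regular expression is the one given in the PEP; the function name `canonicalize` is just for illustration (the `packaging` library ships a similar `canonicalize_name` helper):

```python
import re


def canonicalize(name):
    """Normalize a project name as described in PEP 503.

    Runs of '-', '_' and '.' collapse to a single '-', and the result
    is lowercased, so differently spelled names compare equal.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


# Underscore and dash spellings end up identical after normalization.
print(canonicalize("importlib_resources"))
print(canonicalize("importlib-resources"))
```

Comparing constraint files generated by different tools is easier if both sides are normalized this way first.
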
>
> J.
>
>
>
>
>
>
> On Mon, Feb 26, 2024 at 10:18 AM Scheffler Jens (XC-AS/EAE-ADA-T)
> <Je...@de.bosch.com.invalid> wrote:
>
>> @Jarek, had no time to review PR.
>> If the Docker image is ~400MB smaller, I fear there is a diff. Were you
>> able to dump a file list to inspect the diff?
>> If not, I would propose doing it in the PR to understand "why". If these
>> are (only) cache files, then in general it would make sense to think about
>> whether "cache/garbage" is left behind by pip/uv anyway, which we should
>> clean to shrink the images.
>>
>> Mit freundlichen Grüßen / Best regards
>>
>> Jens Scheffler
>>
>> Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
>> Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen |
>> GERMANY | http://www.bosch.com/
>> Tel. +49 711 811-91508 | Mobil +49 160 90417410 |
>> Jens.Scheffler@de.bosch.com
>>
>> Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart, HRB 14000;
>> Aufsichtsratsvorsitzender: Prof. Dr. Stefan Asenkerschbaumer;
>> Geschäftsführung: Dr. Stefan Hartung, Dr. Christian Fischer, Dr. Markus
>> Forschner,
>> Stefan Grosch, Dr. Markus Heyn, Dr. Frank Meyer, Dr. Tanja Rückert
>>
>> -----Original Message-----
>> From: Jarek Potiuk <ja...@potiuk.com>
>> Sent: Monday, 26 February 2024 08:54
>> To: Amogh Desai <am...@gmail.com>
>> Cc: dev@airflow.apache.org
>> Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows
>>
>> Yep. It all looks good now and I re-ran last intermittently failing job:
>> Final effect of it:
>>
>> * CI image (uncompressed) with uv is slightly smaller (3.5 GB vs. 3.9 GB)
>> * regular code only PRs: same time to incrementally build image ~ 1m
>> * adding/modifying dependency in the PR: 12m -> 6m: 50% improvement
>> * removing dependency/rebuilding things from scratch: 27m -> 12m: 55%
>> improvement
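
The percentages above follow from simple arithmetic on the before/after build times. A small sketch, assuming times are given in minutes (the function name is illustrative):

```python
def improvement_percent(before_minutes, after_minutes):
    """Relative reduction in build time, as a percentage with one decimal."""
    return round(100 * (before_minutes - after_minutes) / before_minutes, 1)


# Figures quoted in the list above.
print(improvement_percent(12, 6))   # adding/modifying a dependency
print(improvement_percent(27, 12))  # rebuilding from scratch
```

The second case works out to roughly 55.6%, which the message above reports as 55%.
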
>>
>> Depending on the speed of your network, also locally rebuilding your
>> image should be generally much faster in all cases once we merge it and
>> update cache.
>>
>> Also the flaky test turned out to be really just "sometimes running much
>> slower than expected" case - I increased the number of retries and gave the
>> test a bit more time and added better message, so hopefully the flaky test
>> will stop happening now.
>>
>> I think it's a no-brainer :).
>> https://github.com/apache/airflow/pull/37692 waiting for reviews
>>
>> J.
>>
>>
>>
>> On Mon, Feb 26, 2024 at 4:50 AM Amogh Desai <am...@gmail.com>
>> wrote:
>>
>> > Thanks for the superb investigation and effort @Jarek Potiuk
>> > <ja...@potiuk.com>!
>> >
>> > I quite like the performance improvement numbers uv brings in compared
>> > to pip.
>> > I see no reason not to switch to UV in prod images as well.
>> >
>> > I will take a look at the pull request soon.
>> >
>> > Thanks & Regards,
>> > Amogh Desai
>> >
>> > On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >
>> >> I think I will get it green finally:
>> >> https://github.com/apache/airflow/pull/37692.
>> >>
>> >> I know where the test flakiness came from. Generally speaking, it
>> >> turned out that there is no free lunch and - of course - the cache from
>> >> uv increased our CI image size significantly (by around 1.5G), which
>> >> caused much slower test execution (and the test became more flaky
>> >> because of that). So after looking at that I decided to disable the
>> >> cache - it's definitely not worth increasing the size of our
>> >> images that much. We still have significant improvements (50% - 60%,
>> >> not the 60% - 70% we had with the cache). Without the cache the
>> >> "upgrade" scenario takes ~40s (so no 4s any more) instead of 7m with
>> >> pip - so this is still a huge improvement (and the image is even
>> >> smaller than the one built with `pip`).
>> >>
>> >>
>> >> J,
>> >>
>> >>
>> >>
>> >> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >>
>> >> > Some more findings.
>> >> >
>> >> > Overall, I can confirm that with `uv` we will get significant (60
>> >> > - 70%) improvements in image build times. This will impact both CI
>> >> > and `breeze` local rebuilds.
>> >> >
>> >> > I am getting closer to a mergeable state. I switched to
>> >> > https://github.com/apache/airflow/pull/37692
>> >> > to test the "upgrade to latest dependencies" workflow and canary build impact.
>> >> >
>> >> > The PR is getting greener and greener. I have a few last things to
>> >> > address.
>> >> >
>> >> > An interesting story is that a flaky test in the CLI
>> >> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
>> >> > that we had is suddenly significantly more flaky, so I will have to take
>> >> > a look at how to finally remove the flakiness from it.
>> >> > This is a good thing, because this test had been flaky for quite a
>> >> > while but it was very difficult to reproduce, and it seems that for
>> >> > some reason it is now much easier to reproduce (which also means we
>> >> > will know when we fix it).
>> >> >
>> >> > Looking at the stats, it seems that a lot (but not all) of the speed
>> >> > improvement might come from parallel downloading of dependencies -
>> >> > which is also in the works for pip
>> >> > (https://github.com/pypa/pip/pull/12388) - though it's not clear how
>> >> > much it will help, as the batch downloader in pip is involved only after
>> >> > resolution. We will see after it is implemented if it changes things.
>> >> >
>> >> > I am also now switching PROD builds to use uv to see how much we
>> >> > can save, but I leave `pip` as the default for releases and users; the
>> >> > only difference is CI - I've added a separate step for the `pip` PROD
>> >> > build to compare and to make sure it's running fine in CI.
>> >> >
>> >> > The numbers:
>> >> >
>> >> > * for the "upgrade to newer dependencies" scenario - uv is WAY faster -
>> >> > as I thought. In the "current" state of main it is: ~7m pip, 5s (!) uv.
>> >> > Here caching of uv makes a huge difference, and while there is some
>> >> > work in `pip` and resolvelib (looking at PRs/issues), it's going to be
>> >> > quite some time before we get similar results from pip. "Upgrade" builds
>> >> > will eventually go down from 12m to 5m - which is a major improvement -
>> >> > especially for the elapsed time of CI builds.
>> >> >
>> >> > * from what I see, package installation is super-fast in uv.
>> >> > Installing 614 packages takes (wait for it) 1s (!) where I saw it taking
>> >> > way over a minute with `pip`. This will be hard to beat, I think, with
>> >> > Python vs. Rust.
>> >> >
>> >> > Some notes about differences I saw:
>> >> >
>> >> > pip and uv lead to slightly different resolutions when upgrading.
>> >> > This is not a surprise because different heuristics are involved. The
>> >> > resolution problem is NP-complete
>> >> > (https://research.swtch.com/version-sat) and it's very inefficient
>> >> > to run the full resolution, so pip and uv each take a slightly
>> >> > different approach to shortcuts and to limiting the possible space of
>> >> > solutions. I've done a few PRs limiting (lower-bounding) some
>> >> > dependencies to bring the results closer, but in the end what we get is
>> >> > "correct" in both cases - I continue running `pip check` to make sure
>> >> > that whatever uv finds is also correct according to `pip`. Nothing
>> >> > really major there. There were literally a few cases that required
>> >> > some manual adjustments. Nothing unmanageable in the future either; I
>> >> > was doing similar tweaks with `pip` as well to help with the resolution.
>> >> >
>> >> > Example of differences (left/first is pip, right/second is uv):
>> >> >
>> >> > < importlib-resources==5.13.0
>> >> > ---
>> >> > > importlib-resources==6.1.1
>> >> >
>> >> > vs.
>> >> >
>> >> > < pycountry==23.12.11
>> >> > ---
>> >> > > pycountry==22.3.5
>> >> >
>> >> > It means that with `uv` we have a newer version of
>> >> > importlib_resources but an older version of pycountry.
>> >> >
>> >> > This one I will handle by bumping pycountry in the facebook
>> >> > provider to > 23.12, as the old version is 1.5 years old.
>> >> >
>> >> > J.
>> >> >
>> >> >
>> >> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr>
>> >> wrote:
>> >> >
>> >> >> That's impressive! I love this tool, not only for reducing CI time
>> >> >> but also for saving the environment.
>> >> >> Some of the previous improvements were to further parallelize CI
>> >> >> jobs to complete the CI faster, but this tool will help reduce the
>> >> >> overall time.
>> >> >>
>> >> >> Big +1
>> >> >>
>> >> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com>
>> >> wrote:
>> >> >>
>> >> >> > Hello here.
>> >> >> >
>> >> >> > I have a PR
>> >> >> > https://github.com/apache/airflow/pull/37683 that implements:
>> >> >> >
>> >> >> > * ability to choose either uv or pip when building our images
>> >> >> > * CI images are built with uv by default (but you can use
>> >> >> > `--no-use-uv` as a flag to switch back to `pip`)
>> >> >> > * PROD images are built with pip by default (but you can use
>> >> >> > `--use-uv` as a flag to switch to uv)
>> >> >> >
>> >> >> > The preliminary tests show indeed that uv not only has a much
>> >> >> > faster baseline, but also that their use of caching fits extremely
>> >> >> > well into our strategy of building images, and we will get huge
>> >> >> > improvements in our CI build timing when using uv.
>> >> >> >
>> >> >> > Just for context - our CI images are built using a
>> >> >> > caching strategy that optimises for:
>> >> >> >
>> >> >> > 1) fast building when there are no changes (around 1 minute to
>> >> >> > build with pip),
>> >> >> > 2) slower building when someone adds or modifies a non-conflicting
>> >> >> > dependency (around 8 minutes to build, out of which ~6m is pip
>> >> >> > resolution and installation),
>> >> >> > 3) much longer build time when there are conflicting
>> >> >> > dependencies, or when we change the Dockerfile or scripts, or when
>> >> >> > the Python base image changes (around 27 minutes to build, out of
>> >> >> > which pip resolving is ~20m).
>> >> >> >
>> >> >> > Those are all `pip` numbers. Currently `pip` does not use
>> >> >> > resolution caching between the steps. Comparison of some basic
>> >> >> > installation steps from initial tests shows that uv is way faster:
>> >> >> >
>> >> >> > * Resolving and installing airflow with [devel-ci] (610
>> >> >> > dependencies): pip ~6m, uv ~1m 30s
>> >> >> > * Re-resolving and reinstalling [devel-ci] using local
>> >> >> > pyproject.toml: pip ~4m (cache is not used), uv ~4s (!!!!) - because
>> >> >> > the cache is used in this case.
>> >> >> >
>> >> >> > I have not yet properly tested the --eager upgrade of dependencies
>> >> >> > (but I will once they happen). With pip it very much depends, but it
>> >> >> > is often in the range of 10 minutes; I expect it not to take more
>> >> >> > than 2-3 minutes with uv.
>> >> >> >
>> >> >> > So overall it looks like we are looking at these improvements:
>> >> >> >
>> >> >> > 1) Regular builds with no dependency changes: pip ~1m, uv ~1m
>> >> >> > (because we are using docker layer caching and pip resolution
>> >> >> > and installation are not used at all)
>> >> >> > 2) Updating dependencies: 8m with pip will probably go down with
>> >> >> > uv to ~3m 30s => 60% improvement, and in many cases to ~2.5m when
>> >> >> > there are no remote changes and the cache is used (70% improvement)
>> >> >> > 3) Re-resolving and reinstalling everything: 27m will probably
>> >> >> > go down with uv to ~9m => 67% improvement.
>> >> >> >
>> >> >> > If those numbers hold and the resolution quality is
>> >> >> > comparable to `pip` - then well, it's definitely worth it - and the
>> >> >> > numbers are very close to what the `uv` authors claimed.
>> >> >> >
>> >> >> > I am impressed :)
>> >> >> >
>> >> >> > J.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <
>> >> amoghdesai.oss@gmail.com>
>> >> >> > wrote:
>> >> >> >
>> >> >> > > I agree with Niko here.
>> >> >> > >
>> >> >> > > If someone is willing to give it a try, we should enable it
>> >> >> > experimentally
>> >> >> > > and give it a stint for a couple of weeks. If we see
>> >> >> > > significant
>> >> >> results,
>> >> >> > > we can adopt it.
>> >> >> > >
>> >> >> > > Thanks & Regards,
>> >> >> > > Amogh Desai
>> >> >> > >
>> >> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
>> >> >> > <onikolas@amazon.com.invalid
>> >> >> > > >
>> >> >> > > wrote:
>> >> >> > >
>> >> >> > > > The Astral folks also seem very focused on it being a
>> >> >> drop-in/compliant
>> >> >> > > > replacement for pip. So I think it's definitely worth
>> >> >> > > > dropping
>> >> it in
>> >> >> > and
>> >> >> > > > seeing if we get the expected performance improvements. If
>> >> >> > > > tests
>> >> >> still
>> >> >> > > pass
>> >> >> > > > and user facing constraints and install instructions remain
>> >> >> unchanged I
>> >> >> > > > don't see why not, if someone is willing to spend the time on
>> it.
>> >> >> Never
>> >> >> > > > mind the extra features it would give us (I, like others, am
>> >> >> > > > also
>> >> >> very
>> >> >> > > > excited about --resolution=lowest, ability).
>> >> >> > > >
>> >> >> > > > ________________________________
>> >> >> > > > From: Andrey Anshin <an...@taragol.is>
>> >> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> >> >> > > > To: dev@airflow.apache.org
>> >> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS]
>> >> >> > > > Considering
>> >> >> trying
>> >> >> > > > out uv for our CI workflows
>> >> >> > > >
>> >> >> > > > > I share Andrey's skepticism. It's just yet another tool
>> >> >> > > > > which
>> >> has
>> >> >> an
>> >> >> > > > unclear
>> >> >> > > > development strategy.
>> >> >> > > >
>> >> >> > > > My point was more about a matter of presentation. If someone
>> >> >> > > > told
>> >> >> you
>> >> >> > > "this
>> >> >> > > > is a new tool, like a killer of previous tools" then you
>> >> >> > > > might
>> >> think
>> >> >> > > > "Yeah...yeah...yeah.. yet another replacement to tool X...
>> >> >> > > > not
>> >> >> really
>> >> >> > > > interesting". On the other hand if someone told you what in
>> >> >> > > > cases
>> >> >> you
>> >> >> > > might
>> >> >> > > > solve, then this might be a mind changer.
>> >> >> > > >
>> >> >> > > > Especially the promising `--resolution=lowest` option. We
>> >> >> > > > always
>> >> >> want
>> >> >> > to
>> >> >> > > > test something with minimal dependencies because we are not
>> >> >> > > > sure
>> >> >> that
>> >> >> > it
>> >> >> > > > might work with pretty old dependencies, and recently I've
>> >> started
>> >> >> to
>> >> >> > > work
>> >> >> > > > on POC to collect minimal versions of the Airflow and
>> Providers.
>> >> >> And at
>> >> >> > > the
>> >> >> > > > moment when I almost finished it the uv was released. Well
>> >> >> sometimes it
>> >> >> > > is
>> >> >> > > > better to wait a bit and maybe someone would invent the same
>> >> >> > > > solution 😁 and you don't have to spend a personal time.
>> >> >> > > >
>> >> >> > > > So as POC I'm on it, we still need a `pip` and validate some
>> >> stuff
>> >> >> by a
>> >> >> > > pip
>> >> >> > > > because it is only one officially supported way to install
>> >> Airflow
>> >> >> but
>> >> >> > if
>> >> >> > > > something could be improved in the CI then I'm on it, in
>> >> >> > > > most
>> >> cases
>> >> >> it
>> >> >> > > > would be behind of Breeze and many of the contributors might
>> >> >> > > > be
>> >> even
>> >> >> > not
>> >> >> > > > noticed that something changed.
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk
>> >> >> > > > <ja...@potiuk.com>
>> >> >> wrote:
>> >> >> > > >
>> >> >> > > > > Actually - if you read that blog post, the strategy is
>> >> >> > > > > clear - they aim to create comprehensive packaging tooling,
>> >> >> > > > > and the improvements are measured (80-100 times, they claim,
>> >> >> > > > > when using caching - they (unlike pip) use a lot of local
>> >> >> > > > > caching, including for resolving dependencies).
>> >> >> > > > >
>> >> >> > > > > So I think both arguments are not valid if you ask me.
>> >> >> > > > >
>> >> >> > > > > On Tue, Feb 20, 2024 at 02:37, Alexander Shorin
>> >> >> > > > > <kxepal@apache.org> wrote:
>> >> >> > > > >
>> >> >> > > > > > I share Andrey's skepticism. It's just yet another tool
>> >> >> > > > > > which
>> >> >> has
>> >> >> > an
>> >> >> > > > > > unclear development strategy. Should you make it a free
>> >> testing
>> >> >> > > suite?
>> >> >> > > > > What
>> >> >> > > > > > project would receive in exchange? A lot of words about
>> >> >> > > > > > being
>> >> >> > faster,
>> >> >> > > > but
>> >> >> > > > > > how much? Are these milliseconds worth to change the
>> >> >> > > > > > stable
>> >> tool
>> >> >> > > with a
>> >> >> > > > > new
>> >> >> > > > > > one? And will it notably improve something?
>> >> >> > > > > >
>> >> >> > > > > > I think it's worth to try it just for fun and provide
>> >> feedback,
>> >> >> but
>> >> >> > > > it'll
>> >> >> > > > > > have to pass a long road to become such stable as pip.
>> >> >> > > > > >
>> >> >> > > > > > --
>> >> >> > > > > > ,,,^..^,,,
>> >> >> > > > > >
>> >> >> > > > > >
>> >> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <
>> >> jarek@potiuk.com>
>> >> >> > > wrote:
>> >> >> > > > > >
>> >> >> > > > > > > My opinion:
>> >> >> > > > > > >
>> >> >> > > > > > > I think there is a place for a number of such tools.
>> >> >> > > > > > > For a
>> >> >> long
>> >> >> > > time
>> >> >> > > > > the
>> >> >> > > > > > > packaging team and `pip` team have been working not
>> >> >> > > > > > > only on
>> >> >> `pip`
>> >> >> > > > > > > implementation but also (and most importantly) to make
>> >> >> > > > > > > sure
>> >> >> that
>> >> >> > > what
>> >> >> > > > > > `pip`
>> >> >> > > > > > > does is to be the beacon of standardisation of
>> >> >> > > > > > > packaging
>> >> APIs
>> >> >> and
>> >> >> > > > PEPs.
>> >> >> > > > > > It
>> >> >> > > > > > > will never IMHO have a lot of the fancy features that
>> >> >> > > > > > > other
>> >> >> tools
>> >> >> > > > might
>> >> >> > > > > > > provide (like the ones I mentioned). It will always be
>> >> there
>> >> >> to
>> >> >> > > > provide
>> >> >> > > > > > the
>> >> >> > > > > > > robust and solid CLI to run all packaging things, but
>> >> >> > > > > > > there
>> >> >> are
>> >> >> > > > plenty
>> >> >> > > > > of
>> >> >> > > > > > > opportunities to provide improved or modified, or more
>> >> >> > > > > > > (or
>> >> >> less)
>> >> >> > > > > > > opinionated ways of doing things that are addressing
>> >> >> > > > > > > some
>> >> >> cases
>> >> >> > > that
>> >> >> > > > > > `pip`
>> >> >> > > > > > > team simply will not be able or willing to handle,
>> >> preferring
>> >> >> > > "pure"
>> >> >> > > > > > > standard approach vs. implement all the optional things.
>> >> For
>> >> >> > > example
>> >> >> > > > > the
>> >> >> > > > > > > way how pre-releases are handled can be improved to be
>> >> >> > > > > > > more
>> >> >> > > > selective.
>> >> >> > > > > > The
>> >> >> > > > > > > PEP describing it gives the tools an option to add
>> >> >> > > > > > > more
>> >> fancy
>> >> >> > > > > behaviours
>> >> >> > > > > > > (some of which we could find useful in our CI tooling).
>> >> Should
>> >> >> > > `pip`
>> >> >> > > > > > > implement those - I don't think so. It would distract
>> >> >> maintainers
>> >> >> > > > from
>> >> >> > > > > > > other more important things. It is quite ok to use
>> >> >> > > > > > > other
>> >> >> tooling
>> >> >> > in
>> >> >> > > > > > places
>> >> >> > > > > > > like our CI, where they do some parts of the
>> >> >> > > > > > > installation
>> >> >> better.
>> >> >> > > > > > >
>> >> >> > > > > > > For me `pip` is going more into the direction of
>> >> >> > > > > > > `usable
>> >> >> > reference
>> >> >> > > > > > > implementation of package installed` - any standard/
>> >> >> > > > > > > PEP
>> >> will
>> >> >> not
>> >> >> > > > > matter
>> >> >> > > > > > if
>> >> >> > > > > > > `pip` does not implement it. But others might go in
>> >> different
>> >> >> > > > > directions
>> >> >> > > > > > > and implement some less popular features and do it
>> >> >> > > > > > > better,
>> >> >> > faster,
>> >> >> > > > with
>> >> >> > > > > > > greater flexibility. IMHO it's a win-win.
>> >> >> > > > > > >
>> >> >> > > > > > > J.
>> >> >> > > > > > >
>> >> >> > > > > > >
>> >> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
>> >> >> > > > > andrey.anshin@taragol.is
>> >> >> > > > > > >
>> >> >> > > > > > > wrote:
>> >> >> > > > > > >
>> >> >> > > > > > > > Yesterday my friend shared with me that tool and
>> >> >> > > > > > > > I've
>> >> been
>> >> >> told
>> >> >> > > > that
>> >> >> > > > > > more
>> >> >> > > > > > > > presumably it would be a niche tool. I've been told
>> >> >> > > > > > > > "who
>> >> >> needs
>> >> >> > > yet
>> >> >> > > > > > > another
>> >> >> > > > > > > > installer which stands to resolve all your problems'
>> '.
>> >> >> > > > > > > > I guess I was wrong?
>> >> >> > > > > > > >
>> >> >> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <
>> >> >> jarek@potiuk.com>
>> >> >> > > > wrote:
>> >> >> > > > > > > >
>> >> >> > > > > > > > > Hey everyone,
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Few days ago the ruff creators have released a new
>> >> >> > > > > > > > > tool
>> >> >> uv -
>> >> >> > > > which
>> >> >> > > > > is
>> >> >> > > > > > > an
>> >> >> > > > > > > > > extremely fast (written in rust) and fully
>> >> >> > > > > > > > > featured
>> >> tool
>> >> >> > > > generally
>> >> >> > > > > > > fully
>> >> >> > > > > > > > > compatible with `pip`.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Blog post here: https://astral.sh/blog/uv
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > It looks like It has a number of things that would
>> >> >> > > > > > > > > make
>> >> >> our
>> >> >> > CI
>> >> >> > > > > cases
>> >> >> > > > > > > and
>> >> >> > > > > > > > > tooling quite a bit faster and better including a
>> >> >> > > > > > > > > few
>> >> >> things
>> >> >> > > > that I
>> >> >> > > > > > > have
>> >> >> > > > > > > > > implemented some workarounds for and some that I
>> >> >> > > > > > > > > have
>> >> not
>> >> >> > > > > > > > > implemented because `pip` had no good solution.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > I looked at the docs and it solves some problems
>> >> >> > > > > > > > > that
>> >> are
>> >> >> > > > currently
>> >> >> > > > > > > > > difficult or impossible to handle with `pip`:
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > * ability to use overrides (which are constraints
>> >> >> > > > > > > > > on
>> >> >> > steroids -
>> >> >> > > > > > > allowing
>> >> >> > > > > > > > to
>> >> >> > > > > > > > > override limits specified by the packages - this
>> >> >> > > > > > > > > will
>> >> be
>> >> >> very
>> >> >> > > > > useful
>> >> >> > > > > > to
>> >> >> > > > > > > > > better handle our cases with "chicken-egg"
>> >> >> > > > > > > > > providers
>> >> (for
>> >> >> > > example
>> >> >> > > > > > like
>> >> >> > > > > > > we
>> >> >> > > > > > > > > had in FAB) where we have pre-release packages
>> >> depending
>> >> >> on
>> >> >> > > each
>> >> >> > > > > > other
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > * different resolution strategies including
>> >> >> > --resolution=lowest
>> >> >> > > > > which
>> >> >> > > > > > > > will
>> >> >> > > > > > > > > finally allow us to see whether airflow's lower
>> >> >> > > > > > > > > bounds
>> >> are
>> >> >> > > still
>> >> >> > > > > > > holding
>> >> >> > > > > > > > > (i.e. - will our test still pass if we use the
>> >> >> > > > > > > > > lowest
>> >> >> > supported
>> >> >> > > > > > version
>> >> >> > > > > > > > of
>> >> >> > > > > > > > > our dependencies?  this is something i wanted to
>> >> >> > > > > > > > > do for
>> >> >> quite
>> >> >> > > > some
>> >> >> > > > > > time
>> >> >> > > > > > > > and
>> >> >> > > > > > > > > recorded an issue for that -
>> >> >> > > > > > > > > https://github.com/apache/airflow/issues/35549
>> >> >> > > > > > > > > but lack of tooling support made it a wish; with
>> >> >> > > > > > > > > `--resolution=lowest` it seems like a super-easy
>> >> >> > > > > > > > > thing to do.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > * It is said to be many, many times faster - with
>> >> better
>> >> >> > > caching
>> >> >> > > > > and
>> >> >> > > > > > > > > resolution speeds (similarly like with ruff they
>> >> >> > > > > > > > > claim
>> >> >> orders
>> >> >> > > of
>> >> >> > > > > > > > magnitude
>> >> >> > > > > > > > > speedups in a number of cases). We can likely make
>> >> >> > > > > > > > > very
>> >> >> good
>> >> >> > > use
>> >> >> > > > of
>> >> >> > > > > > it
>> >> >> > > > > > > > and
>> >> >> > > > > > > > > speed up some parts of our CI workflow
>> significantly.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > I might likely do some experimenting with uv in
>> >> >> > > > > > > > > our
>> >> >> > toolchain,
>> >> >> > > > but
>> >> >> > > > > > > wanted
>> >> >> > > > > > > > > to make sure we are all aware of it - and ask if
>> >> someone
>> >> >> has
>> >> >> > > > > > something
>> >> >> > > > > > > > > against it (and maybe someone would like to do
>> >> >> > > > > > > > > some
>> >> work
>> >> >> > there
>> >> >> > > > > trying
>> >> >> > > > > > > it
>> >> >> > > > > > > > > out - I will be happy to guide others with the
>> >> dev/tooling
>> >> >> > > > mindset
>> >> >> > > > > > and
>> >> >> > > > > > > > > incline to do some changes there/review PRs and
>> >> cooperate
>> >> >> on
>> >> >> > > > > testing
>> >> >> > > > > > > > those
>> >> >> > > > > > > > > things.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > It's not a user-facing change, and I do not think
>> >> >> > > > > > > > > we
>> >> want
>> >> >> to
>> >> >> > > get
>> >> >> > > > > rid
>> >> >> > > > > > of
>> >> >> > > > > > > > > `pip` as an installation tool in general (in our
>> >> >> > > > > > > > > images
>> >> >> and
>> >> >> > > user
>> >> >> > > > > > facing
>> >> >> > > > > > > > > side) - it's mostly an internal CI tooling
>> >> >> > > > > > > > > improvement
>> >> I
>> >> >> am
>> >> >> > > > > thinking
>> >> >> > > > > > > of.
>> >> >> > > > > > > > > Maybe at some point in time we can recommend it
>> >> >> > > > > > > > > also
>> >> for
>> >> >> > > > > development
>> >> >> > > > > > > > > workflows, and maybe someday it will gain enough
>> >> >> popularity
>> >> >> > to
>> >> >> > > > > think
>> >> >> > > > > > > > about
>> >> >> > > > > > > > > recommending it to our users, but definitely not
>> >> >> > > > > > > > > now
>> >> nor
>> >> >> in
>> >> >> > > even
>> >> >> > > > > > > mid-term
>> >> >> > > > > > > > > future.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Let me know what you think.
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > Repo here: https://github.com/astral-sh/uv
>> >> >> > > > > > > > >
>> >> >> > > > > > > > > J.
>> >> >> > > > > > > > >
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Yes. The difference was because of caching; I forgot to mention that. This
is due to the way our CI image docker optimisation works.

The image is constructed so that it first installs dependencies
from https://github.com/apache/airflow to save on any future re-installs,
using the docker caching mechanism (it works the same for `pip` and `uv pip`).

It works roughly like this:

LAYER 1: pip install "apache-airflow[devel-ci] @
https://github.com/apache/airflow/archive/main.tar.gz" <- here we install
all airflow dependencies in main AT THE MOMENT THIS LAYER WAS BUILT
LAYER 2: COPY pyproject.toml
LAYER 3: pip install ".[devel-ci]" --constraint https://.....

It is a very nice way of caching and speeding up adding new dependencies that I
introduced years ago, and it works very nicely for us for remote builds (local
breeze builds make use of it as well). It means that ALL
dependencies of airflow will be pre-installed as a cached layer in the
image, regardless of modifications in pyproject.toml. So whenever
someone modifies pyproject.toml and adds new dependencies or modifies the
existing ones, LAYER 1 will NOT be invalidated, but LAYER 2 (and LAYER 3)
will - which means that the LAYER 3 pip install will only install new
dependencies (and it will use the latest constraints for that).

LAYER 1 is only rebuilt when:

a) the Python base image changes (every few weeks)
b) Docker scripts change (those scripts that are COPIED before that layer -
so for example airflow installation scripts)
c) DEPENDENCY_EPOCH changes -> we can manually bump it to force the
reinstallation after we remove some dependency, to make sure it is
regenerated
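
To illustrate the invalidation rules above, here is a minimal Python sketch of
a simplified layer-cache model. It is not how Docker computes cache keys
internally; it only assumes that each layer's key depends on its parent's key
plus its own inputs. The input names and values (base_image, install_scripts,
dependency_epoch, pyproject) are hypothetical stand-ins for the real triggers.

```python
import hashlib

# Simplified model of the three layers: (description, inputs that feed the layer).
LAYERS = [
    ("LAYER 1: install deps from main", ["base_image", "install_scripts", "dependency_epoch"]),
    ("LAYER 2: COPY pyproject.toml", ["pyproject"]),
    ("LAYER 3: install .[devel-ci]", []),
]


def layer_keys(inputs):
    """Return one cache key per layer, chaining the parent key with the layer's inputs."""
    keys = []
    parent = ""
    for name, deps in LAYERS:
        digest = hashlib.sha256(parent.encode())
        digest.update(name.encode())
        for dep in deps:
            digest.update(inputs[dep].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_layers(before, after):
    """Return the 1-based numbers of the layers whose cache key changed."""
    old, new = layer_keys(before), layer_keys(after)
    return [i + 1 for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Hypothetical input values, for illustration only.
before = {
    "base_image": "python-base-A",
    "install_scripts": "scripts-v1",
    "dependency_epoch": "1",
    "pyproject": "deps-v1",
}

# Editing pyproject.toml leaves LAYER 1 cached and rebuilds LAYER 2 and 3.
print(rebuilt_layers(before, {**before, "pyproject": "deps-v2"}))

# Bumping the epoch invalidates LAYER 1 and therefore everything after it.
print(rebuilt_layers(before, {**before, "dependency_epoch": "2"}))
```

The chaining is the important part: a layer can only be reused if everything
before it is unchanged, which is why keeping pyproject.toml out of LAYER 1
keeps the expensive full install cached.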

This has the side effect that adding or modifying a dependency is
very fast - instead of reinstalling all 600+ dependencies, they are already
installed and you only get the diff.
Another side effect is that the image (between Python base image updates
or epoch/docker script changes) gets ever bigger -
every time constraints are updated and the cache is rebuilt,
the image gets updated with new dependencies in main,
incrementally adding the changed (and only the changed) dependencies in
LAYER 3. So I was comparing an image where LAYER 1 was created some time
ago with pip (and LAYER 3 got bigger) with a pretty much "from scratch"
image where LAYER 1 was "latest deps" and LAYER 3 has almost no updated/
new dependencies.

That explains the difference. The new image will also get slightly bigger
in the next few days or weeks, until a new Python base image is released or
we update the scripts.

Also - all tests pass and that's most important. The CI image is
exclusively used to run tests, it's not used in production. The production
image is still using `pip` (I had some problems with PROD image building
with uv - because it expects a virtualenv rather than the --user
installation we use). We might want to fix it some time in the future (and
uv might add it as a feature in the meantime) - let's give `uv` some time
to settle :).

So - it has no impact on the user-facing side (at all).

Re: dependencies - after the `uv` build was successful, our constraints
were updated here:
https://github.com/apache/airflow/commit/fd64235a481adb4aaff1b2f432eaceb9d0b5c53c
2 hrs ago.

As you can see - the changes are "as expected" - there are a few
dependencies bumped since yesterday (correctly picked up by uv's "highest"
resolution mechanism). The `uv pip freeze` command for now uses the
original, non-canonical names of packages (underscores instead of
dashes) - but that's perfectly fine, those names map to the same
canonical package names. This will likely get changed in the future.
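
To illustrate why the underscore/dash difference does not matter: name
normalization as described in PEP 503 maps both spellings to the same
canonical name. Below is a small sketch of comparing two freeze outputs
that way. The helper names (`canonical_name`, `parse_freeze`) and the
sample pins are assumptions made up for the example, not something taken
from our scripts.

```python
import re


def canonical_name(name: str) -> str:
    """Normalize a project name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_freeze(text: str) -> dict[str, str]:
    """Map canonical package name -> pinned version for `name==version` lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not a simple pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[canonical_name(name)] = version
    return pins


# Illustrative sample data: dashes in one output, underscores in the other.
dash_style = "importlib-resources==6.1.1\npycountry==23.12.11"
underscore_style = "importlib_resources==6.1.1\npycountry==23.12.11"

print(parse_freeze(dash_style) == parse_freeze(underscore_style))  # True
```

So comparing constraint files by canonical name gives the same result
regardless of which tool produced the freeze output.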

J.






On Mon, Feb 26, 2024 at 10:18 AM Scheffler Jens (XC-AS/EAE-ADA-T)
<Je...@de.bosch.com.invalid> wrote:

> @Jarek, had no time to review PR.
> If the Docker image is ~400MB smaller, I fear there is a diff. Were you
> able to dump a file list to inspect the diff?
> If not I would propose to make it in the PR to understand "why". If there
> are cache files (only) then in general it would make sense to think about
> if "cache/garbage" is anyway left in pip/uv which we should clean to shrink
> images.
>
> Mit freundlichen Grüßen / Best regards
>
> Jens Scheffler
>
> Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
> Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen |
> GERMANY | http://www.bosch.com/
> Tel. +49 711 811-91508 | Mobil +49 160 90417410 |
> Jens.Scheffler@de.bosch.com
>
>
> -----Original Message-----
> From: Jarek Potiuk <ja...@potiuk.com>
> Sent: Monday, 26 February 2024 08:54
> To: Amogh Desai <am...@gmail.com>
> Cc: dev@airflow.apache.org
> Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows
>
> Yep. It all looks good now and I re-ran last intermittently failing job:
> Final effect of it:
>
> * CI image (uncompressed) with uv is slightly smaller (3.5 GB vs. 3.9 GB)
> * regular code-only PRs: same time to incrementally build image ~ 1m
> * adding/modifying dependency in the PR: 12m -> 6m: 50% improvement
> * removing dependency/rebuilding things from scratch: 27m -> 12m: 55%
> improvement
>
> Depending on the speed of your network, also locally rebuilding your image
> should be generally much faster in all cases once we merge it and update
> cache.
>
> Also the flaky test turned out to be really just "sometimes running much
> slower than expected" case - I increased the number of retries and gave the
> test a bit more time and added better message, so hopefully the flaky test
> will stop happening now.
>
> I think it's a no-brainer :).
> https://github.com/apache/airflow/pull/37692 waiting for reviews
>
> J.
>
>
>
> On Mon, Feb 26, 2024 at 4:50 AM Amogh Desai <am...@gmail.com>
> wrote:
>
> > Thanks for the superb investigation and effort @Jarek Potiuk
> > <ja...@potiuk.com>!
> >
> > I quite like the performance improvement numbers uv brings in compared
> > to pip.
> > I see no reason not to switch to UV in prod images as well.
> >
> > I will take a look at the pull request soon.
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> > On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> >> I think I will get it green finally:
> >> https://github.com/apache/airflow/pull/37692.
> >>
> >> I know where the test flakiness was from. Generally speaking it
> >> turned out that there is no free lunch and - of course - cache from
> >> uv increased our CI image size significantly (by around 1.5G) - and
> >> it caused much slower test execution (and test became more flaky
> >> because of that). So after looking at that I decided to disable the
> >> cache - it's definitely not worth it to increase the size of our
> >> images that much. We still have significant improvements (50% - 60%,
> >> not the 60% - 70% like we had with cache), and that's still significant
> >> enough. Without cache the "upgrade" scenario is ~ 40s (so no 4s any
> >> more) instead of 7m with pip - so this is still a huge improvement
> >> (image size is even smaller than the one with `pip`).
> >>
> >>
> >> J,
> >>
> >>
> >>
> >> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>
> >> > Some more findings.
> >> >
> >> > Overall, I can confirm that with `uv` we will get significant (60
> >> > - 70%) improvements in image build times. This will impact both CI
> >> > and `breeze` local rebuilds.
> >> >
> >> > I am getting closer to a mergeable state. I switched to
> >> > https://github.com/apache/airflow/pull/37692
> >> > to test "upgrade to latest dependencies" workflow and canary build impact.
> >> >
> >> > The PR is getting greener and greener. I have a few last things to
> >> > address.
> >> >
> >> > An interesting story is that a flaky test in CLI
> >> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
> >> > we had is suddenly significantly more flaky, so I will have to take a
> >> > look at how to finally remove the flakiness from it.
> >> > This is a good thing because this test had been flaky for quite a
> >> > while but it was very difficult to reproduce and it seems that for
> >> > some reason it is now much easier to reproduce (which also means we
> >> > will know when we fix it).
> >> >
> >> > Looking at stats it seems that a lot (but not all) of the speed
> >> > improvement might come from parallel downloading of dependencies -
> >> > which is in the works also for pip
> >> > (https://github.com/pypa/pip/pull/12388) - though it's not clear how
> >> > much it will help as the Batch Downloader in pip is involved only after
> >> > resolution. We will see after it is implemented if it changes things.
> >> >
> >> > I am also now switching PROD builds to use uv to see how much we can
> >> > save, but I leave `pip` as default for releases and users, the only
> >> > difference is CI - I've added a separate step for the `pip` PROD build
> >> > to compare and to make sure it's running fine in CI.
> >> >
> >> > The numbers:
> >> >
> >> > * for "upgrade to newer dependencies" scenario - uv is WAY faster -
> >> > as I thought. In the "current" stage of the main it is: ~7m pip, 5 s
> >> > (!) uv. Here caching of uv makes a huge difference, and while there is
> >> > some work in `pip` and resolvelib (looking at PRs/issues) it's going to
> >> > be quite some time to get similar results from pip. "Upgrade" builds
> >> > will go down eventually from 12m to 5m - which is a major improvement -
> >> > especially for elapsed time of CI builds.
> >> >
> >> > * from what I see package installation is super-fast in uv. Installing
> >> > 614 packages takes (wait for it) 1s (!) where I saw it taking way over
> >> > a minute with `pip`. This will be hard to beat I think with Python vs.
> >> > rust.
> >> >
> >> > Some notes about differences I saw:
> >> >
> >> > PIP and UV lead to slightly different resolutions when upgrading. This
> >> > is not a surprise because different heuristics are involved (the
> >> > resolution problem is NP-complete, see
> >> > https://research.swtch.com/version-sat, and it's very inefficient to
> >> > run the full resolution, so both pip and uv take a slightly different
> >> > approach to shortcuts and limiting the possible space of solutions).
> >> > I've done a few PRs limiting (lower-bounding) some dependencies to
> >> > bring them closer - but at the end what we get is "correct" in both
> >> > cases - I continue running `pip check` to make sure that whatever UV
> >> > finds is also correct according to `pip`. Nothing really major
> >> > there. There were literally a few cases that required some manual
> >> > adjustments. Nothing unmanageable also in the future, I was doing
> >> > similar tweaks with `pip` as well to help with the resolution.
> >> >
> >> > Example of differences (left/first is pip, right/second is uv):
> >> >
> >> > < importlib-resources==5.13.0
> >> > ---
> >> > > importlib-resources==6.1.1
> >> >
> >> > vs.
> >> >
> >> > < pycountry==23.12.11
> >> > ---
> >> > > pycountry==22.3.5
> >> >
> >> > It means that with `uv` we have a newer version of
> >> > importlib_resources but an older version of pycountry.
> >> >
> >> > This one I will handle by bumping pycountry in case of facebook
> >> > provider to > 23.12 as the old version is 1.5 years old.
> >> >
> >> > J.
> >> >
> >> >
> >> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr>
> >> wrote:
> >> >
> >> >> That's impressive! I love this tool, not only for reducing CI time
> >> >> but also for saving the environment.
> >> >> Some of the previous improvements were to further parallelize CI
> >> >> jobs
> >> to
> >> >> complete the CI faster, but this tool will help reduce the overall
> >> time.
> >> >>
> >> >> Big +1
> >> >>
> >> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com>
> >> wrote:
> >> >>
> >> >> > Hello here.
> >> >> >
> >> >> > I have a PR https://github.com/apache/airflow/pull/37683 that
> >> >> > implements:
> >> >> >
> >> >> > * ability to choose either uv or PIP when building our images
> >> >> > * CI images are built with uv by default (but you can use
> >> >> > `--no-use-uv` as a flag and switch back to `pip`)
> >> >> > * PROD images are built with pip by default (but you can use
> >> >> > `--use-uv` as a flag and switch to uv)
> >> >> >
> >> >> > The preliminary tests show indeed that uv not only has a much
> >> >> > faster baseline, but also their use of caching fits extremely
> >> >> > well into our strategy of building images and we will get huge
> >> >> > improvements of our CI build timing when using uv.
> >> >> >
> >> >> > Just for the context - our CI images when built are using a
> >> >> > caching strategy to optimise for:
> >> >> >
> >> >> > 1) fast building when there are no changes (around 1 minute to
> >> >> > build with pip),
> >> >> > 2) slower building when someone adds or modifies a non-conflicting
> >> >> > dependency (around 8 minutes to build, out of which ~ 6m is pip
> >> >> > resolution and installation)
> >> >> > 3) much longer build time when there are conflicting
> >> >> > dependencies or when we change Dockerfile or scripts or when the
> >> >> > Python base image changes (around 27 minutes build out of which pip
> >> >> > resolving is ~ 20m).
> >> >> >
> >> >> > Those are all `pip` numbers. Currently `pip` does not use
> >> >> > resolution caching between the steps. Comparison of some basic
> >> >> > installation steps from initial tests shows that UV is way faster:
> >> >> >
> >> >> > * Resolving and installing airflow with [devel-ci] (610
> >> >> > dependencies): pip ~ 6m, uv ~ 1m 30s
> >> >> > * Re-resolving and reinstalling [devel-ci] using local
> >> >> > pyproject.toml: pip ~ 4m (cache is not used), uv ~ 4s (!!!!) -
> >> >> > because cache is used in this case.
> >> >> >
> >> >> > I have not yet tested well (but I will once they happen) --eager
> >> >> > upgrade of dependencies (pip - very much depends but it's often in
> >> >> > the range of 10 minutes) - I expect it not to take more than 2-3
> >> >> > minutes with uv
> >> >> > So overall it looks like we are looking at those improvements:
> >> >> >
> >> >> > 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
> >> >> > (because we are using docker layer caching and pip resolution
> >> >> > and installation is not used at all)
> >> >> > 2) Updating dependencies: 8m with pip will probably go down with
> >> >> > uv to ~ 3m 30s => 60% improvement and in many cases ~ 2.5m when
> >> >> > there are no remote changes and cache is used (70% improvement)
> >> >> > 3) Re-resolving and reinstalling everything: 27m will probably
> >> >> > go down with uv to ~ 9m => 67% improvement.
> >> >> >
> >> >> > If those numbers hold and the resolution quality will be
> >> >> > comparable to `pip` - then well, it's definitely worth it - and the
> >> >> > numbers are very close to what the `uv` authors claimed.
> >> >> >
> >> >> > I am impressed :)
> >> >> >
> >> >> > J.
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <
> >> amoghdesai.oss@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> > > I agree with Niko here.
> >> >> > >
> >> >> > > If someone is willing to give it a try, we should enable it
> >> >> > experimentally
> >> >> > > and give it a stint for a couple of weeks. If we see
> >> >> > > significant
> >> >> results,
> >> >> > > we can adopt it.
> >> >> > >
> >> >> > > Thanks & Regards,
> >> >> > > Amogh Desai
> >> >> > >
> >> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
> >> >> > <onikolas@amazon.com.invalid
> >> >> > > >
> >> >> > > wrote:
> >> >> > >
> >> >> > > > The Astral folks also seem very focused on it being a
> >> >> > > > drop-in/compliant replacement for pip. So I think it's
> >> >> > > > definitely worth dropping it in and seeing if we get the
> >> >> > > > expected performance improvements. If tests still pass and
> >> >> > > > user facing constraints and install instructions remain
> >> >> > > > unchanged I don't see why not, if someone is willing to spend
> >> >> > > > the time on it. Never mind the extra features it would give us
> >> >> > > > (I, like others, am also very excited about the
> >> >> > > > --resolution=lowest ability).
> >> >> > > >
> >> >> > > > ________________________________
> >> >> > > > From: Andrey Anshin <an...@taragol.is>
> >> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
> >> >> > > > To: dev@airflow.apache.org
> >> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS]
> >> >> > > > Considering
> >> >> trying
> >> >> > > > out uv for our CI workflows
> >> >> > > >
> >> >> > > > > I share Andrey's skepticism. It's just yet another tool
> >> >> > > > > which has an unclear development strategy.
> >> >> > > >
> >> >> > > > My point was more about a matter of presentation. If someone
> >> >> > > > told you "this is a new tool, like a killer of previous tools"
> >> >> > > > then you might think "Yeah...yeah...yeah.. yet another
> >> >> > > > replacement to tool X... not really interesting". On the other
> >> >> > > > hand if someone told you which cases it might solve, then this
> >> >> > > > might be a mind changer.
> >> >> > > >
> >> >> > > > Especially the promising `--resolution=lowest` option. We
> >> >> > > > always want to test something with minimal dependencies
> >> >> > > > because we are not sure that it might work with pretty old
> >> >> > > > dependencies, and recently I've started to work on a POC to
> >> >> > > > collect minimal versions of the Airflow and Providers. And at
> >> >> > > > the moment when I almost finished it the uv was released. Well
> >> >> > > > sometimes it is better to wait a bit and maybe someone would
> >> >> > > > invent the same solution 😁 and you don't have to spend
> >> >> > > > personal time.
> >> >> > > >
> >> >> > > > So as POC I'm on it, we still need `pip` and to validate some
> >> >> > > > stuff with pip because it is the only officially supported way
> >> >> > > > to install Airflow but if something could be improved in the
> >> >> > > > CI then I'm on it, in most cases it would be behind Breeze and
> >> >> > > > many of the contributors might not even notice that something
> >> >> > > > changed.
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk
> >> >> > > > <ja...@potiuk.com>
> >> >> wrote:
> >> >> > > >
> >> >> > > > > Actually - if you read that blog post, the strategy is
> >> >> > > > > clear - they aim to create a comprehensive packaging
> >> >> > > > > tooling and improvements are measured (80-100 times they
> >> >> > > > > claim - using caching - they (unlike pip) use a lot of
> >> >> > > > > local caching including resolving dependencies).
> >> >> > > > >
> >> >> > > > > So I think both arguments are not valid if you ask me.
> >> >> > > > >
> >> >> > > > > On Tue, 20 Feb 2024, 02:37 Alexander Shorin
> >> >> > > > > <kxepal@apache.org> wrote:
> >> >> > > > >
> >> >> > > > > > I share Andrey's skepticism. It's just yet another tool
> >> >> > > > > > which has an unclear development strategy. Should you make
> >> >> > > > > > it a free testing suite? What would the project receive in
> >> >> > > > > > exchange? A lot of words about being faster, but how much?
> >> >> > > > > > Are these milliseconds worth changing the stable tool for
> >> >> > > > > > a new one? And will it notably improve something?
> >> >> > > > > >
> >> >> > > > > > I think it's worth trying it just for fun and providing
> >> >> > > > > > feedback, but it'll have to pass a long road to become as
> >> >> > > > > > stable as pip.
> >> >> > > > > >
> >> >> > > > > > --
> >> >> > > > > > ,,,^..^,,,
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <
> >> jarek@potiuk.com>
> >> >> > > wrote:
> >> >> > > > > >
> >> >> > > > > > > My opinion:
> >> >> > > > > > >
> >> >> > > > > > > I think there is a place for a number of such tools. For
> >> >> > > > > > > a long time the packaging team and `pip` team have been
> >> >> > > > > > > working not only on `pip` implementation but also (and
> >> >> > > > > > > most importantly) to make sure that what `pip` does is
> >> >> > > > > > > to be the beacon of standardisation of packaging APIs
> >> >> > > > > > > and PEPs. It will never IMHO have a lot of the fancy
> >> >> > > > > > > features that other tools might provide (like the ones I
> >> >> > > > > > > mentioned). It will always be there to provide the
> >> >> > > > > > > robust and solid CLI to run all packaging things, but
> >> >> > > > > > > there are plenty of opportunities to provide improved or
> >> >> > > > > > > modified, or more (or less) opinionated ways of doing
> >> >> > > > > > > things that are addressing some cases that `pip` team
> >> >> > > > > > > simply will not be able or willing to handle, preferring
> >> >> > > > > > > "pure" standard approach vs. implement all the optional
> >> >> > > > > > > things. For example the way how pre-releases are handled
> >> >> > > > > > > can be improved to be more selective. The PEP describing
> >> >> > > > > > > it gives the tools an option to add more fancy
> >> >> > > > > > > behaviours (some of which we could find useful in our CI
> >> >> > > > > > > tooling). Should `pip` implement those - I don't think
> >> >> > > > > > > so. It would distract maintainers from other more
> >> >> > > > > > > important things. It is quite ok to use other tooling in
> >> >> > > > > > > places like our CI, where they do some parts of the
> >> >> > > > > > > installation better.
> >> >> > > > > > >
> >> >> > > > > > > For me `pip` is going more into the direction of `usable
> >> >> > > > > > > reference implementation of package installer` - any
> >> >> > > > > > > standard/PEP will not matter if `pip` does not implement
> >> >> > > > > > > it. But others might go in different directions and
> >> >> > > > > > > implement some less popular features and do it better,
> >> >> > > > > > > faster, with greater flexibility. IMHO it's a win-win.
> >> >> > > > > > >
> >> >> > > > > > > J.
> >> >> > > > > > >
> >> >> > > > > > >
> >> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
> >> >> > > > > andrey.anshin@taragol.is
> >> >> > > > > > >
> >> >> > > > > > > wrote:
> >> >> > > > > > >
> >> >> > > > > > > > Yesterday my friend shared with me that tool and I've
> >> >> > > > > > > > been told that more presumably it would be a niche
> >> >> > > > > > > > tool. I've been told "who needs yet another installer
> >> >> > > > > > > > which stands to resolve all your problems".
> >> >> > > > > > > > I guess I was wrong?
> >> >> > > > > > > >
> >> >> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <
> >> >> jarek@potiuk.com>
> >> >> > > > wrote:
> >> >> > > > > > > >
> >> >> > > > > > > > > Hey everyone,
> >> >> > > > > > > > >
> >> >> > > > > > > > > Few days ago the ruff creators have released a new tool uv - which is an
> >> >> > > > > > > > > extremely fast (written in rust) and fully featured tool generally fully
> >> >> > > > > > > > > compatible with `pip`.
> >> >> > > > > > > > >
> >> >> > > > > > > > > Blog post here: https://astral.sh/blog/uv
> >> >> > > > > > > > >
> >> >> > > > > > > > > It looks like It has a number of things that would make our CI cases and
> >> >> > > > > > > > > tooling quite a bit faster and better including a few things that I have
> >> >> > > > > > > > > implemented some workarounds for and some that I have not
> >> >> > > > > > > > > implemented because `pip` had no good solution.
> >> >> > > > > > > > >
> >> >> > > > > > > > > I looked at the docs and it solves some problems that are currently
> >> >> > > > > > > > > difficult or impossible to handle with `pip`:
> >> >> > > > > > > > >
> >> >> > > > > > > > > * ability to use overrides (which are constraints on steroids - allowing to
> >> >> > > > > > > > > override limits specified by the packages - this will be very useful to
> >> >> > > > > > > > > better handle our cases with "chicken-egg" providers (for example like we
> >> >> > > > > > > > > had in FAB) where we have pre-release packages depending on each other
> >> >> > > > > > > > >
> >> >> > > > > > > > > * different resolution strategies including --resolution=lowest which will
> >> >> > > > > > > > > finally allow us to see whether airflow's lower bounds are still holding
> >> >> > > > > > > > > (i.e. - will our test still pass if we use the lowest supported version of
> >> >> > > > > > > > > our dependencies?  this is something i wanted to do for quite some time and
> >> >> > > > > > > > > recorded an issue for that - https://github.com/apache/airflow/issues/35549
> >> >> > > > > > > > > but lack of tooling support made it a wish, with `--resolution=lowest` it
> >> >> > > > > > > > > seems like super-easy thing to do.
> >> >> > > > > > > > >
> >> >> > > > > > > > > * It is said to be many, many times faster - with better caching and
> >> >> > > > > > > > > resolution speeds (similarly like with ruff they claim orders of magnitude
> >> >> > > > > > > > > speedups in a number of cases). We can likely make very good use of it and
> >> >> > > > > > > > > speed up some parts of our CI workflow significantly.
> >> >> > > > > > > > >
> >> >> > > > > > > > > I might likely do some experimenting with uv in our toolchain, but wanted
> >> >> > > > > > > > > to make sure we are all aware of it - and ask if someone has something
> >> >> > > > > > > > > against it (and maybe someone would like to do some work there trying it
> >> >> > > > > > > > > out - I will be happy to guide others with the dev/tooling mindset and
> >> >> > > > > > > > > incline to do some changes there/review PRs and cooperate on testing those
> >> >> > > > > > > > > things.
> >> >> > > > > > > > >
> >> >> > > > > > > > > It's not a user-facing change, and I do not think we want to get rid of
> >> >> > > > > > > > > `pip` as an installation tool in general (in our images and user facing
> >> >> > > > > > > > > side) - it's mostly an internal CI tooling improvement I am thinking of.
> >> >> > > > > > > > > Maybe at some point in time we can recommend it also for development
> >> >> > > > > > > > > workflows, and maybe someday it will gain enough popularity to think about
> >> >> > > > > > > > > recommending it to our users, but definitely not now nor in even mid-term
> >> >> > > > > > > > > future.
> >> >> > > > > > > > >
> >> >> > > > > > > > > Let me know what you think.
> >> >> > > > > > > > >
> >> >> > > > > > > > > Repo here: https://github.com/astral-sh/uv
> >> >> > > > > > > > >
> >> >> > > > > > > > > J.
> >> >> > > > > > > > >
> >> >> > > > > > > >
> >> >> > > > > > >
> >> >> > > > > >
> >> >> > > > >
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> >
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@airflow.apache.org
> For additional commands, e-mail: dev-help@airflow.apache.org
>

RE: [DISCUSS] Considering trying out uv for our CI workflows

Posted by "Scheffler Jens (XC-AS/EAE-ADA-T)" <Je...@de.bosch.com.INVALID>.
@Jarek, I have had no time to review the PR yet.
If the Docker image is ~400 MB smaller, I suspect the contents differ. Were you able to dump a file list to inspect the difference?
If not, I would propose doing that in the PR to understand "why". If it is only cache files, then in general it would make sense to check whether pip/uv leave "cache/garbage" behind anyway, which we should clean up to shrink the images.
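
One way to answer the "why" is to dump a file list from each image and compare the size totals per directory. The sketch below is only an illustration, not something taken from the PR: it assumes each listing was produced inside the image with something like `find / -xdev -type f -printf '%s\t%p\n'` (GNU find), and the two sample listings and the helper names (`parse_listing`, `size_delta_by_prefix`) are made up.

```python
from collections import defaultdict


def parse_listing(text):
    """Parse 'size<TAB>path' lines into a {path: size_in_bytes} dict."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        size, path = line.split("\t", 1)
        sizes[path] = int(size)
    return sizes


def size_delta_by_prefix(old, new, depth=3):
    """Sum the per-file size change (new - old), grouped by the first
    `depth` path components, and return the non-zero groups sorted by change."""
    delta = defaultdict(int)
    for path in old.keys() | new.keys():
        prefix = "/".join(path.split("/")[: depth + 1])
        delta[prefix] += new.get(path, 0) - old.get(path, 0)
    return sorted(
        ((prefix, change) for prefix, change in delta.items() if change != 0),
        key=lambda item: item[1],
    )


# Made-up sample listings; real ones would come from the two images.
pip_listing = (
    "1000\t/root/.cache/pip/http/a\n"
    "500\t/usr/local/lib/site-packages/x.py\n"
)
uv_listing = (
    "500\t/usr/local/lib/site-packages/x.py\n"
    "200\t/usr/local/bin/uv\n"
)

report = size_delta_by_prefix(parse_listing(pip_listing), parse_listing(uv_listing))
for prefix, change in report:
    print(f"{change:+d}\t{prefix}")
```

Directories that shrink the most appear first, so leftover cache directories (if any) would show up at the top of the report.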

Best regards

Jens Scheffler

Alliance: Enabler - Tech Lead (XC-AS/EAE-ADA-T)
Robert Bosch GmbH | Hessbruehlstraße 21 | 70565 Stuttgart-Vaihingen | GERMANY | http://www.bosch.com/
Tel. +49 711 811-91508 | Mobil +49 160 90417410 | Jens.Scheffler@de.bosch.com


-----Original Message-----
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Monday, 26 February 2024 08:54
To: Amogh Desai <am...@gmail.com>
Cc: dev@airflow.apache.org
Subject: Re: [DISCUSS] Considering trying out uv for our CI workflows

Yep. It all looks good now, and I re-ran the last intermittently failing job.
The final effect:

* CI image (uncompressed) with uv is slightly smaller (3.5 GB vs. 3.9 GB)
* regular code-only PRs: same time to incrementally build the image, ~1m
* adding/modifying a dependency in the PR: 12m -> 6m: 50% improvement
* removing a dependency/rebuilding things from scratch: 27m -> 12m: 55% improvement

Depending on the speed of your network, also locally rebuilding your image should be generally much faster in all cases once we merge it and update cache.

Also, the flaky test turned out to be really just a "sometimes running much slower than expected" case. I increased the number of retries, gave the test a bit more time, and added a better message, so hopefully the flakiness will stop now.

I think it's a no-brainer :).
https://github.com/apache/airflow/pull/37692 waiting for reviews

J.



On Mon, Feb 26, 2024 at 4:50 AM Amogh Desai <am...@gmail.com>
wrote:

> Thanks for the superb investigation and effort @Jarek Potiuk
> <ja...@potiuk.com>!
>
> I quite like the performance improvement numbers uv brings in compared
> to pip.
> I see no reason not to switch to UV in prod images as well.
>
> I will take a look at the pull request soon.
>
> Thanks & Regards,
> Amogh Desai
>
> On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> I think I will get it green finally:
>> https://github.com/apache/airflow/pull/37692.
>>
>> I know where the test flakiness came from. Generally speaking, it
>> turned out that there is no free lunch and - of course - the cache from
>> uv increased our CI image size significantly (by around 1.5 GB), which
>> caused much slower test execution (and the test became more flaky
>> because of that). So after looking at that I decided to disable the
>> cache - it's definitely not worth increasing the size of our
>> images that much. We still get a 50% - 60% improvement (not the
>> 60% - 70% we had with the cache), which is still significant
>> enough. Without the cache the "upgrade" scenario takes ~40s (so no
>> longer 4s) instead of 7m with pip - so this is still a huge improvement
>> (and the image is even smaller than the one built with `pip`).
>>
>>
>> J,
>>
>>
>>
>> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > Some more findings.
>> >
>> > Overall, I can confirm that with `uv` we will get a significant
>> > improvement (60 - 70%) in image build times. This will affect both CI
>> > and local `breeze` rebuilds.
>> >
>> > I am getting closer to a mergeable state. I switched to
>> > https://github.com/apache/airflow/pull/37692 to test "upgrade to latest
>> > dependencies" workflow and canary build impact.
>> >
>> > The PR is getting greener and greener. I have a few last things to
>> > address.
>> >
>> > An interesting story is that a flaky test in CLI
>> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
>> > that we had is suddenly significantly more flaky, so I will have to take
>> > a look at how to finally remove the flakiness from it.
>> > This is a good thing, because this test had been flaky for quite a
>> > while but it was very difficult to reproduce, and it seems that for
>> > some reason it is now much easier to reproduce (which also means we
>> > will know when we fix it).
>> >
>> > Looking at the stats, it seems that a lot (but not all) of the speed
>> > improvement might come from parallel downloading of dependencies -
>> > which is also in the works for pip
>> > (https://github.com/pypa/pip/pull/12388) - though it's not clear how
>> > much it will help, as the Batch Downloader in pip is involved only after
>> > resolution. We will see after it is implemented whether it changes things.
>> >
>> > I am also now switching PROD builds to use uv to see how much we
>> > can save, but I am leaving `pip` as the default for releases and users.
>> > The only difference is in CI - I've added a separate step for the `pip`
>> > PROD build to compare and to make sure it's running fine in CI.
>> >
>> > The numbers:
>> >
>> > * for the "upgrade to newer dependencies" scenario - uv is WAY faster -
>> > as I thought. In the "current" state of main it is: ~7m pip, 5s (!) uv.
>> > Here uv's caching makes a huge difference, and while there is some
>> > work in `pip` and resolvelib (looking at PRs/issues), it's going to be
>> > quite some time before we get similar results from pip. "Upgrade" builds
>> > will eventually go down from 12m to 5m - which is a major improvement -
>> > especially for the elapsed time of CI builds.
>> >
>> > * from what I see, package installation is super-fast in uv.
>> > Installing 614 packages takes (wait for it) 1s (!) where I saw it taking
>> > way over a minute with `pip`. This will be hard to beat, I think, with
>> > Python vs. Rust.
>> >
>> > Some notes about differences I saw:
>> >
>> > pip and uv lead to slightly different resolutions when upgrading.
>> > This is not a surprise, because different heuristics are involved (the
>> > resolution problem is NP-complete
>> > (https://research.swtch.com/version-sat) and it's very inefficient to
>> > run the full resolution, so pip and uv each take a slightly different
>> > approach to shortcuts and to limiting the possible space of solutions).
>> > I've done a few PRs limiting (lower-bounding) some dependencies to
>> > bring them closer, but in the end what we get is "correct" in both
>> > cases - I continue running `pip check` to make sure that whatever uv
>> > finds is also correct according to `pip`. Nothing really major
>> > there. There were literally a few cases that required some manual
>> > adjustments. Nothing unmanageable in the future either; I was doing
>> > similar tweaks with `pip` as well to help with the resolution.
>> >
>> > Example of differences (left/first is pip, right/second is uv):
>> >
>> > < importlib-resources==5.13.0
>> > ---
>> > > importlib-resources==6.1.1
>> >
>> > vs.
>> >
>> > < pycountry==23.12.11
>> > ---
>> > > pycountry==22.3.5
>> >
>> > It means that with `uv` we have a newer version of
>> > importlib_resources but an older version of pycountry.
>> >
>> > This one I will handle by bumping pycountry in the facebook
>> > provider to > 23.12, as the old version is 1.5 years old.
>> >
>> > J.
>> >
>> >
>> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr>
>> wrote:
>> >
>> >> That's impressive! I love this tool, not only for reducing CI time
>> >> but also for saving the environment.
>> >> Some of the previous improvements were to further parallelize CI
>> >> jobs to complete the CI faster, but this tool will help reduce the
>> >> overall time.
>> >>
>> >> Big +1
>> >>
>> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >>
>> >> > Hello here.
>> >> >
>> >> > I have a PR https://github.com/apache/airflow/pull/37683 that
>> >> > implements:
>> >> >
>> >> > * ability to choose either uv or pip when building our images
>> >> > * CI images are built with uv by default (but you can use
>> >> > `--no-use-uv` as a flag to switch back to `pip`)
>> >> > * PROD images are built with pip by default (but you can use
>> >> > `--use-uv` as a flag to switch to uv)
>> >> >
>> >> > The preliminary tests indeed show that uv not only has a much
>> >> > faster baseline, but also that its use of caching fits extremely
>> >> > well into our strategy of building images, and we will get huge
>> >> > improvements in our CI build timing when using uv.
>> >> >
>> >> > Just for context - our CI images, when built, use a
>> >> > caching strategy to optimise for:
>> >> >
>> >> > 1) fast building when there are no changes (around 1 minute to
>> >> > build with pip),
>> >> > 2) slower building when someone adds or modifies a non-conflicting
>> >> > dependency (around 8 minutes to build, out of which ~6m is pip
>> >> > resolution and installation),
>> >> > 3) much longer build time when there are conflicting
>> >> > dependencies, or when we change the Dockerfile or scripts, or when
>> >> > the Python base image changes (around 27 minutes to build, out of
>> >> > which pip resolving is ~20m).
>> >> >
>> >> > Those are all `pip` numbers. Currently `pip` does not use
>> >> > resolution caching between the steps. A comparison of some basic
>> >> > installation steps from initial tests shows that uv is way faster:
>> >> >
>> >> > * Resolving and installing airflow with [devel-ci] (610
>> >> > dependencies): pip ~6m, uv ~1m 30s
>> >> > * Re-resolving and reinstalling [devel-ci] using the local
>> >> > pyproject.toml: pip ~4m (cache is not used), uv ~4s (!!!!) - because
>> >> > the cache is used in this case.
>> >> >
>> >> > I have not yet properly tested (but I will once they happen) --eager
>> >> > upgrades of dependencies (with pip it varies a lot, but it's often in
>> >> > the range of 10 minutes) - I expect it to take no more than 2-3
>> >> > minutes with uv.
>> >> >
>> >> > So overall it looks like we are looking at these improvements:
>> >> >
>> >> > 1) Regular builds with no dependency changes: pip ~1m, uv ~1m
>> >> > (because we are using docker layer caching, and pip resolution
>> >> > and installation are not used at all)
>> >> > 2) Updating dependencies: 8m with pip will probably go down with
>> >> > uv to ~3m 30s => 60% improvement, and in many cases to ~2.5m when
>> >> > there are no remote changes and the cache is used (70% improvement)
>> >> > 3) Re-resolving and reinstalling everything: 27m will probably
>> >> > go down with uv to ~9m => 67% improvement.
>> >> >
>> >> > If those numbers hold and the resolution quality is
>> >> > comparable to `pip` - then, well, it's definitely worth it - and the
>> >> > numbers are very close to what the `uv` authors claimed.
>> >> >
>> >> > I am impressed :)
>> >> >
>> >> > J.
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <
>> amoghdesai.oss@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > I agree with Niko here.
>> >> > >
>> >> > > If someone is willing to give it a try, we should enable it
>> >> > > experimentally and give it a stint for a couple of weeks. If we see
>> >> > > significant results, we can adopt it.
>> >> > >
>> >> > > Thanks & Regards,
>> >> > > Amogh Desai
>> >> > >
>> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
>> >> > <onikolas@amazon.com.invalid
>> >> > > >
>> >> > > wrote:
>> >> > >
>> >> > > > The Astral folks also seem very focused on it being a
>> >> > > > drop-in/compliant replacement for pip. So I think it's definitely
>> >> > > > worth dropping it in and seeing if we get the expected performance
>> >> > > > improvements. If tests still pass and user-facing constraints and
>> >> > > > install instructions remain unchanged, I don't see why not, if
>> >> > > > someone is willing to spend the time on it. Never mind the extra
>> >> > > > features it would give us (I, like others, am also very excited
>> >> > > > about the --resolution=lowest ability).
>> >> > > >
>> >> > > > ________________________________
>> >> > > > From: Andrey Anshin <an...@taragol.is>
>> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> >> > > > To: dev@airflow.apache.org
>> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS]
>> >> > > > Considering
>> >> trying
>> >> > > > out uv for our CI workflows
>> >> > > >
>> >> > > > > I share Andrey's skepticism. It's just yet another tool
>> >> > > > > which has an unclear development strategy.
>> >> > > >
>> >> > > > My point was more a matter of presentation. If someone
>> >> > > > told you "this is a new tool, a killer of previous tools" then you
>> >> > > > might think "Yeah...yeah...yeah.. yet another replacement for tool
>> >> > > > X... not really interesting". On the other hand, if someone told
>> >> > > > you which cases it might solve, then this might change your mind.
>> >> > > >
>> >> > > > Especially the promising `--resolution=lowest` option. We
>> >> > > > always want to test with minimal dependencies, because we are not
>> >> > > > sure that things work with pretty old dependencies, and recently
>> >> > > > I started to work on a POC to collect minimal versions for Airflow
>> >> > > > and the Providers. And at the moment when I had almost finished it,
>> >> > > > uv was released. Well, sometimes it is better to wait a bit, and
>> >> > > > maybe someone will invent the same solution 😁 and you don't have
>> >> > > > to spend your personal time.
>> >> > > >
>> >> > > > So for a POC I'm on it. We still need `pip`, and need to validate
>> >> > > > some stuff with pip, because it is the only officially supported
>> >> > > > way to install Airflow, but if something could be improved in the
>> >> > > > CI then I'm on it. In most cases it would be hidden behind Breeze,
>> >> > > > and many contributors might not even notice that something
>> >> > > > changed.
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk
>> >> > > > <ja...@potiuk.com>
>> >> wrote:
>> >> > > >
>> >> > > > > Actually - if you read that blog post, the strategy is
>> >> > > > > clear - they aim to create comprehensive packaging tooling, and
>> >> > > > > the improvements are measured (80-100 times, they claim, when
>> >> > > > > using caching - they (unlike pip) use a lot of local caching,
>> >> > > > > including for resolving dependencies).
>> >> > > > >
>> >> > > > > So I think both arguments are not valid if you ask me.
>> >> > > > >
>> >> > > > > On Tue, 20 Feb 2024 at 02:37, Alexander Shorin
>> >> > > > > <kxepal@apache.org> wrote:
>> >> > > > >
>> >> > > > > > I share Andrey's skepticism. It's just yet another tool
>> >> > > > > > which has an unclear development strategy. Should you make it
>> >> > > > > > a free testing suite? What would the project receive in
>> >> > > > > > exchange? A lot of words about being faster, but how much? Are
>> >> > > > > > these milliseconds worth changing the stable tool for a new
>> >> > > > > > one? And will it notably improve something?
>> >> > > > > >
>> >> > > > > > I think it's worth trying it just for fun and providing
>> >> > > > > > feedback, but it'll have a long road to travel to become as
>> >> > > > > > stable as pip.
>> >> > > > > >
>> >> > > > > > --
>> >> > > > > > ,,,^..^,,,
>> >> > > > > >
>> >> > > > > >
>> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <
>> jarek@potiuk.com>
>> >> > > wrote:
>> >> > > > > >
>> >> > > > > > > My opinion:
>> >> > > > > > >
>> >> > > > > > > I think there is a place for a number of such tools. For a
>> >> > > > > > > long time the packaging team and the `pip` team have been
>> >> > > > > > > working not only on the `pip` implementation but also (and
>> >> > > > > > > most importantly) on making sure that what `pip` does is the
>> >> > > > > > > beacon of standardisation of packaging APIs and PEPs. It will
>> >> > > > > > > never, IMHO, have a lot of the fancy features that other tools
>> >> > > > > > > might provide (like the ones I mentioned). It will always be
>> >> > > > > > > there to provide the robust and solid CLI to run all packaging
>> >> > > > > > > things, but there are plenty of opportunities to provide
>> >> > > > > > > improved or modified, or more (or less) opinionated ways of
>> >> > > > > > > doing things that address some cases that the `pip` team
>> >> > > > > > > simply will not be able or willing to handle, preferring a
>> >> > > > > > > "pure" standard approach vs. implementing all the optional
>> >> > > > > > > things. For example, the way pre-releases are handled can be
>> >> > > > > > > improved to be more selective. The PEP describing it gives the
>> >> > > > > > > tools an option to add more fancy behaviours (some of which we
>> >> > > > > > > could find useful in our CI tooling). Should `pip` implement
>> >> > > > > > > those - I don't think so. It would distract maintainers from
>> >> > > > > > > other more important things. It is quite ok to use other
>> >> > > > > > > tooling in places like our CI, where they do some parts of the
>> >> > > > > > > installation better.
>> >> > > > > > >
>> >> > > > > > > For me `pip` is going more in the direction of a `usable
>> >> > > > > > > reference implementation of a package installer` - any
>> >> > > > > > > standard/PEP will not matter if `pip` does not implement it.
>> >> > > > > > > But others might go in different directions and implement some
>> >> > > > > > > less popular features and do it better, faster, with greater
>> >> > > > > > > flexibility. IMHO it's a win-win.
>> >> > > > > > >
>> >> > > > > > > J.
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
>> >> > > > > andrey.anshin@taragol.is
>> >> > > > > > >
>> >> > > > > > > wrote:
>> >> > > > > > >
>> >> > > > > > > > Yesterday my friend shared that tool with me, and I've
>> >> > > > > > > > been told that most probably it would be a niche tool.
>> >> > > > > > > > I've been told "who needs yet another installer which
>> >> > > > > > > > claims to resolve all your problems".
>> >> > > > > > > > I guess I was wrong?
>> >> > > > > > > >
>> >> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <
>> >> jarek@potiuk.com>
>> >> > > > wrote:
>> >> > > > > > > >
>> >> > > > > > > > > Hey everyone,
>> >> > > > > > > > >
>> >> > > > > > > > > Few days ago the ruff creators have released a new
>> >> > > > > > > > > tool uv - which is an extremely fast (written in rust)
>> >> > > > > > > > > and fully featured tool generally fully compatible
>> >> > > > > > > > > with `pip`.
>> >> > > > > > > > >
>> >> > > > > > > > > Blog post here: https://astral.sh/blog/uv
>> >> > > > > > > > >
>> >> > > > > > > > > It looks like It has a number of things that would
>> >> > > > > > > > > make our CI cases and tooling quite a bit faster and
>> >> > > > > > > > > better including a few things that I have implemented
>> >> > > > > > > > > some workarounds for and some that I have not
>> >> > > > > > > > > implemented because `pip` had no good solution.
>> >> > > > > > > > >
>> >> > > > > > > > > I looked at the docs and it solves some problems
>> >> > > > > > > > > that are currently difficult or impossible to handle
>> >> > > > > > > > > with `pip`:
>> >> > > > > > > > >
>> >> > > > > > > > > * ability to use overrides (which are constraints
>> >> > > > > > > > > on steroids - allowing to override limits specified by
>> >> > > > > > > > > the packages - this will be very useful to better
>> >> > > > > > > > > handle our cases with "chicken-egg" providers (for
>> >> > > > > > > > > example like we had in FAB) where we have pre-release
>> >> > > > > > > > > packages depending on each other
>> >> > > > > > > > >
>> >> > > > > > > > > * different resolution strategies including
>> >> > > > > > > > > --resolution=lowest which will finally allow us to see
>> >> > > > > > > > > whether airflow's lower bounds are still holding
>> >> > > > > > > > > (i.e. - will our test still pass if we use the lowest
>> >> > > > > > > > > supported version of our dependencies?  this is
>> >> > > > > > > > > something i wanted to do for quite some time and
>> >> > > > > > > > > recorded an issue for that -
>> >> > > > > > > > > https://github.com/apache/airflow/issues/35549
>> >> > > > > > > > > but lack of tooling support made it a wish, with
>> >> > > > > > > > > `--resolution=lowest` it seems like super-easy thing
>> >> > > > > > > > > to do.
>> >> > > > > > > > >
>> >> > > > > > > > > * It is said to be many, many times faster - with
>> >> > > > > > > > > better caching and resolution speeds (similarly like
>> >> > > > > > > > > with ruff they claim orders of magnitude speedups in a
>> >> > > > > > > > > number of cases). We can likely make very good use of
>> >> > > > > > > > > it and speed up some parts of our CI workflow
>> >> > > > > > > > > significantly.
>> >> > > > > > > > >
>> >> > > > > > > > > I might likely do some experimenting with uv in
>> >> > > > > > > > > our toolchain, but wanted to make sure we are all
>> >> > > > > > > > > aware of it - and ask if someone has something
>> >> > > > > > > > > against it (and maybe someone would like to do
>> >> > > > > > > > > some work there trying it out - I will be happy to
>> >> > > > > > > > > guide others with the dev/tooling mindset and
>> >> > > > > > > > > incline to do some changes there/review PRs and
>> >> > > > > > > > > cooperate on testing those things.
>> >> > > > > > > > >
>> >> > > > > > > > > It's not a user-facing change, and I do not think
>> >> > > > > > > > > we want to get rid of `pip` as an installation tool
>> >> > > > > > > > > in general (in our images and user facing side) -
>> >> > > > > > > > > it's mostly an internal CI tooling improvement I am
>> >> > > > > > > > > thinking of. Maybe at some point in time we can
>> >> > > > > > > > > recommend it also for development workflows, and
>> >> > > > > > > > > maybe someday it will gain enough popularity to
>> >> > > > > > > > > think about recommending it to our users, but
>> >> > > > > > > > > definitely not now nor in even mid-term future.
>> >> > > > > > > > >
>> >> > > > > > > > > Let me know what you think.
>> >> > > > > > > > >
>> >> > > > > > > > > Repo here: https://github.com/astral-sh/uv
>> >> > > > > > > > >
>> >> > > > > > > > > J.
>> >> > > > > > > > >
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >
>>
>


Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Yep. It all looks good now, and I re-ran the last intermittently failing job.
The final effect:

* CI image (uncompressed) with uv is slightly smaller (3.5 GB vs. 3.9 GB)
* regular code-only PRs: same time to incrementally build the image, ~1m
* adding/modifying a dependency in the PR: 12m -> 6m: 50% improvement
* removing a dependency/rebuilding things from scratch: 27m -> 12m: 55%
improvement
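
The percentages in the list follow directly from the before/after build times. A minimal sketch of the arithmetic, using only the rounded minute values quoted there (the helper name `improvement_percent` is just illustrative):

```python
def improvement_percent(before_minutes, after_minutes):
    """Return the relative reduction in build time, as a percentage."""
    return (before_minutes - after_minutes) / before_minutes * 100


# (before, after) build times in minutes, as quoted in the list.
scenarios = {
    "adding/modifying a dependency": (12, 6),
    "rebuilding from scratch": (27, 12),
}

for name, (before, after) in scenarios.items():
    reduction = improvement_percent(before, after)
    print(f"{name}: {before}m -> {after}m = {reduction:.1f}% less build time")
```

With these inputs the two scenarios come out at 50.0% and 55.6%, which matches the rounded 50% and 55% figures.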

Depending on the speed of your network, also locally rebuilding your image
should be generally much faster in all cases once we merge it and update
cache.

Also, the flaky test turned out to be really just a "sometimes running much
slower than expected" case. I increased the number of retries, gave the
test a bit more time, and added a better message, so hopefully the
flakiness will stop now.
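
As an illustration of the "more retries, more time, better message" idea, here is a minimal sketch of a polling helper. It is not the actual change made to the test in the PR; `wait_until`, its parameters, and the fake `slow_start` condition are hypothetical.

```python
import time


def wait_until(condition, retries=10, delay=0.5, description="condition"):
    """Poll `condition` until it returns True; return the attempt number.

    Raises TimeoutError with an informative message if every attempt fails.
    """
    for attempt in range(1, retries + 1):
        if condition():
            return attempt
        time.sleep(delay)
    raise TimeoutError(
        f"{description} not met after {retries} attempts "
        f"({retries * delay:.1f}s in total)"
    )


# Fake condition standing in for a slow-starting process:
# it only succeeds on the third check.
calls = {"count": 0}


def slow_start():
    calls["count"] += 1
    return calls["count"] >= 3


attempts = wait_until(slow_start, retries=5, delay=0.01, description="webserver start")
print(f"ready after {attempts} attempts")
```

Raising `retries` or `delay` gives a slow run more time, and the error message states what was being waited for and for how long, which makes a genuine failure easier to tell apart from a slow start.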

I think it's a no-brainer :).
https://github.com/apache/airflow/pull/37692 is waiting for reviews.

J.



On Mon, Feb 26, 2024 at 4:50 AM Amogh Desai <am...@gmail.com>
wrote:

> Thanks for the superb investigation and effort @Jarek Potiuk
> <ja...@potiuk.com>!
>
> I quite like the performance improvement numbers uv brings in compared to
> pip.
> I see no reason not to switch to UV in prod images as well.
>
> I will take a look at the pull request soon.
>
> Thanks & Regards,
> Amogh Desai
>
> On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> I think I will get it green finally:
>> https://github.com/apache/airflow/pull/37692.
>>
>> I know where the test flakiness came from. Generally speaking, it turned
>> out that there is no free lunch and - of course - the cache from uv
>> increased our CI image size significantly (by around 1.5G) - and it caused
>> much slower test execution (and the test became more flaky because of
>> that). So after looking at that I decided to disable the cache - it's
>> definitely not worth increasing the size of our images that much. We still
>> have significant improvements (50% - 60%, not the 60% - 70% we had with
>> the cache), and that is still significant enough. Without the cache the
>> "upgrade" scenario is ~ 40s (so no longer 4s) instead of 7m with pip - so
>> this is still a huge improvement (and the image size is even smaller than
>> the one with `pip`).
>>
>>
>> J,
>>
>>
>>
>> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > Some more findings.
>> >
>> > Overall, I can confirm that with `uv` we will get a significant
>> > improvement - 60 - 70% - in image build times. This will impact not only
>> > CI but also `breeze` local rebuilds.
>> >
>> > I am getting closer to a mergeable state. I switched to
>> > https://github.com/apache/airflow/pull/37692 to test "upgrade to latest
>> > dependencies" workflow and canary build impact.
>> >
>> > The PR is getting greener and greener. I have a few last things to
>> > address.
>> >
>> > An interesting story is that a flaky test in CLI
>> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
>> > we had is suddenly significantly more flaky, so I will have to take a
>> > look at how to finally remove the flakiness from it.
>> > This is a good thing because this test had been flaky for quite a while
>> > but it was very difficult to reproduce, and it seems that for some reason
>> > it is now much easier to reproduce (which also means we will know when we
>> > fix it).
>> >
>> > Looking at stats it seems that a lot (but not all) of the speed
>> > improvement might come from parallel downloading of dependencies -
>> > which is also in the works for pip (
>> > https://github.com/pypa/pip/pull/12388) - though it's not clear how much
>> > it will help, as the Batch Downloader in pip is involved only after
>> > resolution. We will see after it is implemented if it changes things.
>> >
>> > I am also now switching PROD builds to use uv to see how much we can
>> > save, but I leave `pip` as the default for releases and users; the only
>> > difference is CI - I've added a separate step for the `pip` PROD build
>> > to compare and to make sure it's running fine in CI.
>> >
>> > The numbers:
>> >
>> > * for the "upgrade to newer dependencies" scenario - uv is WAY faster -
>> > as I thought. In the "current" state of main it is: ~7m pip, 5s (!) uv.
>> > Here caching of uv makes a huge difference, and while there is some
>> > work in `pip` and resolvelib (looking at PRs/issues), it's going to be
>> > quite some time before pip gets similar results. "Upgrade" builds will
>> > eventually go down from 12m to 5m - which is a major improvement -
>> > especially for the elapsed time of CI builds.
>> >
>> > * from what I see, package installation is super-fast in uv. Installing
>> > 614 packages takes (wait for it) 1s (!) where I saw it taking way over a
>> > minute with `pip`. This will be hard to beat, I think, with Python vs.
>> > Rust.
>> >
>> > Some notes about differences I saw:
>> >
>> > PIP and UV lead to slightly different resolutions when upgrading. This
>> > is not a surprise because different heuristics are involved: the
>> > resolution problem is NP-complete (https://research.swtch.com/version-sat)
>> > and it's very inefficient to run the full resolution, so pip and uv each
>> > take a slightly different approach to shortcuts and to limiting the
>> > possible space of solutions. I've done a few PRs limiting
>> > (lower-bounding) some dependencies to bring them closer, but in the end
>> > what we get is "correct" in both cases - I continue running `pip check`
>> > to make sure that whatever UV finds is also correct according to `pip`.
>> > Nothing really major there. There were literally a few cases that
>> > required some manual adjustments. Nothing unmanageable in the future
>> > either; I was doing similar tweaks with `pip` as well to help with the
>> > resolution.
>> >
>> > Example of differences (left/first is pip, right/second is uv):
>> >
>> > < importlib-resources==5.13.0
>> > ---
>> > > importlib-resources==6.1.1
>> >
>> > vs.
>> >
>> > < pycountry==23.12.11
>> > ---
>> > > pycountry==22.3.5
>> >
>> > It means that with `uv` we have a newer version of importlib_resources
>> > but an older version of pycountry.
>> >
>> > This one I will handle by bumping pycountry in the facebook provider
>> > to > 23.12, as the old version is 1.5 years old.
>> >
>> > J.
>> >
>> >
>> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr>
>> wrote:
>> >
>> >> That's impressive! I love this tool, not only for reducing CI time but
>> >> also for saving the environment.
>> >> Some of the previous improvements were to further parallelize CI jobs
>> >> to complete the CI faster, but this tool will help reduce the overall
>> >> time.
>> >>
>> >> Big +1
>> >>
>> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >>
>> >> > Hello here.
>> >> >
>> >> > I have a PR https://github.com/apache/airflow/pull/37683 that
>> >> > implements:
>> >> >
>> >> > * ability to choose either uv or PIP when building our images
>> >> > * CI images are built with uv by default (but you can use
>> >> > `--no-use-uv` as a flag and switch back to `pip`)
>> >> > * PROD images are built with pip by default (but you can use
>> >> > `--use-uv` as a flag and switch to uv)
>> >> >
>> >> > The preliminary tests indeed show that uv not only has a much faster
>> >> > baseline, but also that its use of caching fits extremely well into
>> >> > our strategy of building images, and we will get huge improvements in
>> >> > our CI build timing when using uv.
>> >> >
>> >> > Just for the context - our CI images when built are using a caching
>> >> > strategy to optimise for:
>> >> >
>> >> > 1) fast building when there are no changes (around 1 minute to build
>> >> > with pip),
>> >> > 2) slower building when someone adds or modifies a non-conflicting
>> >> > dependency (around 8 minutes to build, out of which ~ 6m is pip
>> >> > resolution and installation),
>> >> > 3) much longer build time when there are conflicting dependencies or
>> >> > when we change the Dockerfile or scripts or when the Python base image
>> >> > changes (around 27 minutes to build, out of which pip resolving is
>> >> > ~ 20m).
>> >> >
>> >> > Those are all `pip` numbers. Currently `pip` does not use resolution
>> >> > caching between the steps. Comparison of some basic installation
>> >> > steps from initial tests shows that UV is way faster:
>> >> >
>> >> > * Resolving and installing airflow with [devel-ci] (610
>> >> > dependencies): pip ~ 6m, uv ~ 1m 30s
>> >> > * Re-resolving and reinstalling [devel-ci] using local
>> >> > pyproject.toml: pip ~ 4m (cache is not used), uv ~ 4s (!!!!) - because
>> >> > the cache is used in this case.
>> >> >
>> >> > I have not yet properly tested (but I will once they happen) --eager
>> >> > upgrades of dependencies (with pip it very much depends, but it's
>> >> > often in the range of 10 minutes) - I expect them not to take more
>> >> > than 2-3 minutes with uv.
>> >> >
>> >> > So overall it looks like we are looking at these improvements:
>> >> >
>> >> > 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
>> >> > (because we are using docker layer caching, and pip resolution and
>> >> > installation are not used at all)
>> >> > 2) Updating dependencies: 8m with pip will probably go down with uv
>> >> > to ~ 3m 30s => 60% improvement, and in many cases to ~ 2.5m when there
>> >> > are no remote changes and the cache is used (70% improvement)
>> >> > 3) Re-resolving and reinstalling everything: 27m will probably go
>> >> > down with uv to ~ 9m => 67% improvement.
>> >> >
>> >> > If those numbers hold and the resolution quality is comparable to
>> >> > `pip` - then well, it's definitely worth it - and the numbers are
>> >> > very close to what the `uv` authors claimed.
>> >> >
>> >> > I am impressed :)
>> >> >
>> >> > J.
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai
>> >> > <amoghdesai.oss@gmail.com> wrote:
>> >> >
>> >> > > I agree with Niko here.
>> >> > >
>> >> > > If someone is willing to give it a try, we should enable it
>> >> > > experimentally and give it a stint for a couple of weeks. If we see
>> >> > > significant results, we can adopt it.
>> >> > >
>> >> > > Thanks & Regards,
>> >> > > Amogh Desai
>> >> > >
>> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
>> >> > > <onikolas@amazon.com.invalid> wrote:
>> >> > >
>> >> > > > The Astral folks also seem very focused on it being a
>> >> > > > drop-in/compliant replacement for pip. So I think it's definitely
>> >> > > > worth dropping it in and seeing if we get the expected performance
>> >> > > > improvements. If tests still pass and user-facing constraints and
>> >> > > > install instructions remain unchanged, I don't see why not, if
>> >> > > > someone is willing to spend the time on it. Never mind the extra
>> >> > > > features it would give us (I, like others, am also very excited
>> >> > > > about the --resolution=lowest ability).
>> >> > > >
>> >> > > > ________________________________
>> >> > > > From: Andrey Anshin <an...@taragol.is>
>> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> >> > > > To: dev@airflow.apache.org
>> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering
>> >> > > > trying out uv for our CI workflows
>> >> > > >
>> >> > > >
>> >> > > > > I share Andrey's skepticism. It's just yet another tool which
>> >> > > > > has an unclear development strategy.
>> >> > > >
>> >> > > > My point was more about a matter of presentation. If someone told
>> >> > > > you "this is a new tool, like a killer of previous tools" then you
>> >> > > > might think "Yeah...yeah...yeah.. yet another replacement for tool
>> >> > > > X... not really interesting". On the other hand, if someone told
>> >> > > > you which cases it might solve, then this might be a mind changer.
>> >> > > >
>> >> > > > Especially the promising `--resolution=lowest` option. We always
>> >> > > > want to test something with minimal dependencies because we are
>> >> > > > not sure that it works with pretty old dependencies, and recently
>> >> > > > I started to work on a POC to collect minimal versions for Airflow
>> >> > > > and Providers. And at the moment when I had almost finished it,
>> >> > > > uv was released. Well, sometimes it is better to wait a bit and
>> >> > > > maybe someone will invent the same solution 😁 and you don't have
>> >> > > > to spend your personal time.
>> >> > > >
>> >> > > > So as a POC I'm on it. We still need `pip` and need to validate
>> >> > > > some stuff with pip because it is the only officially supported
>> >> > > > way to install Airflow, but if something can be improved in the CI
>> >> > > > then I'm on it. In most cases it would be behind Breeze and many
>> >> > > > of the contributors might not even notice that something changed.
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > Actually - if you read that blog post, the strategy is clear -
>> >> > > > > they aim to create comprehensive packaging tooling, and the
>> >> > > > > improvements are measured (80-100 times, they claim, if using
>> >> > > > > caching - they (unlike pip) use a lot of local caching,
>> >> > > > > including for resolving dependencies).
>> >> > > > >
>> >> > > > > So I think both arguments are not valid if you ask me.
>> >> > > > >
>> >> > > > > On Tue, Feb 20, 2024 at 02:37, Alexander Shorin
>> >> > > > > <kxepal@apache.org> wrote:
>> >> > > > >
>> >> > > > > > I share Andrey's skepticism. It's just yet another tool which
>> >> > > > > > has an unclear development strategy. Should you make it a
>> >> > > > > > free testing suite? What would the project receive in
>> >> > > > > > exchange? A lot of words about being faster, but how much?
>> >> > > > > > Are these milliseconds worth changing the stable tool for a
>> >> > > > > > new one? And will it notably improve something?
>> >> > > > > >
>> >> > > > > > I think it's worth trying it just for fun and providing
>> >> > > > > > feedback, but it'll have to travel a long road to become as
>> >> > > > > > stable as pip.
>> >> > > > > >
>> >> > > > > > --
>> >> > > > > > ,,,^..^,,,
>> >> > > > > >
>> >> > > > > >
>> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk
>> >> > > > > > <jarek@potiuk.com> wrote:
>> >> > > > > >
>> >> > > > > > > My opinion:
>> >> > > > > > >
>> >> > > > > > > I think there is a place for a number of such tools. For a
>> >> > > > > > > long time the packaging team and the `pip` team have been
>> >> > > > > > > working not only on the `pip` implementation but also (and
>> >> > > > > > > most importantly) on making sure that what `pip` does is
>> >> > > > > > > the beacon of standardisation of packaging APIs and PEPs.
>> >> > > > > > > It will never IMHO have a lot of the fancy features that
>> >> > > > > > > other tools might provide (like the ones I mentioned). It
>> >> > > > > > > will always be there to provide the robust and solid CLI to
>> >> > > > > > > run all packaging things, but there are plenty of
>> >> > > > > > > opportunities to provide improved or modified, or more (or
>> >> > > > > > > less) opinionated ways of doing things that address some
>> >> > > > > > > cases that the `pip` team simply will not be able or
>> >> > > > > > > willing to handle, preferring a "pure" standard approach
>> >> > > > > > > over implementing all the optional things. For example, the
>> >> > > > > > > way pre-releases are handled can be improved to be more
>> >> > > > > > > selective. The PEP describing it gives the tools an option
>> >> > > > > > > to add more fancy behaviours (some of which we could find
>> >> > > > > > > useful in our CI tooling). Should `pip` implement those? I
>> >> > > > > > > don't think so. It would distract maintainers from other
>> >> > > > > > > more important things. It is quite ok to use other tooling
>> >> > > > > > > in places like our CI, where they do some parts of the
>> >> > > > > > > installation better.
>> >> > > > > > >
>> >> > > > > > > For me `pip` is going more in the direction of a `usable
>> >> > > > > > > reference implementation of a package installer` - any
>> >> > > > > > > standard/PEP will not matter if `pip` does not implement
>> >> > > > > > > it. But others might go in different directions and
>> >> > > > > > > implement some less popular features and do it better,
>> >> > > > > > > faster, with greater flexibility. IMHO it's a win-win.
>> >> > > > > > >
>> >> > > > > > > J.
>> >> > > > > > >
>> >> > > > > > > J.
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin
>> >> > > > > > > <andrey.anshin@taragol.is> wrote:
>> >> > > > > > >
>> >> > > > > > > > Yesterday my friend shared that tool with me and I said
>> >> > > > > > > > that it would most likely be a niche tool. I said "who
>> >> > > > > > > > needs yet another installer which claims to resolve all
>> >> > > > > > > > your problems".
>> >> > > > > > > > I guess I was wrong?
>> >> > > > > > > >
>> >> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk
>> >> > > > > > > > <jarek@potiuk.com> wrote:
>> >> > > > > > > >
>> >> > > > > > > > > Hey everyone,
>> >> > > > > > > > >
>> >> > > > > > > > > A few days ago the ruff creators released a new tool,
>> >> > > > > > > > > uv - an extremely fast (written in Rust) and fully
>> >> > > > > > > > > featured tool, generally fully compatible with `pip`.
>> >> > > > > > > > >
>> >> > > > > > > > > Blog post here: https://astral.sh/blog/uv
>> >> > > > > > > > >
>> >> > > > > > > > > It looks like it has a number of things that would make
>> >> > > > > > > > > our CI cases and tooling quite a bit faster and better,
>> >> > > > > > > > > including a few things that I have implemented some
>> >> > > > > > > > > workarounds for and some that I have not implemented
>> >> > > > > > > > > because `pip` had no good solution.
>> >> > > > > > > > >
>> >> > > > > > > > > I looked at the docs and it solves some problems that
>> >> > > > > > > > > are currently difficult or impossible to handle with
>> >> > > > > > > > > `pip`:
>> >> > > > > > > > >
>> >> > > > > > > > > * ability to use overrides (which are constraints on
>> >> > > > > > > > > steroids - allowing to override limits specified by the
>> >> > > > > > > > > packages) - this will be very useful to better handle
>> >> > > > > > > > > our cases with "chicken-egg" providers (for example
>> >> > > > > > > > > like we had in FAB) where we have pre-release packages
>> >> > > > > > > > > depending on each other
>> >> > > > > > > > >
>> >> > > > > > > > > * different resolution strategies including
>> >> > > > > > > > > --resolution=lowest, which will finally allow us to see
>> >> > > > > > > > > whether airflow's lower bounds are still holding (i.e.
>> >> > > > > > > > > - will our tests still pass if we use the lowest
>> >> > > > > > > > > supported version of our dependencies?). This is
>> >> > > > > > > > > something I wanted to do for quite some time and
>> >> > > > > > > > > recorded an issue for that -
>> >> > > > > > > > > https://github.com/apache/airflow/issues/35549
>> >> > > > > > > > > - but lack of tooling support made it a wish; with
>> >> > > > > > > > > `--resolution=lowest` it seems like a super-easy thing
>> >> > > > > > > > > to do.
>> >> > > > > > > > >
>> >> > > > > > > > > * It is said to be many, many times faster - with
>> >> > > > > > > > > better caching and resolution speeds (similarly to
>> >> > > > > > > > > ruff, they claim orders of magnitude speedups in a
>> >> > > > > > > > > number of cases). We can likely make very good use of
>> >> > > > > > > > > it and speed up some parts of our CI workflow
>> >> > > > > > > > > significantly.
>> >> > > > > > > > >
>> >> > > > > > > > > I might likely do some experimenting with uv in our
>> >> > > > > > > > > toolchain, but wanted to make sure we are all aware of
>> >> > > > > > > > > it - and ask if someone has something against it (and
>> >> > > > > > > > > maybe someone would like to do some work there trying
>> >> > > > > > > > > it out - I will be happy to guide others with the
>> >> > > > > > > > > dev/tooling mindset and inclination to do some changes
>> >> > > > > > > > > there/review PRs and cooperate on testing those
>> >> > > > > > > > > things).
>> >> > > > > > > > >
>> >> > > > > > > > > It's not a user-facing change, and I do not think we
>> >> > > > > > > > > want to get rid of `pip` as an installation tool in
>> >> > > > > > > > > general (in our images and user-facing side) - it's
>> >> > > > > > > > > mostly an internal CI tooling improvement I am thinking
>> >> > > > > > > > > of. Maybe at some point in time we can recommend it
>> >> > > > > > > > > also for development workflows, and maybe someday it
>> >> > > > > > > > > will gain enough popularity to think about recommending
>> >> > > > > > > > > it to our users, but definitely not now nor in even the
>> >> > > > > > > > > mid-term future.
>> >> > > > > > > > >
>> >> > > > > > > > > Let me know what you think.
>> >> > > > > > > > >
>> >> > > > > > > > > Repo here: https://github.com/astral-sh/uv
>> >> > > > > > > > >
>> >> > > > > > > > > J.
>> >> > > > > > > > >
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >
>>
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Amogh Desai <am...@gmail.com>.
Thanks for the superb investigation and effort @Jarek Potiuk
<ja...@potiuk.com>!

I quite like the performance improvement numbers uv brings in compared to
pip.
I see no reason not to switch to UV in prod images as well.

I will take a look at the pull request soon.

Thanks & Regards,
Amogh Desai

On Mon, Feb 26, 2024 at 5:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> I think I will get it green finally:
> https://github.com/apache/airflow/pull/37692.
>
> I know where the test flakiness came from. Generally speaking, it turned
> out that there is no free lunch and - of course - the cache from uv
> increased our CI image size significantly (by around 1.5G) - and it caused
> much slower test execution (and the test became more flaky because of
> that). So after looking at that I decided to disable the cache - it's
> definitely not worth increasing the size of our images that much. We still
> have significant improvements (50% - 60%, not the 60% - 70% we had with
> the cache), and that is still significant enough. Without the cache the
> "upgrade" scenario is ~ 40s (so no longer 4s) instead of 7m with pip - so
> this is still a huge improvement (and the image size is even smaller than
> the one with `pip`).
>
>
> J,
>
>
>
> On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Some more findings.
> >
> > Overall, I can confirm that with `uv` we will get a significant
> > improvement - 60 - 70% - in image build times. This will impact not only
> > CI but also `breeze` local rebuilds.
> >
> > I am getting closer to a mergeable state. I switched to
> > https://github.com/apache/airflow/pull/37692 to test "upgrade to latest
> > dependencies" workflow and canary build impact.
> >
> > The PR is getting greener and greener. I have a few last things to
> > address.
> >
> > An interesting story is that a flaky test in CLI
> > (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
> > we had is suddenly significantly more flaky, so I will have to take a
> > look at how to finally remove the flakiness from it.
> > This is a good thing because this test had been flaky for quite a while
> > but it was very difficult to reproduce, and it seems that for some reason
> > it is now much easier to reproduce (which also means we will know when we
> > fix it).
> >
> > Looking at stats it seems that a lot (but not all) of the speed
> > improvement might come from parallel downloading of dependencies -
> > which is also in the works for pip (
> > https://github.com/pypa/pip/pull/12388) - though it's not clear how much
> > it will help, as the Batch Downloader in pip is involved only after
> > resolution. We will see after it is implemented if it changes things.
> >
> > I am also now switching PROD builds to use uv to see how much we can
> > save, but I leave `pip` as the default for releases and users; the only
> > difference is CI - I've added a separate step for the `pip` PROD build
> > to compare and to make sure it's running fine in CI.
> >
> > The numbers:
> >
> > * for the "upgrade to newer dependencies" scenario - uv is WAY faster -
> > as I thought. In the "current" state of main it is: ~7m pip, 5s (!) uv.
> > Here caching of uv makes a huge difference, and while there is some
> > work in `pip` and resolvelib (looking at PRs/issues), it's going to be
> > quite some time before pip gets similar results. "Upgrade" builds will
> > eventually go down from 12m to 5m - which is a major improvement -
> > especially for the elapsed time of CI builds.
> >
> > * from what I see, package installation is super-fast in uv. Installing
> > 614 packages takes (wait for it) 1s (!) where I saw it taking way over a
> > minute with `pip`. This will be hard to beat, I think, with Python vs.
> > Rust.
> >
> > Some notes about differences I saw:
> >
> > PIP and UV lead to slightly different resolutions when upgrading. This
> > is not a surprise because different heuristics are involved: the
> > resolution problem is NP-complete (https://research.swtch.com/version-sat)
> > and it's very inefficient to run the full resolution, so pip and uv each
> > take a slightly different approach to shortcuts and to limiting the
> > possible space of solutions. I've done a few PRs limiting
> > (lower-bounding) some dependencies to bring them closer, but in the end
> > what we get is "correct" in both cases - I continue running `pip check`
> > to make sure that whatever UV finds is also correct according to `pip`.
> > Nothing really major there. There were literally a few cases that
> > required some manual adjustments. Nothing unmanageable in the future
> > either; I was doing similar tweaks with `pip` as well to help with the
> > resolution.
> >
> > Example of differences (left/first is pip, right/second is uv):
> >
> > < importlib-resources==5.13.0
> > ---
> > > importlib-resources==6.1.1
> >
> > vs.
> >
> > < pycountry==23.12.11
> > ---
> > > pycountry==22.3.5
> >
> > It means that with `uv` we have a newer version of importlib_resources
> > but an older version of pycountry.
> >
> > This one I will handle by bumping pycountry in the facebook provider
> > to > 23.12, as the old version is 1.5 years old.
> >
> > J.
> >
> >
> > On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr> wrote:
> >
> >> That's impressive! I love this tool, not only for reducing CI time but
> >> also for saving the environment.
> >> Some of the previous improvements were to further parallelize CI jobs to
> >> complete the CI faster, but this tool will help reduce the overall time.
> >>
> >> Big +1
> >>
> >> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>
> >> > Hello here.
> >> >
> >> > I have a PR https://github.com/apache/airflow/pull/37683 that
> >> > implements:
> >> >
> >> > * ability to choose either uv or PIP when building our images
> >> > * CI images are built with uv by default (but you can use
> >> > `--no-use-uv` as a flag and switch back to `pip`)
> >> > * PROD images are built with pip by default (but you can use
> >> > `--use-uv` as a flag and switch to uv)
> >> >
> >> > The preliminary tests indeed show that uv not only has a much faster
> >> > baseline, but also that its use of caching fits extremely well into
> >> > our strategy of building images, and we will get huge improvements in
> >> > our CI build timing when using uv.
> >> >
> >> > Just for the context - our CI images when built are using a caching
> >> > strategy to optimise for:
> >> >
> >> > 1) fast building when there are no changes (around 1 minute to build
> >> > with pip),
> >> > 2) slower building when someone adds or modifies a non-conflicting
> >> > dependency (around 8 minutes to build, out of which ~ 6m is pip
> >> > resolution and installation),
> >> > 3) much longer build time when there are conflicting dependencies or
> >> > when we change the Dockerfile or scripts or when the Python base image
> >> > changes (around 27 minutes to build, out of which pip resolving is
> >> > ~ 20m).
> >> >
> >> > Those are all `pip` numbers. Currently `pip` does not use resolution
> >> > caching between the steps. Comparison of some basic installation
> >> > steps from initial tests shows that UV is way faster:
> >> >
> >> > * Resolving and installing airflow with [devel-ci] (610
> >> > dependencies): pip ~ 6m, uv ~ 1m 30s
> >> > * Re-resolving and reinstalling [devel-ci] using local
> >> > pyproject.toml: pip ~ 4m (cache is not used), uv ~ 4s (!!!!) - because
> >> > the cache is used in this case.
> >> >
> >> > I have not yet properly tested (but I will once they happen) --eager
> >> > upgrades of dependencies (with pip it very much depends, but it's
> >> > often in the range of 10 minutes) - I expect them not to take more
> >> > than 2-3 minutes with uv.
> >> >
> >> > So overall it looks like we are looking at these improvements:
> >> >
> >> > 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
> >> > (because we are using docker layer caching, and pip resolution and
> >> > installation are not used at all)
> >> > 2) Updating dependencies: 8m with pip will probably go down with uv
> >> > to ~ 3m 30s => 60% improvement, and in many cases to ~ 2.5m when there
> >> > are no remote changes and the cache is used (70% improvement)
> >> > 3) Re-resolving and reinstalling everything: 27m will probably go
> >> > down with uv to ~ 9m => 67% improvement.
> >> >
> >> > If those numbers hold and the resolution quality will be comparable to
> >> > `pip` - then well, it's definitely worth it - and the numbers are very
> >> > close to what the `uv` authors claimed.
> >> >
> >> > I am impressed :)
> >> >
> >> > J.
> >> >
> >> >
> >> >
> >> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai
> >> > <amoghdesai.oss@gmail.com> wrote:
> >> >
> >> > > I agree with Niko here.
> >> > >
> >> > > If someone is willing to give it a try, we should enable it
> >> > > experimentally and give it a stint for a couple of weeks. If we see
> >> > > significant results, we can adopt it.
> >> > >
> >> > > Thanks & Regards,
> >> > > Amogh Desai
> >> > >
> >> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
> >> > <onikolas@amazon.com.invalid
> >> > > >
> >> > > wrote:
> >> > >
> >> > > > The Astral folks also seem very focused on it being a
> >> drop-in/compliant
> >> > > > replacement for pip. So I think it's definitely worth dropping it
> in
> >> > and
> >> > > > seeing if we get the expected performance improvements. If tests
> >> still
> >> > > pass
> >> > > > and user facing constraints and install instructions remain
> >> unchanged I
> >> > > > don't see why not, if someone is willing to spend the time on it.
> >> Never
> >> > > > mind the extra features it would give us (I, like others, am also
> >> very
> >> > > > excited about --resolution=lowest, ability).
> >> > > >
> >> > > > ________________________________
> >> > > > From: Andrey Anshin <an...@taragol.is>
> >> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
> >> > > > To: dev@airflow.apache.org
> >> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering
> >> trying
> >> > > > out uv for our CI workflows
> >> > > >
> >> > > > > I share Andrey's skepticism. It's just yet another tool which
> has
> >> an
> >> > > > unclear
> >> > > > development strategy.
> >> > > >
> >> > > > My point was more about a matter of presentation. If someone told
> >> you
> >> > > "this
> >> > > > is a new tool, like a killer of previous tools" then you might
> think
> >> > > > "Yeah...yeah...yeah.. yet another replacement to tool X...  not
> >> really
> >> > > > interesting". On the other hand if someone told you what in cases
> >> you
> >> > > might
> >> > > > solve, then this might be a mind changer.
> >> > > >
> >> > > > Especially the promising `--resolution=lowest` option. We always
> >> want
> >> > to
> >> > > > test something with minimal dependencies because we are not sure
> >> that
> >> > it
> >> > > > might work with pretty old dependencies, and recently I've started
> >> to
> >> > > work
> >> > > > on POC to collect minimal versions of the Airflow and Providers.
> >> And at
> >> > > the
> >> > > > moment when I almost finished it the uv was released. Well
> >> sometimes it
> >> > > is
> >> > > > better to wait a bit and maybe someone would invent the same
> >> > > > solution 😁 and you don't have to spend a personal time.
> >> > > >
> >> > > > So as POC I'm on it, we still need a `pip` and validate some stuff
> >> by a
> >> > > pip
> >> > > > because it is only one officially supported way to install Airflow
> >> but
> >> > if
> >> > > > something could be improved in the CI then I'm on it, in most
> cases
> >> it
> >> > > > would be behind of Breeze and many of the contributors might be
> even
> >> > not
> >> > > > noticed that something changed.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com>
> >> wrote:
> >> > > >
> >> > > > > Actually - if you read that blog post, the strategy is clear -
> >> > > > > they aim to create comprehensive packaging tooling, and the
> >> > > > > improvements are measured (80-100 times, they claim, if using
> >> > > > > caching - they (unlike pip) use a lot of local caching, including
> >> > > > > for resolving dependencies).
> >> > > > >
> >> > > > > So I think both arguments are not valid if you ask me.
> >> > > > >
> >> > > > > wt., 20 lut 2024, 02:37 użytkownik Alexander Shorin <
> >> > kxepal@apache.org
> >> > > >
> >> > > > > napisał:
> >> > > > >
> >> > > > > > I share Andrey's skepticism. It's just yet another tool which
> >> has
> >> > an
> >> > > > > > unclear development strategy. Should you make it a free
> testing
> >> > > suite?
> >> > > > > What
> >> > > > > > project would receive in exchange? A lot of words about being
> >> > faster,
> >> > > > but
> >> > > > > > how much? Are these milliseconds worth to change the stable
> tool
> >> > > with a
> >> > > > > new
> >> > > > > > one? And will it notably improve something?
> >> > > > > >
> >> > > > > > I think it's worth to try it just for fun and provide
> feedback,
> >> but
> >> > > > it'll
> >> > > > > > have to pass a long road to become such stable as pip.
> >> > > > > >
> >> > > > > > --
> >> > > > > > ,,,^..^,,,
> >> > > > > >
> >> > > > > >
> >> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <
> jarek@potiuk.com>
> >> > > wrote:
> >> > > > > >
> >> > > > > > > My opinion:
> >> > > > > > >
> >> > > > > > > I think there is a place for a number of such tools. For a
> >> long
> >> > > time
> >> > > > > the
> >> > > > > > > packaging team and `pip` team have been working not only on
> >> `pip`
> >> > > > > > > implementation but also (and most importantly) to make sure
> >> that
> >> > > what
> >> > > > > > `pip`
> >> > > > > > > does is to be the beacon of standardisation of packaging
> APIs
> >> and
> >> > > > PEPs.
> >> > > > > > It
> >> > > > > > > will never IMHO have a lot of the fancy features that other
> >> tools
> >> > > > might
> >> > > > > > > provide (like the ones I mentioned). It will always be there
> >> to
> >> > > > provide
> >> > > > > > the
> >> > > > > > > robust and solid CLI to run all packaging things, but there
> >> are
> >> > > > plenty
> >> > > > > of
> >> > > > > > > opportunities to provide improved or modified, or more (or
> >> less)
> >> > > > > > > opinionated ways of doing things that are addressing some
> >> cases
> >> > > that
> >> > > > > > `pip`
> >> > > > > > > team simply will not be able or willing to handle,
> preferring
> >> > > "pure"
> >> > > > > > > standard approach vs. implement all the optional things. For
> >> > > example
> >> > > > > the
> >> > > > > > > way how pre-releases are handled can be improved to be more
> >> > > > selective.
> >> > > > > > The
> >> > > > > > > PEP describing it gives the tools an option to add more
> fancy
> >> > > > > behaviours
> >> > > > > > > (some of which we could find useful in our CI tooling).
> Should
> >> > > `pip`
> >> > > > > > > implement those - I don't think so. It would distract
> >> maintainers
> >> > > > from
> >> > > > > > > other more important things. It is quite ok to use other
> >> tooling
> >> > in
> >> > > > > > places
> >> > > > > > > like our CI, where they do some parts of the installation
> >> better.
> >> > > > > > >
> >> > > > > > > For me `pip` is going more into the direction of `usable
> >> > reference
> >> > > > > > > implementation of package installed` - any standard/ PEP
> will
> >> not
> >> > > > > matter
> >> > > > > > if
> >> > > > > > > `pip` does not implement it. But others might go in
> different
> >> > > > > directions
> >> > > > > > > and implement some less popular features and do it better,
> >> > faster,
> >> > > > with
> >> > > > > > > greater flexibility. IMHO it's a win-win.
> >> > > > > > >
> >> > > > > > > J.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
> >> > > > > andrey.anshin@taragol.is
> >> > > > > > >
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Yesterday my friend shared with me that tool and I've been
> >> told
> >> > > > that
> >> > > > > > more
> >> > > > > > > > presumably it would be a niche tool. I've been told "who
> >> needs
> >> > > yet
> >> > > > > > > another
> >> > > > > > > > installer which stands to resolve all your problems".
> >> > > > > > > > I guess I was wrong?
> >> > > > > > > >
> >> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <
> >> jarek@potiuk.com>
> >> > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Hey everyone,
> >> > > > > > > > >
> >> > > > > > > > > Few days ago the ruff creators have released a new tool
> >> uv -
> >> > > > which
> >> > > > > is
> >> > > > > > > an
> >> > > > > > > > > extremely fast (written in rust) and fully featured tool
> >> > > > generally
> >> > > > > > > fully
> >> > > > > > > > > compatible with `pip`.
> >> > > > > > > > >
> >> > > > > > > > > Blog post here: https://astral.sh/blog/uv
> >> > > > > > > > >
> >> > > > > > > > > It looks like It has a number of things that would make
> >> our
> >> > CI
> >> > > > > cases
> >> > > > > > > and
> >> > > > > > > > > tooling quite a bit faster and better including a few
> >> things
> >> > > > that I
> >> > > > > > > have
> >> > > > > > > > > implemented some workarounds for and some that I have
> not
> >> > > > > > > > > implemented because `pip` had no good solution.
> >> > > > > > > > >
> >> > > > > > > > > I looked at the docs and it solves some problems that
> are
> >> > > > currently
> >> > > > > > > > > difficult or impossible to handle with `pip`:
> >> > > > > > > > >
> >> > > > > > > > > * ability to use overrides (which are constraints on
> >> > steroids -
> >> > > > > > > allowing
> >> > > > > > > > to
> >> > > > > > > > > override limits specified by the packages - this will be
> >> very
> >> > > > > useful
> >> > > > > > to
> >> > > > > > > > > better handle our cases with "chicken-egg" providers
> (for
> >> > > example
> >> > > > > > like
> >> > > > > > > we
> >> > > > > > > > > had in FAB) where we have pre-release packages depending
> >> on
> >> > > each
> >> > > > > > other
> >> > > > > > > > >
> >> > > > > > > > > * different resolution strategies including
> >> > --resolution=lowest
> >> > > > > which
> >> > > > > > > > will
> >> > > > > > > > > finally allow us to see whether airflow's lower bounds
> are
> >> > > still
> >> > > > > > > holding
> >> > > > > > > > > (i.e. - will our test still pass if we use the lowest
> >> > supported
> >> > > > > > version
> >> > > > > > > > of
> >> > > > > > > > > our dependencies?  this is something i wanted to do for
> >> quite
> >> > > > some
> >> > > > > > time
> >> > > > > > > > and
> >> > > > > > > > > recorded an issue for that -
> >> > > > > > > > > https://github.com/apache/airflow/issues/35549
> >> > > > > > > > > but lack of tooling support made it a wish, with
> >> > > > > > `--resolution=lowest`
> >> > > > > > > it
> >> > > > > > > > > seems like super-easy thing to do.
> >> > > > > > > > >
> >> > > > > > > > > * It is said to be many, many times faster - with better
> >> > > caching
> >> > > > > and
> >> > > > > > > > > resolution speeds (similarly like with ruff they claim
> >> orders
> >> > > of
> >> > > > > > > > magnitude
> >> > > > > > > > > speedups in a number of cases). We can likely make very
> >> good
> >> > > use
> >> > > > of
> >> > > > > > it
> >> > > > > > > > and
> >> > > > > > > > > speed up some parts of our CI workflow significantly.
> >> > > > > > > > >
> >> > > > > > > > > I might likely do some experimenting with uv in our
> >> > toolchain,
> >> > > > but
> >> > > > > > > wanted
> >> > > > > > > > > to make sure we are all aware of it - and ask if someone
> >> has
> >> > > > > > something
> >> > > > > > > > > against it (and maybe someone would like to do some work
> >> > there
> >> > > > > trying
> >> > > > > > > it
> >> > > > > > > > > out - I will be happy to guide others with the
> dev/tooling
> >> > > > mindset
> >> > > > > > and
> >> > > > > > > > > incline to do some changes there/review PRs and
> cooperate
> >> on
> >> > > > > testing
> >> > > > > > > > those
> >> > > > > > > > > things.
> >> > > > > > > > >
> >> > > > > > > > > It's not a user-facing change, and I do not think we
> want
> >> to
> >> > > get
> >> > > > > rid
> >> > > > > > of
> >> > > > > > > > > `pip` as an installation tool in general (in our images
> >> and
> >> > > user
> >> > > > > > facing
> >> > > > > > > > > side) - it's mostly an internal CI tooling improvement I
> >> am
> >> > > > > thinking
> >> > > > > > > of.
> >> > > > > > > > > Maybe at some point in time we can recommend it also for
> >> > > > > development
> >> > > > > > > > > workflows, and maybe someday it will gain enough
> >> popularity
> >> > to
> >> > > > > think
> >> > > > > > > > about
> >> > > > > > > > > recommending it to our users, but definitely not now nor
> >> in
> >> > > even
> >> > > > > > > mid-term
> >> > > > > > > > > future.
> >> > > > > > > > >
> >> > > > > > > > > Let me know what you think.
> >> > > > > > > > >
> >> > > > > > > > > Repo here: https://github.com/astral-sh/uv
> >> > > > > > > > >
> >> > > > > > > > > J.
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
I think I will get it green finally:
https://github.com/apache/airflow/pull/37692.

I know where the test flakiness came from. Generally speaking, it turned out
that there is no free lunch and - of course - the cache from uv increased our
CI image size significantly (by around 1.5G) - and that caused much slower
test execution (and the tests became more flaky because of that). So after
looking at that I decided to disable the cache - it's definitely not worth
increasing the size of our images that much. We still get significant
improvements (50% - 60%, not the 60% - 70% we had with the cache), and that
is still significant enough. Without the cache, the "upgrade" scenario takes
~ 40s (so no 4s any more) instead of 7m with pip - so this is still a huge
improvement (and the image is even smaller than the one built with `pip`).
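
For anyone who wants to redo the arithmetic behind these figures, here is a
minimal sketch that computes the relative improvement for the "upgrade" step
alone. The helper name `improvement` is purely illustrative, and the inputs
are just the rough timings quoted in this thread (~7m with pip, ~40s with uv
without cache, ~4s with uv and cache), not new measurements. Keep in mind this
covers the single step only; the 50% - 60% figure is for whole image builds,
which also include work that uv does not speed up.

```python
def improvement(before_s: float, after_s: float) -> float:
    """Return the relative time saved, as a percentage of the original time."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s * 100.0


# Rough timings quoted in this thread for the "upgrade" step, in seconds.
PIP_UPGRADE_S = 7 * 60  # ~7m with pip
UV_NO_CACHE_S = 40      # ~40s with uv, cache disabled
UV_CACHED_S = 4         # ~4s with uv, cache enabled

if __name__ == "__main__":
    for label, after_s in (("uv, no cache", UV_NO_CACHE_S), ("uv, cached", UV_CACHED_S)):
        saved = improvement(PIP_UPGRADE_S, after_s)
        print(f"{label}: {saved:.1f}% less time than pip")
```

So even with the cache disabled, the step itself still takes roughly a tenth
of the time it takes with pip.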


J.



On Sun, Feb 25, 2024 at 9:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Some more findings.
>
> Overall, I can confirm that with `uv` we will get a significant improvement
> (60 - 70%) in image build times. This will impact both CI and local `breeze`
> rebuilds.
>
> I am getting closer to a mergeable state. I switched to
> https://github.com/apache/airflow/pull/37692 to test "upgrade to latest
> dependencies" workflow and canary build impact.
>
> The PR is getting greener and greener. I have a few last things to
> address.
>
> An interesting story is that a flaky test in CLI
> (tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
> we had is suddenly significantly more flaky, so I will have to take a look
> at how to finally remove the flakiness from it.
> This is a good thing because this test had been flaky for quite a while
> but the flakiness was very difficult to reproduce, and it seems that for some
> reason it is now much easier to reproduce (which also means we will know when
> we fix it).
>
> Looking at the stats, it seems that a lot (but not all) of the speed
> improvement might come from parallel downloading of dependencies -
> which is also in the works for pip (
> https://github.com/pypa/pip/pull/12388) - though it's not clear how much
> it will help, as the batch downloader in pip is involved only after
> resolution. We will see after it is implemented if it changes things.
>
> I am also now switching PROD builds to use uv to see how much we can save,
> but I leave `pip` as default for releases and users, the only difference is
> CI - I've added separate step for `pip` PROD build to compare and to make
> sure it's running fine in CI.
>
> The numbers:
>
> * for the "upgrade to newer dependencies" scenario - uv is WAY faster - as I
> thought. In the "current" state of main it is: ~7m pip, 5s (!) uv.
> Here caching in uv makes a huge difference, and while there is some work in
> `pip` and resolvelib (looking at PRs/issues), it's going to be quite some
> time before pip gets similar results. The "upgrade" builds will eventually
> go down from 12m to 5m - which is a major improvement - especially for the
> elapsed time of CI builds.
>
> * from what I see package installation is super-fast in uv. Installing 614
> packages takes (wait for it) 1s (!) where I saw it taking way over a minute
> with `pip`. This will be hard to beat I think with Python vs. rust.
>
> Some notes about differences I saw:
>
> pip and uv lead to slightly different resolutions when upgrading. This is
> not a surprise because different heuristics are involved: the resolution
> problem is NP-complete (https://research.swtch.com/version-sat) and
> it's very inefficient to run the full resolution, so pip and uv take
> slightly different shortcuts to limit the possible space of
> solutions. I've done a few PRs limiting (lower-bounding) some dependencies to
> bring the two closer - but in the end what we get is "correct" in both cases
> - I continue running `pip check` to make sure that whatever uv finds is
> also correct according to `pip`. Nothing really major there. There were
> literally a few cases that required some manual adjustments. Nothing
> unmanageable in the future either; I was doing similar tweaks with `pip` as
> well to help with the resolution.
>
> Example of differences (left/first is pip, right/second is uv):
>
> < importlib-resources==5.13.0
> ---
> > importlib-resources==6.1.1
>
> vs.
>
> < pycountry==23.12.11
> ---
> > pycountry==22.3.5
>
> It means that with `uv` we have a newer version of importlib_resources but
> an older version of pycountry.
>
> I will handle this one by bumping pycountry for the facebook provider
> to > 23.12, as the old version is 1.5 years old.
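
A small sketch of how such differences between two resolutions could be
listed automatically, in case it helps with spotting them. Everything here is
illustrative: the function names are made up, the inputs are just the two pin
differences quoted above, and the version comparison is a naive numeric one
that only works for plain X.Y.Z pins (it is not a full PEP 440 comparison).

```python
def parse_pins(text: str) -> dict[str, str]:
    """Parse 'name==version' lines into a {name: version} mapping."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def version_key(version: str) -> tuple[int, ...]:
    """Naive numeric sort key; fine for plain X.Y.Z pins only."""
    return tuple(int(part) for part in version.split("."))


def diff_pins(pip_pins: dict[str, str], uv_pins: dict[str, str]) -> list[tuple[str, str, str, str]]:
    """Return (name, pip_version, uv_version, newer_in) for pins that differ."""
    rows = []
    for name in sorted(pip_pins.keys() & uv_pins.keys()):
        pip_v, uv_v = pip_pins[name], uv_pins[name]
        if pip_v != uv_v:
            newer_in = "uv" if version_key(uv_v) > version_key(pip_v) else "pip"
            rows.append((name, pip_v, uv_v, newer_in))
    return rows


# The two differences quoted above, written as pinned requirement lines.
PIP_RESOLUTION = """
importlib-resources==5.13.0
pycountry==23.12.11
"""
UV_RESOLUTION = """
importlib-resources==6.1.1
pycountry==22.3.5
"""

if __name__ == "__main__":
    for name, pip_v, uv_v, newer_in in diff_pins(parse_pins(PIP_RESOLUTION), parse_pins(UV_RESOLUTION)):
        print(f"{name}: pip={pip_v} uv={uv_v} (newer in {newer_in})")
```

Packages where the older version comes from one tool only are the candidates
for a lower-bound bump like the pycountry one.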
>
> J.
>
>
> On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr> wrote:
>
>> That's impressive! I love this tool, not only for reducing CI time but
>> also for saving the environment.
>> Some of the previous improvements were to further parallelize CI jobs to
>> complete the CI faster, but this tool will help reduce the overall time.
>>
>> Big +1
>>
>> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > Hello here.
>> >
>> > I have a PR https://github.com/apache/airflow/pull/37683 that
>> > implements:
>> >
>> > * ability to choose either uv or pip when building our images
>> > * CI images are built with uv by default (but you can use the
>> > `--no-use-uv` flag to switch back to `pip`)
>> > * PROD images are built with pip by default (but you can use the
>> > `--use-uv` flag to switch to uv)
>> >
>> > The preliminary tests show indeed that uv not only has a much faster
>> > baseline, but  also their use of caching fits extremely well into our
>> > strategy of building images and we will get huge improvements of our CI
>> > build timing when using uv.
>> >
>> > Just for context - our CI images, when built, use a caching
>> > strategy to optimise for:
>> >
>> > 1) fast building when there are no changes (around 1 minute to build
>> > with pip),
>> > 2) slower building when someone adds or modifies a non-conflicting
>> > dependency (around 8 minutes to build, out of which ~ 6m is pip
>> > resolution and installation),
>> > 3) much longer build time when there are conflicting dependencies, or
>> > when we change the Dockerfile or scripts, or when the Python base image
>> > changes (around 27 minutes to build, out of which pip resolving is ~ 20m).
>> >
>> > Those are all `pip` numbers. Currently `pip` does not use resolution
>> > caching between the steps. Comparison of some basic installation steps
>> > from initial tests shows that uv is way faster:
>> >
>> > * Resolving and installing airflow with [devel-ci] (610 dependencies):
>> > pip ~ 6m, uv ~ 1m 30s
>> > * Re-resolving and reinstalling [devel-ci] using local pyproject.toml:
>> > pip ~ 4m (cache is not used), uv ~ 4s (!!!!) - because the cache is used
>> > in this case.
>> >
>> > I have not yet properly tested (but I will once they happen) `--eager`
>> > upgrades of dependencies (with pip it very much depends, but it's often
>> > in the range of 10 minutes) - I expect them not to take more than 2-3
>> > minutes with uv.
>> >
>> > So overall it looks like we are looking at those improvements:
>> >
>> > 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
>> > (because we are using docker layer caching and pip resolution and
>> > installation is not used at all)
>> > 2) Updating dependencies: 8m with pip will probably go down with uv
>> > to ~ 3m 30s => 60% improvement, and in many cases to ~ 2.5m when there
>> > are no remote changes and the cache is used (70% improvement)
>> > 3) Re-resolving and reinstalling everything: 27m will probably go down
>> > with uv to ~ 9m => 67% improvement.
>> >
>> > If those numbers hold and the resolution quality will be comparable to
>> > `pip` - then well, it's definitely worth it - and the numbers are very
>> > close to what the `uv` authors claimed.
>> >
>> > I am impressed :)
>> >
>> > J.
>> >
>> >
>> >
>> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <am...@gmail.com>
>> > wrote:
>> >
>> > > I agree with Niko here.
>> > >
>> > > If someone is willing to give it a try, we should enable it
>> > experimentally
>> > > and give it a stint for a couple of weeks. If we see significant
>> results,
>> > > we can adopt it.
>> > >
>> > > Thanks & Regards,
>> > > Amogh Desai
>> > >
>> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
>> > <onikolas@amazon.com.invalid
>> > > >
>> > > wrote:
>> > >
>> > > > The Astral folks also seem very focused on it being a
>> drop-in/compliant
>> > > > replacement for pip. So I think it's definitely worth dropping it in
>> > and
>> > > > seeing if we get the expected performance improvements. If tests
>> still
>> > > pass
>> > > > and user facing constraints and install instructions remain
>> unchanged I
>> > > > don't see why not, if someone is willing to spend the time on it.
>> Never
>> > > > mind the extra features it would give us (I, like others, am also
>> very
>> > > > excited about --resolution=lowest, ability).
>> > > >
>> > > > ________________________________
>> > > > From: Andrey Anshin <an...@taragol.is>
>> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> > > > To: dev@airflow.apache.org
>> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering
>> trying
>> > > > out uv for our CI workflows
>> > > >
>> > > > > I share Andrey's skepticism. It's just yet another tool which has
>> an
>> > > > unclear
>> > > > development strategy.
>> > > >
>> > > > My point was more about a matter of presentation. If someone told
>> you
>> > > "this
>> > > > is a new tool, like a killer of previous tools" then you might think
>> > > > "Yeah...yeah...yeah.. yet another replacement to tool X...  not
>> really
>> > > > interesting". On the other hand if someone told you what in cases
>> you
>> > > might
>> > > > solve, then this might be a mind changer.
>> > > >
>> > > > Especially the promising `--resolution=lowest` option. We always
>> want
>> > to
>> > > > test something with minimal dependencies because we are not sure
>> that
>> > it
>> > > > might work with pretty old dependencies, and recently I've started
>> to
>> > > work
>> > > > on POC to collect minimal versions of the Airflow and Providers.
>> And at
>> > > the
>> > > > moment when I almost finished it the uv was released. Well
>> sometimes it
>> > > is
>> > > > better to wait a bit and maybe someone would invent the same
>> > > > solution 😁 and you don't have to spend a personal time.
>> > > >
>> > > > So as POC I'm on it, we still need a `pip` and validate some stuff
>> by a
>> > > pip
>> > > > because it is only one officially supported way to install Airflow
>> but
>> > if
>> > > > something could be improved in the CI then I'm on it, in most cases
>> it
>> > > > would be behind of Breeze and many of the contributors might be even
>> > not
>> > > > noticed that something changed.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> > > >
>> > > > > Actually - if you read that blog post, the strategy is clear -
>> > > > > they aim to create comprehensive packaging tooling, and the
>> > > > > improvements are measured (80-100 times, they claim, if using
>> > > > > caching - they (unlike pip) use a lot of local caching, including
>> > > > > for resolving dependencies).
>> > > > >
>> > > > > So I think both arguments are not valid if you ask me.
>> > > > >
>> > > > > wt., 20 lut 2024, 02:37 użytkownik Alexander Shorin <
>> > kxepal@apache.org
>> > > >
>> > > > > napisał:
>> > > > >
>> > > > > > I share Andrey's skepticism. It's just yet another tool which
>> has
>> > an
>> > > > > > unclear development strategy. Should you make it a free testing
>> > > suite?
>> > > > > What
>> > > > > > project would receive in exchange? A lot of words about being
>> > faster,
>> > > > but
>> > > > > > how much? Are these milliseconds worth to change the stable tool
>> > > with a
>> > > > > new
>> > > > > > one? And will it notably improve something?
>> > > > > >
>> > > > > > I think it's worth to try it just for fun and provide feedback,
>> but
>> > > > it'll
>> > > > > > have to pass a long road to become such stable as pip.
>> > > > > >
>> > > > > > --
>> > > > > > ,,,^..^,,,
>> > > > > >
>> > > > > >
>> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <ja...@potiuk.com>
>> > > wrote:
>> > > > > >
>> > > > > > > My opinion:
>> > > > > > >
>> > > > > > > I think there is a place for a number of such tools. For a
>> long
>> > > time
>> > > > > the
>> > > > > > > packaging team and `pip` team have been working not only on
>> `pip`
>> > > > > > > implementation but also (and most importantly) to make sure
>> that
>> > > what
>> > > > > > `pip`
>> > > > > > > does is to be the beacon of standardisation of packaging APIs
>> and
>> > > > PEPs.
>> > > > > > It
>> > > > > > > will never IMHO have a lot of the fancy features that other
>> tools
>> > > > might
>> > > > > > > provide (like the ones I mentioned). It will always be there
>> to
>> > > > provide
>> > > > > > the
>> > > > > > > robust and solid CLI to run all packaging things, but there
>> are
>> > > > plenty
>> > > > > of
>> > > > > > > opportunities to provide improved or modified, or more (or
>> less)
>> > > > > > > opinionated ways of doing things that are addressing some
>> cases
>> > > that
>> > > > > > `pip`
>> > > > > > > team simply will not be able or willing to handle, preferring
>> > > "pure"
>> > > > > > > standard approach vs. implement all the optional things. For
>> > > example
>> > > > > the
>> > > > > > > way how pre-releases are handled can be improved to be more
>> > > > selective.
>> > > > > > The
>> > > > > > > PEP describing it gives the tools an option to add more fancy
>> > > > > behaviours
>> > > > > > > (some of which we could find useful in our CI tooling). Should
>> > > `pip`
>> > > > > > > implement those - I don't think so. It would distract
>> maintainers
>> > > > from
>> > > > > > > other more important things. It is quite ok to use other
>> tooling
>> > in
>> > > > > > places
>> > > > > > > like our CI, where they do some parts of the installation
>> better.
>> > > > > > >
>> > > > > > > For me `pip` is going more into the direction of `usable
>> > reference
>> > > > > > > implementation of package installed` - any standard/ PEP will
>> not
>> > > > > matter
>> > > > > > if
>> > > > > > > `pip` does not implement it. But others might go in different
>> > > > > directions
>> > > > > > > and implement some less popular features and do it better,
>> > faster,
>> > > > with
>> > > > > > > greater flexibility. IMHO it's a win-win.
>> > > > > > >
>> > > > > > > J.
>> > > > > > >
>> > > > > > >
>> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
>> > > > > andrey.anshin@taragol.is
>> > > > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Yesterday my friend shared with me that tool and I've been
>> told
>> > > > that
>> > > > > > more
>> > > > > > > > presumably it would be a niche tool. I've been told "who
>> needs
>> > > yet
>> > > > > > > another
>> > > > > > > > installer which stands to resolve all your problems".
>> > > > > > > > I guess I was wrong?
>> > > > > > > >
>> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <
>> jarek@potiuk.com>
>> > > > wrote:
>> > > > > > > >
>> > > > > > > > > Hey everyone,
>> > > > > > > > >
>> > > > > > > > > Few days ago the ruff creators have released a new tool
>> uv -
>> > > > which
>> > > > > is
>> > > > > > > an
>> > > > > > > > > extremely fast (written in rust) and fully featured tool
>> > > > generally
>> > > > > > > fully
>> > > > > > > > > compatible with `pip`.
>> > > > > > > > >
>> > > > > > > > > Blog post here: https://astral.sh/blog/uv
>> > > > > > > > >
>> > > > > > > > > It looks like It has a number of things that would make
>> our
>> > CI
>> > > > > cases
>> > > > > > > and
>> > > > > > > > > tooling quite a bit faster and better including a few
>> things
>> > > > that I
>> > > > > > > have
>> > > > > > > > > implemented some workarounds for and some that I have not
>> > > > > > > > > implemented because `pip` had no good solution.
>> > > > > > > > >
>> > > > > > > > > I looked at the docs and it solves some problems that are
>> > > > currently
>> > > > > > > > > difficult or impossible to handle with `pip`:
>> > > > > > > > >
>> > > > > > > > > * ability to use overrides (which are constraints on
>> > steroids -
>> > > > > > > allowing
>> > > > > > > > to
>> > > > > > > > > override limits specified by the packages - this will be
>> very
>> > > > > useful
>> > > > > > to
>> > > > > > > > > better handle our cases with "chicken-egg" providers (for
>> > > example
>> > > > > > like
>> > > > > > > we
>> > > > > > > > > had in FAB) where we have pre-release packages depending
>> on
>> > > each
>> > > > > > other
>> > > > > > > > >
>> > > > > > > > > * different resolution strategies including
>> > --resolution=lowest
>> > > > > which
>> > > > > > > > will
>> > > > > > > > > finally allow us to see whether airflow's lower bounds are
>> > > still
>> > > > > > > holding
>> > > > > > > > > (i.e. - will our test still pass if we use the lowest
>> > supported
>> > > > > > version
>> > > > > > > > of
>> > > > > > > > > our dependencies?  this is something i wanted to do for
>> quite
>> > > > some
>> > > > > > time
>> > > > > > > > and
>> > > > > > > > > recorded an issue for that -
>> > > > > > > > > https://github.com/apache/airflow/issues/35549
>> > > > > > > > > but lack of tooling support made it a wish, with
>> > > > > > `--resolution=lowest`
>> > > > > > > it
>> > > > > > > > > seems like super-easy thing to do.
>> > > > > > > > >
>> > > > > > > > > * It is said to be many, many times faster - with better
>> > > caching
>> > > > > and
>> > > > > > > > > resolution speeds (similarly like with ruff they claim
>> orders
>> > > of
>> > > > > > > > magnitude
>> > > > > > > > > speedups in a number of cases). We can likely make very
>> good
>> > > use
>> > > > of
>> > > > > > it
>> > > > > > > > and
>> > > > > > > > > speed up some parts of our CI workflow significantly.
>> > > > > > > > >
>> > > > > > > > > I might likely do some experimenting with uv in our
>> > toolchain,
>> > > > but
>> > > > > > > wanted
>> > > > > > > > > to make sure we are all aware of it - and ask if someone
>> has
>> > > > > > something
>> > > > > > > > > against it (and maybe someone would like to do some work
>> > there
>> > > > > trying
>> > > > > > > it
>> > > > > > > > > out - I will be happy to guide others with the dev/tooling
>> > > > mindset
>> > > > > > and
>> > > > > > > > > incline to do some changes there/review PRs and cooperate
>> on
>> > > > > testing
>> > > > > > > > those
>> > > > > > > > > things.
>> > > > > > > > >
>> > > > > > > > > It's not a user-facing change, and I do not think we want
>> to
>> > > get
>> > > > > rid
>> > > > > > of
>> > > > > > > > > `pip` as an installation tool in general (in our images
>> and
>> > > user
>> > > > > > facing
>> > > > > > > > > side) - it's mostly an internal CI tooling improvement I
>> am
>> > > > > thinking
>> > > > > > > of.
>> > > > > > > > > Maybe at some point in time we can recommend it also for
>> > > > > development
>> > > > > > > > > workflows, and maybe someday it will gain enough
>> popularity
>> > to
>> > > > > think
>> > > > > > > > about
>> > > > > > > > > recommending it to our users, but definitely not now nor
>> in
>> > > even
>> > > > > > > mid-term
>> > > > > > > > > future.
>> > > > > > > > >
>> > > > > > > > > Let me know what you think.
>> > > > > > > > >
>> > > > > > > > > Repo here: https://github.com/astral-sh/uv
>> > > > > > > > >
>> > > > > > > > > J.
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Some more findings.

Overall, I can confirm that with `uv` we will get a significant saving -
60-70% - on image build times. This will impact both CI and local `breeze`
rebuilds.

I am getting closer to a mergeable state. I switched to
https://github.com/apache/airflow/pull/37692 to test "upgrade to latest
dependencies" workflow and canary build impact.

The PR is getting greener and greener. I have a few last things to address.

An interesting story is that a flaky test in CLI
(tests/cli/commands/test_webserver_command.py::TestCliWebServer::test_cli_webserver_background)
has suddenly become significantly more flaky, so I will have to take a look
at how to finally remove the flakiness from it.
This is a good thing, because this test had been flaky for quite a while but
it was very difficult to reproduce, and it seems that for some reason it is
now much easier to reproduce (which also means we will know when we have
fixed it).

Looking at the stats, it seems that a lot (but not all) of the speed
improvement might come from parallel downloading of dependencies -
which is also in the works for pip (https://github.com/pypa/pip/pull/12388)
- though it's not clear how much it will help, as the batch downloader in pip
is involved only after resolution. We will see after it is implemented if
it changes things.
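
To illustrate why parallel downloading matters, here is a toy sketch that
simulates fetching packages sequentially and in parallel. Nothing here is pip
or uv code: `fake_download`, the package names and the 0.05s delay are all
made up for illustration, and real gains depend on network and index
behaviour.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(package: str) -> str:
    """Pretend to download a package; the sleep stands in for network latency."""
    time.sleep(0.05)
    return f"{package}.whl"


def download_sequential(packages: list[str]) -> list[str]:
    """Fetch packages one after another."""
    return [fake_download(p) for p in packages]


def download_parallel(packages: list[str], workers: int = 8) -> list[str]:
    """Fetch packages concurrently using a thread pool."""
    # executor.map preserves input order, so results match the sequential run.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fake_download, packages))


packages = [f"package-{i}" for i in range(8)]

start = time.perf_counter()
sequential = download_sequential(packages)
sequential_time = time.perf_counter() - start

start = time.perf_counter()
parallel = download_parallel(packages)
parallel_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

With I/O-bound work like this, the parallel run should take roughly the time
of one download rather than the sum of all of them.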

I am also now switching PROD builds to use uv to see how much we can save,
but I am leaving `pip` as the default for releases and users. The only
difference is in CI - I've added a separate step for the `pip` PROD build to
compare and to make sure it's running fine in CI.

The numbers:

* for the "upgrade to newer dependencies" scenario - uv is WAY faster - as I
thought. In the "current" state of main it is: ~7m pip, 5s (!) uv.
Here the caching of uv makes a huge difference, and while there is some work
in `pip` and resolvelib (looking at PRs/issues), it's going to be quite some
time before we get similar results from pip. "Upgrade" builds will eventually
go down from 12m to 5m - which is a major improvement - especially for
elapsed time of CI builds.

* from what I see, package installation is super-fast in uv. Installing 614
packages takes (wait for it) 1s (!), where I saw it taking way over a minute
with `pip`. I think this will be hard to beat - Python vs. Rust.
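
To put the numbers above in perspective, here is a minimal sketch that
converts the quoted timings into relative savings. The helper name
`reduction_pct` is made up for illustration; the timings are the rough ones
quoted above (with 60s used as a lower bound for "way over a minute"), not
new measurements.

```python
def reduction_pct(before_s: float, after_s: float) -> float:
    """Return the percentage of time saved when going from before_s to after_s."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s * 100.0


# Rough timings quoted in this message, converted to seconds.
timings = {
    "upgrade resolution (pip ~7m, uv ~5s)": (7 * 60, 5),
    "upgrade build (12m -> 5m)": (12 * 60, 5 * 60),
    # "way over a minute" for pip: 60s is only a lower bound.
    "installing 614 packages (pip >1m, uv ~1s)": (60, 1),
}

for name, (before, after) in timings.items():
    print(f"{name}: {reduction_pct(before, after):.1f}% less time")
```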

Some notes about differences I saw:

pip and uv lead to slightly different resolutions when upgrading. This is
not a surprise because different heuristics are involved: the resolution
problem is NP-complete (https://research.swtch.com/version-sat) and it's
very inefficient to run the full resolution, so pip and uv each take a
slightly different approach to shortcuts and to limiting the possible space
of solutions. I've done a few PRs limiting (lower-bounding) some dependencies
to bring them closer, but in the end what we get is "correct" in both cases
- I continue running `pip check` to make sure that whatever uv finds is
also correct according to `pip`. Nothing really major there. There were
literally a few cases that required some manual adjustments. Nothing
unmanageable in the future either - I was doing similar tweaks with `pip` as
well to help with the resolution.
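
The cross-check described above (install with uv, then let `pip check`
verify the result) could be scripted roughly like this. It is only a sketch:
the function names are hypothetical, it assumes both `uv` and `pip` are
available in the active environment, and it is not how breeze actually does
it.

```python
import subprocess
import sys


def uv_install_command(requirements_file: str) -> list[str]:
    """Build the command that resolves and installs dependencies with uv."""
    return ["uv", "pip", "install", "-r", requirements_file]


def pip_check_command() -> list[str]:
    """Build the command that asks pip to verify the installed set is consistent."""
    return [sys.executable, "-m", "pip", "check"]


def install_and_verify(requirements_file: str) -> bool:
    """Install with uv, then return True only if `pip check` reports no conflicts."""
    # check=True makes a failed uv install raise CalledProcessError.
    subprocess.run(uv_install_command(requirements_file), check=True)
    return subprocess.run(pip_check_command()).returncode == 0
```

Keeping the command construction separate from execution makes it easy to
log or test the exact commands without touching the environment.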

Example of differences (left/first is pip, right/second is uv):

< importlib-resources==5.13.0
---
> importlib-resources==6.1.1

vs.

< pycountry==23.12.11
---
> pycountry==22.3.5

It means that with `uv` we have a newer version of importlib_resources but
an older version of pycountry.
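
Differences like the ones above can be spotted automatically by comparing two
sets of pins. A minimal sketch, assuming simple `name==X.Y.Z` pins with
purely numeric versions (real version strings need a proper PEP 440 parser,
for example the `packaging` library); the function names are made up for
illustration:

```python
def parse_pins(text: str) -> dict[str, tuple[int, ...]]:
    """Parse lines like 'pycountry==23.12.11' into {name: (23, 12, 11)}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.lower()] = tuple(int(part) for part in version.split("."))
    return pins


def compare_pins(pip_pins: str, uv_pins: str) -> dict[str, str]:
    """For every package pinned differently, say which tool picked the newer version."""
    left, right = parse_pins(pip_pins), parse_pins(uv_pins)
    result = {}
    for name in sorted(left.keys() & right.keys()):
        if left[name] != right[name]:
            result[name] = "uv newer" if right[name] > left[name] else "pip newer"
    return result


pip_result = "importlib-resources==5.13.0\npycountry==23.12.11"
uv_result = "importlib-resources==6.1.1\npycountry==22.3.5"
print(compare_pins(pip_result, uv_result))
# {'importlib-resources': 'uv newer', 'pycountry': 'pip newer'}
```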

This one I will handle by bumping pycountry in the facebook provider
to > 23.12, as the old version is 1.5 years old.

J.


On Sun, Feb 25, 2024 at 12:52 AM Hussein Awala <hu...@awala.fr> wrote:

> That's impressive! I love this tool, not only for reducing CI time but also
> for saving the environment.
> Some of the previous improvements were to further parallelize CI jobs to
> complete the CI faster, but this tool will help reduce the overall time.
>
> Big +1
>
> On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Hello here.
> >
> > I have a PR https://github.com/apache/airflow/pull/37683 that
> implements:
> >
> > * ability to choose either uv or pip when building our images
> > * CI images are built with uv by default (but you can use the
> > `--no-use-uv` flag to switch back to `pip`)
> > * PROD images are built with pip by default (but you can use the
> > `--use-uv` flag to switch to uv)
> >
> > The preliminary tests show indeed that uv not only has a much faster
> > baseline, but  also their use of caching fits extremely well into our
> > strategy of building images and we will get huge improvements of our CI
> > build timing when using uv.
> >
> > Just for the context - our CI images when built are using a caching
> > strategy to optimise for:
> >
> > 1) fast building when there are no changes (around 1 minute to build with
> > pip),
> > 2) slower building when someone adds or modifies a non-conflicting
> > dependency (around 8 minutes to build, out of which ~ 6m is pip
> > resolution and installation)
> > 3) much longer build time when there are conflicting dependencies or when
> > we change Dockerfile or scripts or when Python base image changes (around
> > 27 minutes build out of which pip resolving is ~ 20m).
> >
> > Those are all `pip` numbers. Currently `pip` does not use resolution
> > caching between the steps. Comparison of some basic installation steps
> from
> > initial tests show that UV is way faster:
> >
> > * Resolving and Installing airflow with [devel-ci] (610 dependencies):
> pip
> > ~ 6m, uv ~ 1m 30 s
> > * Re-resolving and reinstalling [devel-ci] using local pyproject.toml;
> pip
> > ~ 4m (cache is not used), uv ~ 4s (!!!!) - because cache is used in this
> > case.
> >
> > I have not yet tested well (but I will once they happen) --eager upgrade
> of
> > dependencies (pip - very much depends but it's often in the range of 10
> > minutes) - I expect it not to take more than 2-3 minutes with uv
> >
> > So overall it looks like we are looking at those improvements:
> >
> > 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
> > (because we are using docker layer caching and pip resolution and
> > installation is not used at all)
> > 2) Updating dependencies: 8m with pip will probably go down with uv to
> > ~ 3m 30s => 60% improvement, and in many cases ~ 2.5m when there are no
> > remote changes and cache is used (70% improvement)
> > 3) Re-resolving and reinstalling everything: 27m will probably go down
> > with uv to ~ 9m => 67% improvement.
> >
> > If those numbers hold and the resolution quality will be comparable to
> > `pip` - then well, it's definitely worth it - and the numbers are very
> > close to what the `uv` authors claimed.
> >
> > I am impressed :)
> >
> > J.
> >
> >
> >
> > On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <am...@gmail.com>
> > wrote:
> >
> > > I agree with Niko here.
> > >
> > > If someone is willing to give it a try, we should enable it
> > experimentally
> > > and give it a stint for a couple of weeks. If we see significant
> results,
> > > we can adopt it.
> > >
> > > Thanks & Regards,
> > > Amogh Desai
> > >
> > > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
> > <onikolas@amazon.com.invalid
> > > >
> > > wrote:
> > >
> > > > The Astral folks also seem very focused on it being a
> drop-in/compliant
> > > > replacement for pip. So I think it's definitely worth dropping it in
> > and
> > > > seeing if we get the expected performance improvements. If tests
> still
> > > pass
> > > > and user facing constraints and install instructions remain
> unchanged I
> > > > don't see why not, if someone is willing to spend the time on it.
> Never
> > > > mind the extra features it would give us (I, like others, am also
> very
> > > > excited about --resolution=lowest, ability).
> > > >
> > > > ________________________________
> > > > From: Andrey Anshin <an...@taragol.is>
> > > > Sent: Tuesday, February 20, 2024 12:26:56 AM
> > > > To: dev@airflow.apache.org
> > > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering
> trying
> > > > out uv for our CI workflows
> > > >
> > > > CAUTION: This email originated from outside of the organization. Do
> not
> > > > click links or open attachments unless you can confirm the sender and
> > > know
> > > > the content is safe.
> > > >
> > > >
> > > >
> > > > AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur
> > externe.
> > > > Ne cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous ne
> > > pouvez
> > > > pas confirmer l’identité de l’expéditeur et si vous n’êtes pas
> certain
> > > que
> > > > le contenu ne présente aucun risque.
> > > >
> > > >
> > > >
> > > > > I share Andrey's skepticism. It's just yet another tool which has
> an
> > > > unclear
> > > > development strategy.
> > > >
> > > > My point was more about a matter of presentation. If someone told you
> > > "this
> > > > is a new tool, like a killer of previous tools" then you might think
> > > > "Yeah...yeah...yeah.. yet another replacement to tool X...  not
> really
> > > > interesting". On the other hand if someone told you what in cases you
> > > might
> > > > solve, then this might be a mind changer.
> > > >
> > > > Especially the promising `--resolution=lowest` option. We always want
> > to
> > > > test something with minimal dependencies because we are not sure that
> > it
> > > > might work with pretty old dependencies, and recently I've started to
> > > work
> > > > on POC to collect minimal versions of the Airflow and Providers. And
> at
> > > the
> > > > moment when I almost finished it the uv was released. Well sometimes
> it
> > > is
> > > > better to wait a bit and maybe someone would invent the same
> > > > solution 😁 and you don't have to spend a personal time.
> > > >
> > > > So as POC I'm on it, we still need a `pip` and validate some stuff
> by a
> > > pip
> > > > because it is only one officially supported way to install Airflow
> but
> > if
> > > > something could be improved in the CI then I'm on it, in most cases
> it
> > > > would be behind of Breeze and many of the contributors might be even
> > not
> > > > noticed that something changed.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >
> > > > > > Actually - if you read that blog post, the strategy is clear -
> > > > > > they aim to create comprehensive packaging tooling, and the
> > > > > > improvements are measured (80-100 times, they claim, when using
> > > > > > caching - they (unlike pip) use a lot of local caching, including
> > > > > > for resolving dependencies).
> > > > >
> > > > > So I think both arguments are not valid if you ask me.
> > > > >
> > > > > wt., 20 lut 2024, 02:37 użytkownik Alexander Shorin <
> > kxepal@apache.org
> > > >
> > > > > napisał:
> > > > >
> > > > > > I share Andrey's skepticism. It's just yet another tool which has
> > an
> > > > > > unclear development strategy. Should you make it a free testing
> > > suite?
> > > > > What
> > > > > > project would receive in exchange? A lot of words about being
> > faster,
> > > > but
> > > > > > how much? Are these milliseconds worth to change the stable tool
> > > with a
> > > > > new
> > > > > > one? And will it notably improve something?
> > > > > >
> > > > > > I think it's worth to try it just for fun and provide feedback,
> but
> > > > it'll
> > > > > > have to pass a long road to become such stable as pip.
> > > > > >
> > > > > > --
> > > > > > ,,,^..^,,,
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <ja...@potiuk.com>
> > > wrote:
> > > > > >
> > > > > > > My opinion:
> > > > > > >
> > > > > > > I think there is a place for a number of such tools. For a long
> > > time
> > > > > the
> > > > > > > packaging team and `pip` team have been working not only on
> `pip`
> > > > > > > implementation but also (and most importantly) to make sure
> that
> > > what
> > > > > > `pip`
> > > > > > > does is to be the beacon of standardisation of packaging APIs
> and
> > > > PEPs.
> > > > > > It
> > > > > > > will never IMHO have a lot of the fancy features that other
> tools
> > > > might
> > > > > > > provide (like the ones I mentioned). It will always be there to
> > > > provide
> > > > > > the
> > > > > > > robust and solid CLI to run all packaging things, but there are
> > > > plenty
> > > > > of
> > > > > > > opportunities to provide improved or modified, or more (or
> less)
> > > > > > > opinionated ways of doing things that are addressing some cases
> > > that
> > > > > > `pip`
> > > > > > > team simply will not be able or willing to handle, preferring
> > > "pure"
> > > > > > > standard approach vs. implement all the optional things. For
> > > example
> > > > > the
> > > > > > > way how pre-releases are handled can be improved to be more
> > > > selective.
> > > > > > The
> > > > > > > PEP describing it gives the tools an option to add more fancy
> > > > > behaviours
> > > > > > > (some of which we could find useful in our CI tooling). Should
> > > `pip`
> > > > > > > implement those - I don't think so. It would distract
> maintainers
> > > > from
> > > > > > > other more important things. It is quite ok to use other
> tooling
> > in
> > > > > > places
> > > > > > > like our CI, where they do some parts of the installation
> better.
> > > > > > >
> > > > > > > For me `pip` is going more into the direction of `usable
> > reference
> > > > > > > implementation of package installed` - any standard/ PEP will
> not
> > > > > matter
> > > > > > if
> > > > > > > `pip` does not implement it. But others might go in different
> > > > > directions
> > > > > > > and implement some less popular features and do it better,
> > faster,
> > > > with
> > > > > > > greater flexibility. IMHO it's a win-win.
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
> > > > > andrey.anshin@taragol.is
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Yesterday my friend shared with me that tool and I've been
> told
> > > > that
> > > > > > more
> > > > > > > > presumably it would be a niche tool. I've been told "who
> needs
> > > yet
> > > > > > > another
> > > > > > > > > installer which stands to resolve all your problems".
> > > > > > > > I guess I was wrong?
> > > > > > > >
> > > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <jarek@potiuk.com
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hey everyone,
> > > > > > > > >
> > > > > > > > > Few days ago the ruff creators have released a new tool uv
> -
> > > > which
> > > > > is
> > > > > > > an
> > > > > > > > > extremely fast (written in rust) and fully featured tool
> > > > generally
> > > > > > > fully
> > > > > > > > > compatible with `pip`.
> > > > > > > > >
> > > > > > > > > Blog post here: https://astral.sh/blog/uv
> > > > > > > > >
> > > > > > > > > It looks like It has a number of things that would make our
> > CI
> > > > > cases
> > > > > > > and
> > > > > > > > > tooling quite a bit faster and better including a few
> things
> > > > that I
> > > > > > > have
> > > > > > > > > implemented some workarounds for and some that I have not
> > > > > > > > > implemented because `pip` had no good solution.
> > > > > > > > >
> > > > > > > > > I looked at the docs and it solves some problems that are
> > > > currently
> > > > > > > > > difficult or impossible to handle with `pip`:
> > > > > > > > >
> > > > > > > > > * ability to use overrides (which are constraints on
> > steroids -
> > > > > > > allowing
> > > > > > > > to
> > > > > > > > > override limits specified by the packages - this will be
> very
> > > > > useful
> > > > > > to
> > > > > > > > > better handle our cases with "chicken-egg" providers (for
> > > example
> > > > > > like
> > > > > > > we
> > > > > > > > > had in FAB) where we have pre-release packages depending on
> > > each
> > > > > > other
> > > > > > > > >
> > > > > > > > > * different resolution strategies including
> > --resolution=lowest
> > > > > which
> > > > > > > > will
> > > > > > > > > finally allow us to see whether airflow's lower bounds are
> > > still
> > > > > > > holding
> > > > > > > > > (i.e. - will our test still pass if we use the lowest
> > supported
> > > > > > version
> > > > > > > > of
> > > > > > > > > our dependencies?  this is something i wanted to do for
> quite
> > > > some
> > > > > > time
> > > > > > > > and
> > > > > > > > > recorded an issue for that -
> > > > > > > > > https://github.com/apache/airflow/issues/35549
> > > > > > > > > but lack of tooling support made it a wish, with
> > > > > > `--resolution=lowest`
> > > > > > > it
> > > > > > > > > seems like super-easy thing to do.
> > > > > > > > >
> > > > > > > > > * It is said to be many, many times faster - with better
> > > caching
> > > > > and
> > > > > > > > > resolution speeds (similarly like with ruff they claim
> orders
> > > of
> > > > > > > > magnitude
> > > > > > > > > speedups in a number of cases). We can likely make very
> good
> > > use
> > > > of
> > > > > > it
> > > > > > > > and
> > > > > > > > > speed up some parts of our CI workflow significantly.
> > > > > > > > >
> > > > > > > > > I might likely do some experimenting with uv in our
> > toolchain,
> > > > but
> > > > > > > wanted
> > > > > > > > > to make sure we are all aware of it - and ask if someone
> has
> > > > > > something
> > > > > > > > > against it (and maybe someone would like to do some work
> > there
> > > > > trying
> > > > > > > it
> > > > > > > > > out - I will be happy to guide others with the dev/tooling
> > > > mindset
> > > > > > and
> > > > > > > > > incline to do some changes there/review PRs and cooperate
> on
> > > > > testing
> > > > > > > > those
> > > > > > > > > things.
> > > > > > > > >
> > > > > > > > > It's not a user-facing change, and I do not think we want
> to
> > > get
> > > > > rid
> > > > > > of
> > > > > > > > > `pip` as an installation tool in general (in our images and
> > > user
> > > > > > facing
> > > > > > > > > side) - it's mostly an internal CI tooling improvement I am
> > > > > thinking
> > > > > > > of.
> > > > > > > > > Maybe at some point in time we can recommend it also for
> > > > > development
> > > > > > > > > workflows, and maybe someday it will gain enough popularity
> > to
> > > > > think
> > > > > > > > about
> > > > > > > > > recommending it to our users, but definitely not now nor in
> > > even
> > > > > > > mid-term
> > > > > > > > > future.
> > > > > > > > >
> > > > > > > > > Let me know what you think.
> > > > > > > > >
> > > > > > > > > Repo here: https://github.com/astral-sh/uv
> > > > > > > > >
> > > > > > > > > J.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Hussein Awala <hu...@awala.fr>.
That's impressive! I love this tool, not only for reducing CI time but also
for saving the environment.
Some of the previous improvements were to further parallelize CI jobs to
complete the CI faster, but this tool will help reduce the overall time.

Big +1

On Sun, Feb 25, 2024 at 12:34 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hello here.
>
> I have a PR https://github.com/apache/airflow/pull/37683 that implements:
>
> * ability to choose either uv or pip when building our images
> * CI images are built with uv by default (but you can use the
> `--no-use-uv` flag to switch back to `pip`)
> * PROD images are built with pip by default (but you can use the
> `--use-uv` flag to switch to uv)
>
> The preliminary tests show indeed that uv not only has a much faster
> baseline, but  also their use of caching fits extremely well into our
> strategy of building images and we will get huge improvements of our CI
> build timing when using uv.
>
> Just for the context - our CI images when built are using a caching
> strategy to optimise for:
>
> 1) fast building when there are no changes (around 1 minute to build with
> pip),
> 2) slower building when someone adds or modifies a non-conflicting
> dependency (around 8 minutes to build, out of which ~ 6m is pip resolution
> and installation)
> 3) much longer build time when there are conflicting dependencies or when
> we change Dockerfile or scripts or when Python base image changes (around
> 27 minutes build out of which pip resolving is ~ 20m).
>
> Those are all `pip` numbers. Currently `pip` does not use resolution
> caching between the steps. Comparison of some basic installation steps from
> initial tests show that UV is way faster:
>
> * Resolving and Installing airflow with [devel-ci] (610 dependencies): pip
> ~ 6m, uv ~ 1m 30 s
> * Re-resolving and reinstalling [devel-ci] using local pyproject.toml; pip
> ~ 4m (cache is not used), uv ~ 4s (!!!!) - because cache is used in this
> case.
>
> I have not yet tested well (but I will once they happen) --eager upgrade of
> dependencies (pip - very much depends but it's often in the range of 10
> minutes) - I expect it not to take more than 2-3 minutes with uv
>
> So overall it looks like we are looking at those improvements:
>
> 1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
> (because we are using docker layer caching and pip resolution and
> installation is not used at all)
> 2) Updating dependencies: 8m with pip will probably go down with uv to
> ~ 3m 30s => 60% improvement, and in many cases ~ 2.5m when there are no
> remote changes and cache is used (70% improvement)
> 3) Re-resolving and reinstalling everything: 27m will probably go down with
> uv to ~ 9m => 67% improvement.
>
> If those numbers hold and the resolution quality will be comparable to
> `pip` - then well, it's definitely worth it - and the numbers are very
> close to what the `uv` authors claimed.
>
> I am impressed :)
>
> J.
>
>
>
> On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <am...@gmail.com>
> wrote:
>
> > I agree with Niko here.
> >
> > If someone is willing to give it a try, we should enable it
> experimentally
> > and give it a stint for a couple of weeks. If we see significant results,
> > we can adopt it.
> >
> > Thanks & Regards,
> > Amogh Desai
> >
> > On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko
> <onikolas@amazon.com.invalid
> > >
> > wrote:
> >
> > > The Astral folks also seem very focused on it being a drop-in/compliant
> > > replacement for pip. So I think it's definitely worth dropping it in
> and
> > > seeing if we get the expected performance improvements. If tests still
> > pass
> > > and user facing constraints and install instructions remain unchanged I
> > > don't see why not, if someone is willing to spend the time on it. Never
> > > mind the extra features it would give us (I, like others, am also very
> > > excited about --resolution=lowest, ability).
> > >
> > > ________________________________
> > > From: Andrey Anshin <an...@taragol.is>
> > > Sent: Tuesday, February 20, 2024 12:26:56 AM
> > > To: dev@airflow.apache.org
> > > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying
> > > out uv for our CI workflows
> > >
> > > CAUTION: This email originated from outside of the organization. Do not
> > > click links or open attachments unless you can confirm the sender and
> > know
> > > the content is safe.
> > >
> > >
> > >
> > > AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur
> externe.
> > > Ne cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous ne
> > pouvez
> > > pas confirmer l’identité de l’expéditeur et si vous n’êtes pas certain
> > que
> > > le contenu ne présente aucun risque.
> > >
> > >
> > >
> > > > I share Andrey's skepticism. It's just yet another tool which has an
> > > unclear
> > > development strategy.
> > >
> > > My point was more about a matter of presentation. If someone told you
> > "this
> > > is a new tool, like a killer of previous tools" then you might think
> > > "Yeah...yeah...yeah.. yet another replacement to tool X...  not really
> > > interesting". On the other hand if someone told you what in cases you
> > might
> > > solve, then this might be a mind changer.
> > >
> > > Especially the promising `--resolution=lowest` option. We always want
> to
> > > test something with minimal dependencies because we are not sure that
> it
> > > might work with pretty old dependencies, and recently I've started to
> > work
> > > on POC to collect minimal versions of the Airflow and Providers. And at
> > the
> > > moment when I almost finished it the uv was released. Well sometimes it
> > is
> > > better to wait a bit and maybe someone would invent the same
> > > solution 😁 and you don't have to spend a personal time.
> > >
> > > So as POC I'm on it, we still need a `pip` and validate some stuff by a
> > pip
> > > because it is only one officially supported way to install Airflow but
> if
> > > something could be improved in the CI then I'm on it, in most cases it
> > > would be behind of Breeze and many of the contributors might be even
> not
> > > noticed that something changed.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > > > Actually - if you read that blog post, the strategy is clear -
> > > > > they aim to create comprehensive packaging tooling, and the
> > > > > improvements are measured (80-100 times, they claim, when using
> > > > > caching - they (unlike pip) use a lot of local caching, including
> > > > > for resolving dependencies).
> > > >
> > > > So I think both arguments are not valid if you ask me.
> > > >
> > > > wt., 20 lut 2024, 02:37 użytkownik Alexander Shorin <
> kxepal@apache.org
> > >
> > > > napisał:
> > > >
> > > > > I share Andrey's skepticism. It's just yet another tool which has
> an
> > > > > unclear development strategy. Should you make it a free testing
> > suite?
> > > > What
> > > > > project would receive in exchange? A lot of words about being
> faster,
> > > but
> > > > > how much? Are these milliseconds worth to change the stable tool
> > with a
> > > > new
> > > > > one? And will it notably improve something?
> > > > >
> > > > > I think it's worth to try it just for fun and provide feedback, but
> > > it'll
> > > > > have to pass a long road to become such stable as pip.
> > > > >
> > > > > --
> > > > > ,,,^..^,,,
> > > > >
> > > > >
> > > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <ja...@potiuk.com>
> > wrote:
> > > > >
> > > > > > My opinion:
> > > > > >
> > > > > > I think there is a place for a number of such tools. For a long
> > time
> > > > the
> > > > > > packaging team and `pip` team have been working not only on `pip`
> > > > > > implementation but also (and most importantly) to make sure that
> > what
> > > > > `pip`
> > > > > > does is to be the beacon of standardisation of packaging APIs and
> > > PEPs.
> > > > > It
> > > > > > will never IMHO have a lot of the fancy features that other tools
> > > might
> > > > > > provide (like the ones I mentioned). It will always be there to
> > > provide
> > > > > the
> > > > > > robust and solid CLI to run all packaging things, but there are
> > > plenty
> > > > of
> > > > > > opportunities to provide improved or modified, or more (or less)
> > > > > > opinionated ways of doing things that are addressing some cases
> > that
> > > > > `pip`
> > > > > > team simply will not be able or willing to handle, preferring
> > "pure"
> > > > > > standard approach vs. implement all the optional things. For
> > example
> > > > the
> > > > > > way how pre-releases are handled can be improved to be more
> > > selective.
> > > > > The
> > > > > > PEP describing it gives the tools an option to add more fancy
> > > > behaviours
> > > > > > (some of which we could find useful in our CI tooling). Should
> > `pip`
> > > > > > implement those - I don't think so. It would distract maintainers
> > > from
> > > > > > other more important things. It is quite ok to use other tooling
> in
> > > > > places
> > > > > > like our CI, where they do some parts of the installation better.
> > > > > >
> > > > > > For me `pip` is going more into the direction of `usable
> reference
> > > > > > implementation of package installed` - any standard/ PEP will not
> > > > matter
> > > > > if
> > > > > > `pip` does not implement it. But others might go in different
> > > > directions
> > > > > > and implement some less popular features and do it better,
> faster,
> > > with
> > > > > > greater flexibility. IMHO it's a win-win.
> > > > > >
> > > > > > J.
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <
> > > > andrey.anshin@taragol.is
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Yesterday my friend shared with me that tool and I've been told
> > > that
> > > > > more
> > > > > > > presumably it would be a niche tool. I've been told "who needs
> > yet
> > > > > > another
> > > > > > > > installer which stands to resolve all your problems".
> > > > > > > I guess I was wrong?
> > > > > > >
> > > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <ja...@potiuk.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hey everyone,
> > > > > > > >
> > > > > > > > Few days ago the ruff creators have released a new tool uv -
> > > which
> > > > is
> > > > > > an
> > > > > > > > extremely fast (written in rust) and fully featured tool
> > > generally
> > > > > > fully
> > > > > > > > compatible with `pip`.
> > > > > > > >
> > > > > > > > Blog post here: https://astral.sh/blog/uv
> > > > > > > >
> > > > > > > > It looks like It has a number of things that would make our
> CI
> > > > cases
> > > > > > and
> > > > > > > > tooling quite a bit faster and better including a few things
> > > that I
> > > > > > have
> > > > > > > > implemented some workarounds for and some that I have not
> > > > > > > > implemented because `pip` had no good solution.
> > > > > > > >
> > > > > > > > I looked at the docs and it solves some problems that are
> > > currently
> > > > > > > > difficult or impossible to handle with `pip`:
> > > > > > > >
> > > > > > > > * ability to use overrides (which are constraints on
> steroids -
> > > > > > allowing
> > > > > > > to
> > > > > > > > override limits specified by the packages - this will be very
> > > > useful
> > > > > to
> > > > > > > > better handle our cases with "chicken-egg" providers (for
> > example
> > > > > like
> > > > > > we
> > > > > > > > had in FAB) where we have pre-release packages depending on
> > each
> > > > > other
> > > > > > > >
> > > > > > > > * different resolution strategies including
> --resolution=lowest
> > > > which
> > > > > > > will
> > > > > > > > finally allow us to see whether airflow's lower bounds are
> > still
> > > > > > holding
> > > > > > > > (i.e. - will our test still pass if we use the lowest
> supported
> > > > > version
> > > > > > > of
> > > > > > > > our dependencies?  this is something i wanted to do for quite
> > > some
> > > > > time
> > > > > > > and
> > > > > > > > recorded an issue for that -
> > > > > > > > https://github.com/apache/airflow/issues/35549
> > > > > > > > but lack of tooling support made it a wish, with
> > > > > `--resolution=lowest`
> > > > > > it
> > > > > > > > seems like super-easy thing to do.
> > > > > > > >
> > > > > > > > * It is said to be many, many times faster - with better
> > caching
> > > > and
> > > > > > > > resolution speeds (similarly like with ruff they claim orders
> > of
> > > > > > > magnitude
> > > > > > > > speedups in a number of cases). We can likely make very good
> > use
> > > of
> > > > > it
> > > > > > > and
> > > > > > > > speed up some parts of our CI workflow significantly.
> > > > > > > >
> > > > > > > > I might likely do some experimenting with uv in our
> toolchain,
> > > but
> > > > > > wanted
> > > > > > > > to make sure we are all aware of it - and ask if someone has
> > > > > something
> > > > > > > > against it (and maybe someone would like to do some work
> there
> > > > trying
> > > > > > it
> > > > > > > > out - I will be happy to guide others with the dev/tooling
> > > mindset
> > > > > and
> > > > > > > > incline to do some changes there/review PRs and cooperate on
> > > > testing
> > > > > > > those
> > > > > > > > things.
> > > > > > > >
> > > > > > > > It's not a user-facing change, and I do not think we want to
> > get
> > > > rid
> > > > > of
> > > > > > > > `pip` as an installation tool in general (in our images and
> > user
> > > > > facing
> > > > > > > > side) - it's mostly an internal CI tooling improvement I am
> > > > thinking
> > > > > > of.
> > > > > > > > Maybe at some point in time we can recommend it also for
> > > > development
> > > > > > > > workflows, and maybe someday it will gain enough popularity
> to
> > > > think
> > > > > > > about
> > > > > > > > recommending it to our users, but definitely not now nor in
> > even
> > > > > > mid-term
> > > > > > > > future.
> > > > > > > >
> > > > > > > > Let me know what you think.
> > > > > > > >
> > > > > > > > Repo here: https://github.com/astral-sh/uv
> > > > > > > >
> > > > > > > > J.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Hello here.

I have a PR https://github.com/apache/airflow/pull/37683 that implements:

* the ability to choose either uv or pip when building our images
* CI images are built with uv by default (but you can use the `--no-use-uv`
flag to switch back to `pip`)
* PROD images are built with pip by default (but you can use the `--use-uv`
flag to switch to uv)
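
To illustrate the pair of flags above, here is a minimal sketch of how such
a toggle could be wired up. It is not the code from the PR: the parser, the
`--image-type` option and the `install_command` helper are hypothetical
names made up for illustration. Only the `--use-uv` / `--no-use-uv` flag
names and the per-image defaults come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical option parser; not the real Breeze command line."""
    parser = argparse.ArgumentParser(description="Illustrative image build options")
    parser.add_argument("--image-type", choices=["ci", "prod"], default="ci")
    # BooleanOptionalAction (Python 3.9+) generates both --use-uv and --no-use-uv.
    # default=None lets us tell "flag not given" apart from an explicit choice.
    parser.add_argument("--use-uv", action=argparse.BooleanOptionalAction, default=None)
    return parser


def install_command(args: argparse.Namespace, packages: list[str]) -> list[str]:
    """Pick the installer: explicit flag wins, otherwise uv for CI and pip for PROD."""
    use_uv = args.use_uv if args.use_uv is not None else (args.image_type == "ci")
    base = ["uv", "pip", "install"] if use_uv else ["pip", "install"]
    return base + packages


if __name__ == "__main__":
    parser = build_parser()
    for argv in ([], ["--no-use-uv"], ["--image-type", "prod"], ["--image-type", "prod", "--use-uv"]):
        print(argv, "->", " ".join(install_command(parser.parse_args(argv), ["apache-airflow"])))
```

Keeping the default as "not set" means the same flag can carry different
defaults for CI and PROD images without duplicating the option.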

Preliminary tests indeed show that uv not only has a much faster
baseline, but that its use of caching also fits extremely well into our
strategy of building images, so we should get huge improvements in our CI
build times when using uv.

Just for context - our CI image builds use a caching strategy that is
optimised around three cases:

1) fast building when there are no changes (around 1 minute to build with
pip),
2) slower building when someone adds or modifies a non-conflicting dependency
(around 8 minutes to build, out of which ~ 6m is pip resolution and
installation),
3) much longer build time when there are conflicting dependencies, or when
we change the Dockerfile or scripts, or when the Python base image changes
(around 27 minutes to build, out of which pip resolving is ~ 20m).

Those are all `pip` numbers. Currently `pip` does not use resolution
caching between the steps. A comparison of some basic installation steps
from initial tests shows that uv is way faster:

* Resolving and installing airflow with [devel-ci] (610 dependencies): pip
~ 6m, uv ~ 1m 30s
* Re-resolving and reinstalling [devel-ci] using local pyproject.toml: pip
~ 4m (cache is not used), uv ~ 4s (!!!!) - because the cache is used in this
case.

I have not yet properly tested eager upgrades of dependencies (but I will
once they happen). With pip it very much depends, but it is often in the
range of 10 minutes; I expect it to take no more than 2-3 minutes with uv.

So overall, it looks like we can expect the following improvements:

1) Regular builds with no dependency changes: pip ~ 1m, uv ~ 1m
(because we are using docker layer caching, so resolution and
installation are not run at all)
2) Updating dependencies: 8m with pip will probably go down with uv to
~ 3m 30s => roughly 60% improvement, and in many cases to ~ 2.5m when there
are no remote changes and the cache is used (70% improvement)
3) Re-resolving and reinstalling everything: 27m will probably go down with
uv to ~ 9m => 67% improvement.
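
As a quick sanity check of the percentages above, the arithmetic can be
sketched as below. The timings are the approximate ones quoted in this
message, taking the uv estimate for dependency updates to be 3 minutes 30
seconds (8m minus ~ 6m of pip work plus ~ 1m 30s of uv work). The scenario
labels and the `improvement` helper are made up for illustration, and the
figures in the list above are rounded.

```python
def improvement(before_min: float, after_min: float) -> float:
    """Percentage of build time saved when going from before_min to after_min."""
    return (before_min - after_min) / before_min * 100


# (pip minutes, estimated uv minutes) taken from the approximate numbers above.
scenarios = {
    "dependency update": (8.0, 3.5),
    "dependency update, cache used": (8.0, 2.5),
    "full re-resolve and reinstall": (27.0, 9.0),
}

if __name__ == "__main__":
    for name, (pip_min, uv_min) in scenarios.items():
        saved = improvement(pip_min, uv_min)
        print(f"{name}: {pip_min:g}m -> {uv_min:g}m ({saved:.0f}% less build time)")
```

This gives about 56%, 69% and 67% respectively, which is in line with the
rounded 60% / 70% / 67% estimates.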

If those numbers hold and the resolution quality is comparable to
`pip` - then well, it's definitely worth it - and the numbers are very
close to what the `uv` authors claimed.

I am impressed :)

J.




Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Amogh Desai <am...@gmail.com>.
I agree with Niko here.

If someone is willing to give it a try, we should enable it experimentally
and give it a stint for a couple of weeks. If we see significant results,
we can adopt it.

Thanks & Regards,
Amogh Desai


Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by "Oliveira, Niko" <on...@amazon.com.INVALID>.
The Astral folks also seem very focused on it being a drop-in/compliant replacement for pip. So I think it's definitely worth dropping it in and seeing if we get the expected performance improvements. If tests still pass and user-facing constraints and install instructions remain unchanged, I don't see why not, if someone is willing to spend the time on it. Never mind the extra features it would give us (I, like others, am also very excited about the --resolution=lowest ability).
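
For reference, a lower-bounds check with `--resolution=lowest` could be driven by something like the sketch below. It assumes uv's pip-compatible `uv pip compile` command with the `--resolution` and `--output-file` options; the `lowest_resolution_command` helper and the file names are hypothetical and only illustrate the idea, they are not part of Airflow's tooling.

```python
import shlex


def lowest_resolution_command(requirements_file: str, output_file: str) -> list[str]:
    """Build a command that pins every dependency to the lowest version allowed.

    Assumption: uv exposes `uv pip compile` with `--resolution=lowest`.
    """
    return [
        "uv", "pip", "compile", requirements_file,
        "--resolution=lowest",
        "--output-file", output_file,
    ]


if __name__ == "__main__":
    # File names are placeholders; running the command itself requires uv to be installed.
    command = lowest_resolution_command("pyproject.toml", "lowest-requirements.txt")
    print(shlex.join(command))
```

Installing from the resulting pinned file and running the test suite would then show whether the declared lower bounds still hold.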

________________________________
From: Andrey Anshin <an...@taragol.is>
Sent: Tuesday, February 20, 2024 12:26:56 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying out uv for our CI workflows

> I share Andrey's skepticism. It's just yet another tool which has an unclear
development strategy.

My point was more about a matter of presentation. If someone tells you "this
is a new tool, a killer of the previous tools" then you might think
"Yeah...yeah...yeah.. yet another replacement for tool X... not really
interesting". On the other hand, if someone tells you which cases it might
solve, then this might change your mind.

Especially the promising `--resolution=lowest` option. We always want to
test things with minimal dependencies because we are not sure that they
still work with fairly old dependencies, and recently I started to work
on a POC to collect the minimal versions for Airflow and the providers. And
just as I had almost finished it, uv was released. Well, sometimes it is
better to wait a bit, and maybe someone will invent the same
solution 😁 so you don't have to spend your personal time.

So as a POC, I'm on it. We still need `pip`, and need to validate some things
with it, because it is the only officially supported way to install Airflow,
but if something can be improved in the CI then I'm on it. In most cases it
would sit behind Breeze, and many contributors might not even
notice that something changed.








On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com> wrote:

> Actually - of you read that blog post, the strategy is clear - they aim to
> create a comprehensive packaging tooling and improvnts are measured (80-100
> times they claim - I using caching - they (unlike pip) use a lot of local
> caching including resolving  dependencies).
>
> So I think both arguments are not valid if you ask me.
>
> wt., 20 lut 2024, 02:37 użytkownik Alexander Shorin <kx...@apache.org>
> napisał:
>
> > I share Andrey's skepticism. It's just yet another tool which has an
> > unclear development strategy. Should you make it a free testing suite?
> What
> > project would receive in exchange? A lot of words about being faster, but
> > how much? Are these milliseconds worth to change the stable tool with a
> new
> > one? And will it notably improve something?
> >
> > I think it's worth to try it just for fun and provide feedback, but it'll
> > have to pass a long road to become such stable as pip.
> >
> > --
> > ,,,^..^,,,
> >
> >
> > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > My opinion:
> > >
> > > I think there is a place for a number of such tools. For a long time
> the
> > > packaging team and `pip` team have been working not only on `pip`
> > > implementation but also (and most importantly) to make sure that what
> > `pip`
> > > does is to be the beacon of standardisation of packaging APIs and PEPs.
> > It
> > > will never IMHO have a lot of the fancy features that other tools might
> > > provide (like the ones I mentioned). It will always be there to provide
> > the
> > > robust and solid CLI to run all packaging things, but there are plenty
> of
> > > opportunities to provide improved or modified, or more (or less)
> > > opinionated ways of doing things that are addressing some cases that
> > `pip`
> > > team simply will not be able or willing to handle, preferring "pure"
> > > standard approach vs. implement all the optional things. For example
> the
> > > way how pre-releases are handled can be improved to be more selective.
> > The
> > > PEP describing it gives the tools an option to add more fancy
> behaviours
> > > (some of which we could find useful in our CI tooling). Should `pip`
> > > implement those - I don't think so. It would distract maintainers from
> > > other more important things. It is quite ok to use other tooling in
> > places
> > > like our CI, where they do some parts of the installation better.
> > >
> > > For me `pip` is going more into the direction of `usable reference
> > > implementation of package installed` - any standard/ PEP will not
> matter
> > if
> > > `pip` does not implement it. But others might go in different
> directions
> > > and implement some less popular features and do it better, faster, with
> > > greater flexibility. IMHO it's a win-win.
> > >
> > > J.
> > >
> > >

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Andrey Anshin <an...@taragol.is>.
> I share Andrey's skepticism. It's just yet another tool which has an unclear
> development strategy.

My point was more about presentation. If someone tells you "this is a new
tool, a killer of the previous tools", you might think
"Yeah... yeah... yeah... yet another replacement for tool X... not really
interesting". On the other hand, if someone tells you which cases it could
solve for you, that might change your mind.

The `--resolution=lowest` option looks especially promising. We have always
wanted to test with minimal dependencies, because we are not sure that
everything works with fairly old dependencies, and I recently started working
on a POC to collect the minimal versions for Airflow and the providers. Just
as I had almost finished it, uv was released. Well, sometimes it is
better to wait a bit: maybe someone else will invent the same
solution 😁 and you won't have to spend your personal time.
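
To make the idea concrete, here is a minimal Python sketch of what a "lowest"
resolution means for direct dependencies. It is only an illustration under
stated assumptions: the package names, the version lists and the
`lowest_pins` helper are made up, versions are plain dotted numbers, and
transitive dependencies (which a real resolver such as uv has to handle) are
ignored.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a simple dotted version like '2.10.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def lowest_pins(
    lower_bounds: Dict[str, str], available: Dict[str, List[str]]
) -> Dict[str, str]:
    """For each package, pick the oldest release that still satisfies
    its declared '>=' lower bound."""
    pins = {}
    for name, bound in lower_bounds.items():
        candidates = [
            v for v in available[name] if parse_version(v) >= parse_version(bound)
        ]
        if not candidates:
            raise ValueError(f"no release of {name} satisfies >={bound}")
        pins[name] = min(candidates, key=parse_version)
    return pins


# Hypothetical data, for illustration only.
AVAILABLE = {
    "example-lib": ["1.0.0", "1.2.0", "1.10.0", "2.0.0"],
    "other-lib": ["0.9", "0.10", "0.11"],
}
LOWER_BOUNDS = {"example-lib": "1.2.0", "other-lib": "0.10"}

PINS = lowest_pins(LOWER_BOUNDS, AVAILABLE)
for name, version in sorted(PINS.items()):
    print(f"{name}=={version}")
```

Running the test suite against pins chosen this way is what would tell us
whether the declared lower bounds still hold. Comparing versions as tuples
rather than strings matters here, since as strings "0.9" would sort after
"0.10".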

So as a POC, I'm on board. We still need `pip`, and we still need to validate
some things with `pip`, because it is the only officially supported way to
install Airflow. But if something can be improved in the CI, I'm in: in most
cases it would be hidden behind Breeze, and many contributors might not even
notice that something changed.

On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <ja...@potiuk.com> wrote:

> Actually, if you read that blog post, the strategy is clear: they aim to
> create comprehensive packaging tooling, and the improvements are measured
> (they claim 80-100 times faster when using caching; unlike pip, they use a
> lot of local caching, including for dependency resolution).
>
> So, if you ask me, neither argument is valid.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Pierre Jeambrun <pi...@gmail.com>.
I can definitely see the benefits, and I would be in favour of using it in our
CI jobs, allowing for a faster and more robust pipeline (alternative
resolution strategies, etc.).

As Jarek mentioned, it is not meant to be a user-facing change, just an
internal one to make things easier. We have cases (SBOM generation for
providers, building all the Airflow images, and such) where pip's slowness is
the main reason such jobs take minutes, or hours when we want to
catch up on old versions.

I don't see any harm in using this as an extra dev tool. We just need to make
sure that constraint files and other “user” assets are not impacted (as
planned, I assume).
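
One way to check that would be to compare the pins in a constraints file
produced with `pip` against one produced with uv. Below is a minimal Python
sketch, assuming both files contain only simple `name==version` lines. The
`parse_pins` and `diff_pins` helpers and the sample pins are hypothetical and
not part of Airflow's tooling.

```python
from typing import Dict, Tuple


def parse_pins(text: str) -> Dict[str, str]:
    """Parse 'name==version' lines, skipping blank lines and '#' comments."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(old: str, new: str) -> Dict[str, Tuple[str, str]]:
    """Return {package: (old_version, new_version)} for every pin that differs.
    A package missing on one side is reported with '' for that side."""
    old_pins, new_pins = parse_pins(old), parse_pins(new)
    return {
        name: (old_pins.get(name, ""), new_pins.get(name, ""))
        for name in sorted(set(old_pins) | set(new_pins))
        if old_pins.get(name) != new_pins.get(name)
    }


# Made-up sample data, for illustration only.
PIP_CONSTRAINTS = """
# generated with pip
example-lib==1.2.0
other-lib==0.10
"""
UV_CONSTRAINTS = """
example-lib==1.2.0
other-lib==0.11
"""

DIFF = diff_pins(PIP_CONSTRAINTS, UV_CONSTRAINTS)
print(DIFF or "constraints are identical")
```

An empty result would mean the user-facing constraints are unchanged no
matter which tool produced them in CI.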

On Tue 20 Feb 2024 at 06:56, Jarek Potiuk <ja...@potiuk.com> wrote:

> Actually, if you read that blog post, the strategy is clear: they aim to
> create comprehensive packaging tooling, and the improvements are measured
> (they claim 80-100 times faster when using caching; unlike pip, they use a
> lot of local caching, including for dependency resolution).
>
> So, if you ask me, neither argument is valid.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
Actually, if you read that blog post, the strategy is clear: they aim to
create comprehensive packaging tooling, and the improvements are measured
(they claim 80-100 times faster when using caching; unlike pip, they use a
lot of local caching, including for dependency resolution).

So, if you ask me, neither argument is valid.

On Tue, 20 Feb 2024, 02:37 Alexander Shorin <kx...@apache.org> wrote:

> I share Andrey's skepticism. It's just yet another tool with an
> unclear development strategy. Should you turn the project into a free test
> suite for it? What would the project receive in exchange? There are a lot of
> words about being faster, but by how much? Are those milliseconds worth
> replacing a stable tool with a new one? And will it notably improve anything?
>
> I think it's worth trying just for fun and providing feedback, but it
> has a long road ahead before it becomes as stable as pip.
>
> --
> ,,,^..^,,,

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Amogh Desai <am...@gmail.com>.
I don't have any strong objections to any of the decisions taken here,
solely because this won't be a user-facing change.

On Tue, Feb 20, 2024 at 7:07 AM Alexander Shorin <kx...@apache.org> wrote:

> I share Andrey's skepticism. It's just yet another tool with an
> unclear development strategy. Should you turn the project into a free test
> suite for it? What would the project receive in exchange? There are a lot of
> words about being faster, but by how much? Are those milliseconds worth
> replacing a stable tool with a new one? And will it notably improve anything?
>
> I think it's worth trying just for fun and providing feedback, but it
> has a long road ahead before it becomes as stable as pip.
>
> --
> ,,,^..^,,,

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Alexander Shorin <kx...@apache.org>.
I share Andrey's skepticism. It's just yet another tool with an
unclear development strategy. Should you turn the project into a free test
suite for it? What would the project receive in exchange? There are a lot of
words about being faster, but by how much? Are those milliseconds worth
replacing a stable tool with a new one? And will it notably improve anything?

I think it's worth trying just for fun and providing feedback, but it
has a long road ahead before it becomes as stable as pip.

--
,,,^..^,,,


On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> My opinion:
>
> I think there is a place for a number of such tools. For a long time the
> packaging team and the `pip` team have been working not only on the `pip`
> implementation but also (and most importantly) on making sure that `pip`
> is the beacon of standardisation for packaging APIs and PEPs. It
> will never, IMHO, have a lot of the fancy features that other tools might
> provide (like the ones I mentioned). It will always be there to provide a
> robust and solid CLI to run all packaging things, but there are plenty of
> opportunities to provide improved, modified, or more (or less)
> opinionated ways of doing things that address cases the `pip`
> team simply will not be able or willing to handle, preferring a "pure"
> standard approach over implementing all the optional things. For example, the
> way pre-releases are handled can be improved to be more selective. The
> PEP describing it gives tools the option to add fancier behaviours
> (some of which we could find useful in our CI tooling). Should `pip`
> implement those? I don't think so. It would distract the maintainers from
> other, more important things. It is quite OK to use other tooling in places
> like our CI, where it does some parts of the installation better.
>
> For me, `pip` is heading in the direction of a "usable reference
> implementation of a package installer": any standard/PEP will not matter if
> `pip` does not implement it. But others might go in different directions
> and implement some less popular features and do it better, faster, and with
> greater flexibility. IMHO it's a win-win.
>
> J.

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Jarek Potiuk <ja...@potiuk.com>.
My opinion:

I think there is a place for a number of such tools. For a long time the
packaging team and the `pip` team have been working not only on the `pip`
implementation but also (and most importantly) on making sure that `pip`
is the beacon of standardisation for packaging APIs and PEPs. IMHO it will
never have a lot of the fancy features that other tools might provide (like
the ones I mentioned). It will always be there to provide a robust and solid
CLI for all packaging tasks, but there are plenty of opportunities to provide
improved, modified, or more (or less) opinionated ways of doing things that
address cases the `pip` team simply will not be able or willing to handle,
preferring a "pure" standard approach over implementing all the optional
things. For example, the way pre-releases are handled can be made more
selective. The PEP describing it gives tools the option to add fancier
behaviours (some of which we could find useful in our CI tooling). Should
`pip` implement those? I don't think so. It would distract the maintainers
from other, more important things. It is quite ok to use other tooling in
places like our CI, where it does some parts of the installation better.
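
To make "more selective" a bit more concrete, here is a minimal sketch of
a per-package opt-in for pre-releases. It is only an illustration under
stated assumptions, not how `pip` or uv implement it: the helper names, the
allow-list, the package names and the versions are all hypothetical, and the
version check is a simplified stand-in for real PEP 440 parsing.

```python
import re

# Simplified stand-in for PEP 440 parsing: recognises only the canonical
# aN / bN / rcN and .devN spellings (real tools handle many more forms).
_PRE_RELEASE = re.compile(r"^\d+(?:\.\d+)*(?:(?:a|b|rc)\d+|\.dev\d+)")


def is_prerelease(version: str) -> bool:
    """Return True if the version string looks like a pre-release."""
    return _PRE_RELEASE.match(version) is not None


def selectable_versions(package, versions, allow_prereleases_for):
    """Keep pre-releases only for packages that explicitly opted in.

    `allow_prereleases_for` is a hypothetical per-package allow-list.
    """
    if package in allow_prereleases_for:
        return list(versions)
    return [v for v in versions if not is_prerelease(v)]


# Hypothetical package names and versions, for illustration only.
available = ["1.0.0", "1.1.0rc1", "1.1.0.dev3"]
allowed = {"example-provider"}

print(selectable_versions("example-provider", available, allowed))
print(selectable_versions("other-package", available, allowed))
```

The point of the sketch is that the opt-in is scoped to one package, so a
pre-release of one provider can be picked up without opening the door to
pre-releases of every other dependency.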

For me `pip` is moving more in the direction of a `usable reference
implementation of a package installer` - no standard or PEP will matter if
`pip` does not implement it. But others might go in different directions
and implement some less popular features and do it better, faster, with
greater flexibility. IMHO it's a win-win.

J.


On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <an...@taragol.is>
wrote:

> Yesterday my friend shared with me that tool and I've been told that more
> presumably it would be a niche tool. I've been told "who needs yet another
> installer which stands to resolve all your problems".
> I guess I was wrong?

Re: [DISCUSS] Considering trying out uv for our CI workflows

Posted by Andrey Anshin <an...@taragol.is>.
Yesterday a friend shared that tool with me, and I told him that it would
most likely be a niche tool. I said "who needs yet another installer which
claims to resolve all your problems".
I guess I was wrong?
