Posted to dev@airflow.apache.org by Kristopher Kane <kk...@gmail.com> on 2023/03/24 15:31:50 UTC

Updating Provider dependencies for development.

While waiting on an adjacent PR
(https://github.com/apache/airflow/pull/30067) to holistically update
the Google Cloud provider deps, I'm attempting to update the Google
Cloud provider.yaml to upgrade google-cloud-dataproc from 5.0.0 to
5.4.0 so that I can test an enhancement to the Dataproc operators. I
started from that PR's provider.yaml in my branch.

I cannot get a local virtualenv to install any version of
google-cloud-dataproc other than the currently resolved 5.0.0, even
though I pin it to '==5.4.0' in provider.yaml.

Here is what I have tried, to no avail:
1) Pinned "google-cloud-dataproc==5.4.0" in the Google Cloud provider.yaml.
2) Installed without a constraints file, to rule that out, by running:
INSTALL_PROVIDERS_FROM_SOURCES="true" pip install -U -e ".[devel,google,postgres]"
3) Confirmed that generated/provider_dependencies.json does contain
"google-cloud-dataproc==5.4.0".

This technique does install successfully without the constraints file
but I still only get version 5.0.0 in 'pip list'.  Is there somewhere
I should be looking other than the provider's provider.yaml?
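
The mismatch described here can also be checked from Python instead of
reading 'pip list'. The following is a minimal sketch using only the
standard library's importlib.metadata; check_pin and its lookup
parameter are hypothetical helpers introduced for illustration and are
not part of Airflow.

```python
from importlib import metadata
from typing import Callable, Optional


def installed_version(
    package: str, lookup: Callable[[str], str] = metadata.version
) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return lookup(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(
    package: str, expected: str, lookup: Callable[[str], str] = metadata.version
) -> str:
    """Compare the installed version of `package` with the expected pin."""
    found = installed_version(package, lookup)
    if found is None:
        return f"{package} is not installed"
    if found == expected:
        return f"{package} {found} matches the pin"
    return f"{package} is {found}, expected {expected}"


if __name__ == "__main__":
    # Reports on whatever is installed in the currently active environment.
    print(check_pin("google-cloud-dataproc", "5.4.0"))
```

The lookup parameter exists only so the comparison can be exercised
without the package being installed; in a real virtualenv the default
is enough.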

Thanks,

Kris

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@airflow.apache.org
For additional commands, e-mail: dev-help@airflow.apache.org


Re: Updating Provider dependencies for development.

Posted by Kristopher Kane <kk...@gmail.com>.
Awesome, thanks. I must have coincidentally run it with Breeze, since
I did see the generated provider dependencies updated, but I must
never have gone back to re-run the pip install. I now have a working
environment to run with.

Kris



Re: Updating Provider dependencies for development.

Posted by Jarek Potiuk <ja...@potiuk.com>.
The right pre-commit hook name is "update-providers-dependencies" BTW
(the same ID that the breeze command uses), not
"generate-provider-dependencies".



Re: Updating Provider dependencies for development.

Posted by Jarek Potiuk <ja...@potiuk.com>.
provider.yaml is the source of truth, but when you install the
application locally via setup.py after changing it, you need to
regenerate "generated/provider_dependencies.json": this is the file
that setup.py reads when you install via ".[extras]". pre-commit
updates this file automatically when you commit your code, and you can
also run the hook directly:

pre-commit run generate-provider-dependencies --all-files

or if you use breeze (this one supports autocompletion):

breeze static-checks --type update-providers-dependencies --all-files

You can also update it manually if you wish: it is a plain JSON file
and easy to edit.
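
For a quick check that the regenerated (or hand-edited) file carries
the pin you expect, something like the following could be used. It is
only a sketch: it assumes the JSON maps each provider id to an object
with a "deps" list of requirement strings, and SAMPLE and
find_requirement are hypothetical names made up for the illustration,
so adjust it if the real file is laid out differently.

```python
import json
import re
from typing import Optional

# Hypothetical excerpt. The layout (provider id -> {"deps": [...]}) is an
# assumption about generated/provider_dependencies.json, not a confirmed schema.
SAMPLE = """
{
  "google": {
    "deps": [
      "apache-airflow>=2.3.0",
      "google-cloud-dataproc-metastore>=1.2.0",
      "google-cloud-dataproc==5.4.0"
    ],
    "cross-providers-deps": []
  }
}
"""

# Leading distribution name of a requirement string, e.g. "foo-bar" in "foo-bar>=1.0".
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def _normalize(name: str) -> str:
    """Normalize a distribution name so that '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_requirement(data: dict, provider: str, package: str) -> Optional[str]:
    """Return the requirement string for `package` under `provider`, if any."""
    wanted = _normalize(package)
    for requirement in data.get(provider, {}).get("deps", []):
        match = _NAME.match(requirement.strip())
        if match and _normalize(match.group(0)) == wanted:
            return requirement
    return None


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    print(find_requirement(data, "google", "google-cloud-dataproc"))
```

Against a real checkout you would load
generated/provider_dependencies.json with json.load instead of parsing
SAMPLE. The name is matched exactly, so a pin on
google-cloud-dataproc-metastore is not mistaken for one on
google-cloud-dataproc.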

In addition, to check whether your changes result in a consistent,
conflict-free set of dependencies, you can run this command, which has
pip attempt to upgrade to the latest set of dependencies that satisfy
all the limits:

breeze ci-image build --upgrade-to-newer-dependencies

This attempts to build the CI image with the latest versions of all
dependencies that satisfy the limits from every provider.yaml (as
converted to generated/provider_dependencies.json). Once the build
succeeds, you should be able to enter the image in an interactive
shell session with the `breeze` or `breeze shell` command.

The build might not succeed, though, if there are unresolvable
conflicts with other libraries. Google's work to upgrade some old
dependencies is what is needed to resolve some of them, especially the
protobuf conflicts.

J.
