Posted to dev@airflow.apache.org by Jarek Potiuk <ja...@potiuk.com> on 2023/07/21 05:22:28 UTC

[VOTE] Make (soon coming) dask provider preinstalled

Q: Do we want to pre-install the Dask provider (with the Dask executor)
in Airflow 2.7.0 once it is separated?

Discussion was here:
https://lists.apache.org/thread/0d9x4kl7hc2qzvho2mbdf35ohn5w12l6

Please vote:

* +1 -> yes, we want to have dask provider preinstalled
* -1 -> no, it's fine to make it optional
* 0 -> no opinion

Consider it my -1: I think we should NOT preinstall Dask provider.

Voting guidelines are linked below. This is really a "procedural" matter
rather than a code modification:

> Votes on procedural issues follow the common format of majority rule unless otherwise stated. That is, if there are more favourable votes than unfavourable ones, the issue is considered to have passed -- regardless of the number of votes in each category. (If the number of votes seems too small to be representative of a community consensus, the issue is typically not pursued.

In this case committers have binding votes but other community members
are encouraged to state their non-binding votes as well.

https://www.apache.org/foundation/voting.html

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@airflow.apache.org
For additional commands, e-mail: dev-help@airflow.apache.org


Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Jarek Potiuk <ja...@potiuk.com>.
Let me cancel the vote and start it for all of them.

As discussed in the separate thread, all three of the "to-be-moved"
executors already have their own extras (and our docs specify that
they are needed for the executors), and those extras are needed anyway
(they bring in the right dependencies, and quite a lot of them).
For example, if someone previously used the Celery Executor and did
NOT use the "celery" extra (same with K8s), it would have to be a
really special way of installing Airflow (install airflow + install
all celery deps individually). This is already custom enough and
"special enough" to fall squarely under https://www.hyrumslaw.com/.
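
To illustrate why the extra matters, here is a minimal sketch (not pip's
actual resolver) of how extra-conditioned requirements get selected. The
package names and version bounds are hypothetical; only the
`extra == "..."` marker form follows the standard packaging metadata.

```python
import re

# Hypothetical Requires-Dist entries in the style of package metadata.
# The names and versions are made up for illustration only.
REQUIRES_DIST = [
    "example-core-lib>=1.0",
    'example-celery-dep>=5.0; extra == "celery"',
    'example-broker-dep; extra == "celery"',
    'example-dask-dep>=2.0; extra == "dask"',
]

# Simplification: only a bare `extra == "name"` marker is recognised here.
# Real environment markers can be more complex than this.
_EXTRA_RE = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def resolve(requires_dist, requested_extras=()):
    """Return the requirements that apply for the requested extras."""
    selected = []
    for entry in requires_dist:
        requirement, _, marker = entry.partition(";")
        match = _EXTRA_RE.search(marker)
        # Unconditional requirements always apply and cannot be opted out of;
        # conditional ones apply only when their extra was requested.
        if match is None or match.group(1) in requested_extras:
            selected.append(requirement.strip())
    return selected


print(resolve(REQUIRES_DIST))
print(resolve(REQUIRES_DIST, {"celery"}))
```

With no extras requested only the unconditional requirement is selected;
requesting "celery" adds the celery-specific entries, which is what someone
skipping the extra would otherwise have to install by hand.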

J.

On Fri, Jul 21, 2023 at 7:58 PM Oliveira, Niko
<on...@amazon.com.invalid> wrote:
>
> -1 (binding)
>
> I think the eventual goal is all 3rd party executors (excluding Local, Sequential, etc) are not pre-installed. I think it will take a while for us to get there with Celery and K8s but it's the right thing to shoot for and we should start with Dask.
>



Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by "Oliveira, Niko" <on...@amazon.com.INVALID>.
-1 (binding)

I think the eventual goal is for all 3rd party executors (excluding Local, Sequential, etc.) not to be pre-installed. I think it will take a while for us to get there with Celery and K8s, but it's the right thing to shoot for and we should start with Dask.

________________________________
From: Collin McNulty <co...@astronomer.io.INVALID>
Sent: Friday, July 21, 2023 8:37:38 AM
To: dev@airflow.apache.org
Subject: RE: [EXTERNAL][VOTE] Make (soon coming) dask provider preinstalled


-1 (non-binding)
I agree with Ash’s reasoning.
--

Collin McNulty
Director of Global Support

Email: collin@astronomer.io <jo...@astronomer.io>
Time zone: US Central (CST UTC-6 / CDT UTC-5)


<https://www.astronomer.io/>

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Collin McNulty <co...@astronomer.io.INVALID>.
-1 (non-binding)
I agree with Ash’s reasoning.
-- 

Collin McNulty
Director of Global Support

Email: collin@astronomer.io <jo...@astronomer.io>
Time zone: US Central (CST UTC-6 / CDT UTC-5)


<https://www.astronomer.io/>

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Jed Cunningham <je...@apache.org>.
-1 binding

(btw Vincent, your vote is binding in this vote 🎉)

On Fri, Jul 21, 2023 at 7:52 AM Beck, Vincent <vi...@amazon.com.invalid>
wrote:

> -1 (non-binding)
>

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by "Beck, Vincent" <vi...@amazon.com.INVALID>.
-1 (non-binding)

On 2023-07-21, 5:46 AM, "Michał Modras" <michalmodras@google.com.INVALID> wrote:



-1 (non-binding)


Dask is more niche than Celery or K8s - let's err on the side of making
Airflow core dependencies more light-weight and not preinstall it.






Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Michał Modras <mi...@google.com.INVALID>.
-1 (non-binding)

Dask is more niche than Celery or K8s - let's err on the side of making
Airflow core dependencies more light-weight and not preinstall it.

On Fri, Jul 21, 2023 at 10:57 AM Pankaj Koti
<pa...@astronomer.io.invalid> wrote:

> 0 (non-binding)
>
> I agree with most of what has been discussed in the thread
> https://lists.apache.org/thread/0d9x4kl7hc2qzvho2mbdf35ohn5w12l6
> and this email thread.
>
> I did a check on Airflow discussions
> (https://github.com/apache/airflow/discussions?discussions_q=dask+)
> and Airflow issues
> (https://github.com/apache/airflow/issues?q=is%3Aissue+dask+)
> and, upon reading their descriptions, I see that there are some users using
> the Dask executor, although I personally have never tried it.
>
> If we decide not to pre-install it, then in my opinion a detailed, top-level
> news fragment / changelog entry would help here, telling users to install
> the provider separately if they want to continue using it.
>
> Regards,
>
>
>
> Pankaj Koti
>
> *Senior Software Engineer, *OSS Engineering Team.
> Location: Pune, India
>
> Timezone: Indian Standard Time (IST)
>
> Email: pankaj.koti@astronomer.io
>
> Mobile: +91 9730079985
>
>
> On Fri, Jul 21, 2023 at 2:20 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > > One possible solution could be allowing users to instruct Airflow not
> > > to pre-install executors providers by passing an argument via
> > > `--install-option` during pip install, and then install them explicitly
> > > by providing the extras.
> >
> > Yes it would be nice. Unfortunately the dependency mechanisms of
> > Python (and `pip`) are not as flexible. You cannot individually remove
> > requirements, and having requirements is the only way you can have
> > "install also that other package" feature (extras is the mechanism to
> > install optional dependencies, but it's not something we should ask
> > regular users to do).
> >
> >
> >
> > On Fri, Jul 21, 2023 at 10:46 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > And just to clarify: The voting will last till Tuesday 25 July 2023
> > > 11am CST.
> > >
> > > On Fri, Jul 21, 2023 at 10:43 AM Jarek Potiuk <ja...@potiuk.com>
> > > wrote:
> > > >
> > > > Clarification: Ash is right (I totally forgot about it).
> > > >
> > > > We do have dask in
> > > > https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html#core-airflow-extras
> > > > and it even explicitly mentions it:
> > > >
> > > > * "dask pip install 'apache-airflow[dask]' DaskExecutor"
> > > >
> > > > So technically speaking you **should** use the dask extra to install
> > > > the distributed, dask and cloudpickle packages - all needed to run the
> > > > dask executor. And we will keep it this way - it will just move from
> > > > "core extras" to "providers". Actually it will move to "deprecated" and
> > > > we will alias "dask" to the new "daskexecutor", because the name
> > > > "apache-airflow-providers-dask" is not allowed by PyPI (I guess the
> > > > -dask suffix is considered potentially harmful for typosquatting), so
> > > > I reserved "apache-airflow-providers-daskexecutor".
> > > >
> > > > J.
> > > >
> > > >
> > > >
> > > > On Fri, Jul 21, 2023 at 10:34 AM Hussein Awala <hu...@awala.fr>
> > > > wrote:
> > > > >
> > > > > > based on the premise that you had to install the `dask` extra in
> > > > > > the first place to get dask module of the right version
> > > > >
> > > > > After reviewing the Dask Executor
> > > > > <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/dask.html>
> > > > > section in Airflow documentation, I didn't find any requirement or
> > > > > recommendation to add the `dask` extra when using it. So, including
> > > > > the dependencies explicitly in the requirements file should suffice
> > > > > to use the executor. This means that this change might potentially
> > > > > break existing setups for some users when they upgrade Airflow
> > > > > version, which contradicts the Airflow deprecation policy.
> > > > >
> > > > > If the premise holds true for Dask, it should also apply to Celery
> > > > > and Kubernetes. Nevertheless, we decided to pre-install their
> > > > > providers after the migration.
> > > > >
> > > > > However, this executor isn't as popular as the others, and as an
> > > > > Airflow user, I prefer not to have Dask dependencies in my
> > > > > environment when I install Airflow with the Celery executor. This
> > > > > helps avoid conflicts and enables me to upgrade libraries
> > > > > independently from their constraints. One possible solution could be
> > > > > allowing users to instruct Airflow not to pre-install executors
> > > > > providers by passing an argument via `--install-option` during pip
> > > > > install, and then install them explicitly by providing the extras.
> > > > >
> > > > > Since the flexibility of our deprecation policy isn't entirely clear
> > > > > to me, I'm neutral on this matter, with a vote of +0.
> > > > >
> > > > > On Fri, Jul 21, 2023 at 9:10 AM Elad Kalif <el...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > I agree with Ash
> > > > > > -1 as well.
> > > > > >
> > > > > >
> > > > > > On Fri, Jul 21, 2023 at 9:29 AM Ash Berlin-Taylor <ash@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > -1 - based on the premise that you had to install the `dask`
> > > > > > > extra in the first place to get dask module of the right version,
> > > > > > > so if we make the existing extra depend on the new provider then
> > > > > > > it's good enough.
> > > > > > >
> > > > > > > On 21 July 2023 06:22:28 BST, Jarek Potiuk <ja...@potiuk.com>
> > > > > > > wrote:
> > > > > > > >Q: Do we want to pre-install Dask provider in Airflow 2.7.0
> > > > > > > >(with Dask executor) once separated ?
> > > > > > > >
> > > > > > > >Discussion was here:
> > > > > > > >https://lists.apache.org/thread/0d9x4kl7hc2qzvho2mbdf35ohn5w12l6
> > > > > > > >
> > > > > > > >Please vote:
> > > > > > > >
> > > > > > > >* +1 -> yes, we want to have dask provider preinstalled
> > > > > > > >* -1 -> no, it's fine to make it optional
> > > > > > > >* 0 -> no opinion
> > > > > > > >
> > > > > > > >Consider it my -1: I think we should NOT preinstall Dask
> > > > > > > >provider.
> > > > > > > >
> > > > > > > >Voting guidelines here. This is really a "procedural" matter
> > > > > > > >rather than code modification:
> > > > > > > >
> > > > > > > >> Votes on procedural issues follow the common format of
> > > > > > > >> majority rule unless otherwise stated. That is, if there are
> > > > > > > >> more favourable votes than unfavourable ones, the issue is
> > > > > > > >> considered to have passed -- regardless of the number of votes
> > > > > > > >> in each category. (If the number of votes seems too small to be
> > > > > > > >> representative of a community consensus, the issue is typically
> > > > > > > >> not pursued.
> > > > > > > >
> > > > > > > >In this case committers have binding votes but other community
> > > > > > > >members are encouraged to state their non-binding votes as well.
> > > > > > > >
> > > > > > > >https://www.apache.org/foundation/voting.html

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Pankaj Koti <pa...@astronomer.io.INVALID>.
0 (non-binding)

I agree with most of what has been discussed in the thread
https://lists.apache.org/thread/0d9x4kl7hc2qzvho2mbdf35ohn5w12l6
and this email thread.

I did a check on Airflow discussions
(https://github.com/apache/airflow/discussions?discussions_q=dask+)
and Airflow issues
(https://github.com/apache/airflow/issues?q=is%3Aissue+dask+)
and, upon reading their descriptions, I see that there are some users using
the Dask executor, although I personally have never tried it.

If we decide not to pre-install it, then in my opinion a detailed, top-level
news fragment / changelog entry would help here, telling users to install
the provider separately if they want to continue using it.

Regards,



Pankaj Koti

*Senior Software Engineer, *OSS Engineering Team.
Location: Pune, India

Timezone: Indian Standard Time (IST)

Email: pankaj.koti@astronomer.io

Mobile: +91 9730079985


On Fri, Jul 21, 2023 at 2:20 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> > One possible solution could be allowing users to instruct Airflow not to
> > pre-install executor providers by passing an argument via
> > `--install-option` during pip install, and then install them explicitly
> > by providing the extras.
>
> Yes, it would be nice. Unfortunately the dependency mechanisms of
> Python (and `pip`) are not that flexible. You cannot individually remove
> requirements, and having requirements is the only way you can have an
> "install also that other package" feature (extras are the mechanism to
> install optional dependencies, but it's not something we should ask
> regular users to do).

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Jarek Potiuk <ja...@potiuk.com>.
> One possible solution could be allowing users to instruct Airflow not to
> pre-install executor providers by passing an argument via
> `--install-option` during pip install, and then install them explicitly by
> providing the extras.

Yes, it would be nice. Unfortunately the dependency mechanisms of
Python (and `pip`) are not that flexible. You cannot individually remove
requirements, and having requirements is the only way you can have an
"install also that other package" feature (extras are the mechanism to
install optional dependencies, but it's not something we should ask
regular users to do).
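
To illustrate the point, here is a minimal Python sketch of a simplified
dependency model. It is not pip's actual resolver, and the package names and
the `resolve` helper are hypothetical, chosen only to mirror this discussion:
base requirements are always installed, extras can only add packages, and
nothing in the model can subtract a requirement.

```python
# Simplified, hypothetical model of how requirements and extras combine.
# This is NOT pip's real resolver; all names below are illustrative only.

BASE_REQUIREMENTS = {"airflow-core-deps", "preinstalled-provider"}

EXTRAS = {
    "dask": {"daskexecutor-provider"},
}


def resolve(extras=()):
    """Return the set of packages installed for the given extras."""
    installed = set(BASE_REQUIREMENTS)  # base requirements are unconditional
    for extra in extras:
        installed |= EXTRAS.get(extra, set())  # extras only ever add packages
    return installed


plain = resolve()
with_dask = resolve(["dask"])
```

In this model anything listed as a base requirement (a "preinstalled"
provider) is present in every install, while the optional provider only
appears when its extra is requested.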



On Fri, Jul 21, 2023 at 10:46 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> And just to clarify:  The voting will last till Tuesday 25 July 2023 11am CST.



Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Jarek Potiuk <ja...@potiuk.com>.
And just to clarify:  The voting will last till Tuesday 25 July 2023 11am CST.

On Fri, Jul 21, 2023 at 10:43 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> Clarification: Ash is right (I totally forgot about it).
>
> We do have dask in
> https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html#core-airflow-extras
>  and it even explicitly mentions it:
>
> * "dask pip install 'apache-airflow[dask]' DaskExecutor"
>
> So technically speaking you **should** use the dask extra to install
> the distributed, dask and cloudpickle packages - all needed to run the dask
> executor. And we will keep it this way - it will just move from "core
> extras" to "providers". Actually it will move to "deprecated" and we
> will alias "dask" to the new "daskexecutor", because the name
> "apache-airflow-providers-dask" is not allowed by PyPI (I guess the
> -dask suffix is considered potentially harmful for typosquatting) so I
> reserved "apache-airflow-providers-daskexecutor".
>
> J.



Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Jarek Potiuk <ja...@potiuk.com>.
Clarification: Ash is right (I totally forgot about it).

We do have dask in
https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html#core-airflow-extras
 and it even explicitly mentions it:

* "dask pip install 'apache-airflow[dask]' DaskExecutor"

So technically speaking you **should** use the dask extra to install
the distributed, dask and cloudpickle packages - all needed to run the dask
executor. And we will keep it this way - it will just move from "core
extras" to "providers". Actually it will move to "deprecated" and we
will alias "dask" to the new "daskexecutor", because the name
"apache-airflow-providers-dask" is not allowed by PyPI (I guess the
-dask suffix is considered potentially harmful for typosquatting) so I
reserved "apache-airflow-providers-daskexecutor".
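
As a rough sketch of what such an alias could look like, assuming a
setuptools-style mapping of extras to requirement lists: the `EXTRAS_REQUIRE`
dict and the `requirements_for` helper are hypothetical and are not the
actual Airflow packaging code; only the distribution name comes from this
message.

```python
# Hypothetical sketch: the deprecated "dask" extra kept as an alias of the
# new "daskexecutor" extra. Not Airflow's actual packaging code.

PROVIDER = "apache-airflow-providers-daskexecutor"

EXTRAS_REQUIRE = {
    "daskexecutor": [PROVIDER],
}
# Deprecated alias: the old extra name points at the same requirement list.
EXTRAS_REQUIRE["dask"] = EXTRAS_REQUIRE["daskexecutor"]


def requirements_for(extra):
    """Look up the requirements a given extra would pull in."""
    return list(EXTRAS_REQUIRE[extra])
```

With a mapping like this, an existing `pip install 'apache-airflow[dask]'`
would keep pulling in the executor's dependencies, now via the provider.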

J.



On Fri, Jul 21, 2023 at 10:34 AM Hussein Awala <hu...@awala.fr> wrote:
>
> > based on the premise that you had to install the `dask` extra in the
> > first place to get dask module of the right version
>
> After reviewing the Dask Executor
> <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/dask.html>
> section in Airflow documentation, I didn't find any requirement or
> recommendation to add the `dask` extra when using it. So, including the
> dependencies explicitly in the requirements file should suffice to use the
> executor. This means that this change might potentially break existing
> setups for some users when they upgrade Airflow version, which contradicts
> the Airflow deprecation policy.
>
> If the premise holds true for Dask, it should also apply to Celery and
> Kubernetes. Nevertheless, we decided to pre-install their providers after
> the migration.
>
> However, this executor isn't as popular as the others, and as an Airflow
> user, I prefer not to have Dask dependencies in my environment when I
> install Airflow with the Celery executor. This helps avoid conflicts and
> enables me to upgrade libraries independently from their constraints. One
> possible solution could be allowing users to instruct Airflow not to
> pre-install executor providers by passing an argument via
> `--install-option` during pip install, and then install them explicitly by
> providing the extras.
>
> Since the flexibility of our deprecation policy isn't entirely clear to me,
> I'm neutral on this matter, with a vote of +0.



Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Hussein Awala <hu...@awala.fr>.
> based on the premise that you had to install the `dask` extra in the
> first place to get dask module of the right version

After reviewing the Dask Executor
<https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/dask.html>
section in Airflow documentation, I didn't find any requirement or
recommendation to add the `dask` extra when using it. So, including the
dependencies explicitly in the requirements file should suffice to use the
executor. This means that this change might potentially break existing
setups for some users when they upgrade Airflow version, which contradicts
the Airflow deprecation policy.

If the premise holds true for Dask, it should also apply to Celery and
Kubernetes. Nevertheless, we decided to pre-install their providers after
the migration.

However, this executor isn't as popular as the others, and as an Airflow
user, I prefer not to have Dask dependencies in my environment when I
install Airflow with the Celery executor. This helps avoid conflicts and
enables me to upgrade libraries independently from their constraints. One
possible solution could be allowing users to instruct Airflow not to
pre-install executor providers by passing an argument via
`--install-option` during pip install, and then install them explicitly by
providing the extras.

Since the flexibility of our deprecation policy isn't entirely clear to me,
I'm neutral on this matter, with a vote of +0.

On Fri, Jul 21, 2023 at 9:10 AM Elad Kalif <el...@apache.org> wrote:

> I agree with Ash
> -1 as well.

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Elad Kalif <el...@apache.org>.
I agree with Ash
-1 as well.


On Fri, Jul 21, 2023 at 9:29 AM Ash Berlin-Taylor <as...@apache.org> wrote:

> -1 - based on the premise that you had to install the `dask` extra in the
> first place to get the dask module of the right version, so if we make the
> existing extra depend on the new provider then it's good enough.

Re: [VOTE] Make (soon coming) dask provider preinstalled

Posted by Ash Berlin-Taylor <as...@apache.org>.
-1 - based on the premise that you had to install the `dask` extra in the first place to get the dask module of the right version, so if we make the existing extra depend on the new provider then it's good enough.

On 21 July 2023 06:22:28 BST, Jarek Potiuk <ja...@potiuk.com> wrote:
>Q: Do we want to pre-install Dask provider in Airflow 2.7.0 (with Dask
>executor) once separated ?
>
>Discussion was here:
>https://lists.apache.org/thread/0d9x4kl7hc2qzvho2mbdf35ohn5w12l6
>
>Please vote:
>
>* +1 -> yes, we want to have dask provider preinstalled
>* -1 -> no, it's fine to make it optional
>* 0 -> no opinion
>
>Consider it my -1: I think we should NOT preinstall Dask provider.
>
>Voting guidelines here. This is really a "procedural" matter rather
>than code modification:
>
>> Votes on procedural issues follow the common format of majority rule unless otherwise stated. That is, if there are more favourable votes than unfavourable ones, the issue is considered to have passed -- regardless of the number of votes in each category. (If the number of votes seems too small to be representative of a community consensus, the issue is typically not pursued.
>
>In this case committers have binding votes but other community members
>are encouraged to state their non-binding votes as well.
>
>https://www.apache.org/foundation/voting.html