Posted to dev@airflow.apache.org by Jarek Potiuk <Ja...@polidea.com> on 2020/11/11 09:58:58 UTC

Rewriting Breeze in Python ?

Hello Everyone,

TL; DR; I was thinking for quite a while on this and I think this is the
right time to raise that subject. It's been asked several times, why Breeze
is not written in something else than Bash since it is "that big" or some
people said "monstrous" :). I think it's the right time to start a
"rewrite" project with wide community involvement and Python seems to be
the best choice :).


While I was opposing this while we were focusing on Airflow 2.0, and there
are some good reasons why initially I started Breeze in Bash, I think with
the current state of Airflow 2.0 betas, with Airflow 2.0 fully based on
Python 3.6 and with some "stability" and "good set of features" we have in
Breeze and a good level of modularisation we achieved - it's the right time
to think about a rewrite.

I did not raise this subject to add a distraction on top of what is
already a lot of work for 2.0, but I think having Breeze rewritten in
Python could be the "one more thing" that we could do - as a community to
make 2.0 experience even better, and one that can make the community even
closer.

I was thinking that Breeze is perfect to be split into separate smaller
pieces, describe some assumptions that we will have for its use, and turn
it into a true community effort where a lot of people will contribute and
where we will be able to simplify some of the stuff, and - most importantly
- make more people from the community know about how our CI and development
environment works and be able to solve any problems there.

Breeze (and underlying bash libraries) are crucial, to get our CI working
and I am mostly the single point of contact (and failure!) when it comes to
that - I would love to not be one :) and I think with most of the core
committers busy with 2.0, this is also an opportunity for more of the
contributors to take their part in it (and eventually earn their rank to
become committers!). For the core committers, this is an extra opportunity
to learn how the system works, influence its design, and possibly simplify
some parts of it - even if they will be mostly focused on 2.0.

I would like to do it well - write some assumptions in a design doc, plan
the work and split it into separate issues, and lead the effort - but I
would love if most of the work is done by others, who would then become
familiar with the whole of it.

WDYT? Do you think it is a good idea? Do you think it is the right time?
Are there some people in the community who would like to take part in it?

J.

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Rewriting Breeze in Python ?

Posted by Daniel Imberman <da...@gmail.com>.
Earthly looks REALLY cool! I am very pro doing as much as possible within the container ecosystem for easier local development/getting as close as humanly possible to parity between local and CI build.
@Kaxil I think it’s ok for us to start the conversation now, but agree that we shouldn’t worry about any definitive decisions or implementation until 2.0 GA is released.

On Thu, Nov 12, 2020 at 5:54 AM, Jarek Potiuk <Ja...@polidea.com> wrote:
Surely I can defer the decisions until the release :). But I just wanted to start back-burner discussions and getting people interested and excited.
And BTW, I thought a bit about it, and I think those discussions actually *could* include the "user's story" - but rather than as part of Breeze, possibly as a separate tool. Seems that there is a need there and I am happy to lead that part of the discussion as well as long as we clearly separate those two use cases/ user groups.
J.

On Thu, Nov 12, 2020 at 1:37 PM Kaxil Naik <kaxilnaik@gmail.com> wrote:
My point was not only about writing it post 2.0. I am proposing to even start planning/discussion about it after 2.0.
There is a lot going on currently. And any planning / proposal will start discussions and I would propose that we wait after 2.0 to even collect suggestions and proposals to keep our focus completely on 2.0.
Regards, Kaxil
On Thu, Nov 12, 2020 at 11:11 AM Kamil Breguła <kamil.bregula@polidea.com> wrote:
Hello,

I personally tried to make various changes to Breeze many times and was always afraid that I would do something wrong because I would miss something. Breeze has too many global variables and tricks to be easily managed through ad-hoc contributions.

Python is a very good idea and I'm already trying to write an all-new feature in Python. Luckily, Bash and Python complement each other well, so it's not a problem for a Bash script to run a Python script or for a Python script to run a Bash script. This may allow us to migrate smoothly from Bash to Python.
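
A minimal sketch of that two-way interop, assuming `bash` is available on the PATH. The helper names `run_bash` and `run_python_via_bash` are hypothetical, made up for illustration, and are not part of Breeze:

```python
import subprocess
import sys


def run_bash(snippet: str) -> str:
    """Run a Bash snippet from Python and return its stripped stdout.

    Hypothetical helper for illustration; raises CalledProcessError
    if the snippet exits with a non-zero status.
    """
    result = subprocess.run(
        ["bash", "-c", snippet],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def run_python_via_bash(code: str) -> str:
    """Have Bash start the current Python interpreter (the reverse direction).

    The interpreter path and the code are passed as positional arguments
    ($1 and $2), so Bash does not re-interpret their contents.
    """
    result = subprocess.run(
        ["bash", "-c", '"$1" -c "$2"', "bash", sys.executable, code],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_bash("echo hello from bash"))
    print(run_python_via_bash("print('hello from python')"))
```

With both directions available, individual Bash functions could in principle be moved to Python one at a time while the remaining scripts keep calling them.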

Best regards,
Kamil Breguła
On Thu, Nov 12, 2020 at 11:52 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
My intention is not to rewrite it now, but start doing it when we get a stable 2.0 release, to know what we want to achieve and plan it, and have a team aligned on it - so that we can actually start doing it whenever we feel 2.0 is "stable" and there is nothing of higher priority.
But I will start discussion and doc on "scope", "use cases" and "users" - so that we know what we DO and what we DO NOT do with Breeze.
My goal is simple: "It's a Breeze to develop Airflow". It's not about "using Airflow", it's not about "trying out Airflow", it's not about "writing and testing DAGs" - if there is a need for that, this should be a different tool/project.
The "users" of Breeze are only contributors. Full Stop. For "Airflow users" - if they are not contributors, Breeze will be useless for them. And that's intended.
I would like to clarify that goal and assumptions soon, so I am preparing a short doc where I put my assumptions about that, but in the scope of it, I want to keep the focus of "developing Airflow" only.
This is my primary concern - that there are some ideas on what to do with Breeze that go far beyond that primary goal. But I would like to keep Breeze within those boundaries only.
And I am happy to help with other initiatives to answer other needs, but those should be separate IMHO.
J.

On Thu, Nov 12, 2020 at 1:22 AM Daniel Imberman <daniel.imberman@gmail.com> wrote:
I am all for rewriting breeze, but I think waiting until after 2.0 makes the most sense. Python could work, but let’s be intentional about the decision before we choose.

On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd.deng.r@gmail.com> wrote:
I agree with Kaxil’s point (or even a bit later, say when 2.0 gets relatively more “stable”).
My concern is mostly about keeping development/community focus concentrated.

XD
On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <kaxilnaik@gmail.com> wrote:
I think we should wait until 2.0 is out before discussing or even gathering feedback. As I am sure any feedback will trigger a discussion.
On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
Andrew,
Thanks for chiming in - just to answer your questions and clarify the scope of the discussion:
Breeze is for developing Airflow itself; its purpose is not to develop and run DAGs. It was never intended to be used by the "users" of Airflow for DAG development or for testing DAGs. And while we were pondering that thought recently, I think it never will be; it is simply not fit for the purpose.
Even the "start-airflow" command is there mainly for the developers of Airflow, not for the users of it. For example, it can be quickly used to test if a new release candidate for Apache Airflow "works" - thanks to it in a few minutes I can run a released version of Airflow in several combinations of python/backend and see that it generally "works".
So for the "docker-compose user production image" - sure, it is needed but this is a different issue, different users, and a completely different use-case (even if "docker-compose" name is there too). Those two are completely different use-cases, starting from the fact that even the docker image used there is different. Maybe this is what both you and Ash are talking about. In which case I fully agree it's needed, but I believe we are not talking about it here.
If you want the kind of approach you are talking about, you can take a look at the issue here: https://github.com/apache/airflow/issues/8605. Nobody is actively working on it now, but I would love for someone to take the lead on it and complete it. I am happy to help and review it as much as I can. But maybe you would like to take the lead on it, Andrew, since you have some experience and a real use case behind it? I think we need people there who are actual users of Airflow - which, sadly, I mostly am not :)
But let's not mix the two please :). I'd love to keep this thread focused on "Breeze, the development environment for Airflow itself". Even the tagline of Breeze is "It's a Breeze to develop Airflow" rather than "It's a Breeze to develop DAGs".
J.

On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
Tomek:
I started the discussion here just so everyone is aware of it even if they are not watching GH issues. I have now created the GH issue https://github.com/apache/airflow/issues/12282 so that I can gather together people with some interest and I think it's best to continue the discussion there.
What I plan to do within the next few days is to start a design document and design discussion. I would like to start with defining the actual users of Breeze, the use-cases it should serve, the purpose, and the set of assumptions that it should have. And only after we hash it all out, I would like to define the scope, decide whether we want to have one or many different tools for different users, how much of it is common and whether we can remove some of it completely or simplify it.
I think we've gathered enormous experience from various levels of developers while using Breeze and it's a perfect moment to discuss (with those various users) what is useful, for whom, what makes sense, and how to provide the best interface. I see the current Breeze as a learning platform on what is useful and what is not, and I would love - this time - for decisions about it to be made by the actual users (of various kinds). And I would love to lead it - not as a developer this time, but as a "product manager" - listening to various voices and trying to make the best of it, reaching some consensus and working with others to implement it. I think this is the best use of the experience we had with Breeze and the "crowd-wisdom" of the developers of Airflow of different kinds and with different experience.

J.


On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <andrewharmonllc@gmail.com> wrote:
I would agree as an end user, I’m not really sure what Breeze does. Is it for CI or is it a way to quickly spin up a containerized env for local development? I do think it would be great to have something similar to Puckel that uses official airflow images. Very easy to quickly get started with to give airflow a try, but also a jumping off point for organizations to customize it to their needs. If this is docker-compose or something else, that’s fine. We use a customized version of puckel for all the engineers to do local dag development. It would be great if this was more “official” Airflow. I agree that python would make it easier for others to contribute. Finally, very clear documentation on the Airflow site would be very helpful too.
Thanks, Andrew Harmon

On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <turbaszek@apache.org> wrote:
+1 for using python.

> I would also say: make breeze do less. Right now it is three major things:
> * A local development environment
> * CI runner
> * It's recently grown the ability to run airflow for developing dags.
My first thought was similar - breeze does too much now. However, I think the problem is not in plenty of functionality but in technology used - bash. Using python or any other language will let us create a nice and clear structure for the project that will be easy to onboard, reason about and manage.
Structuring breeze may allow us to leverage using separate docker images, docker composes for different purposes (CI, DAG dev, Airflow dev). I like the way in which breeze is a "layer over docker" and I think this gives a nice experience. However, breeze has grown so big that I'm not even sure I use half of the functions it has.
Note: where should we continue the discussion? The official place is devlist, but we also have a GH issue. Which one should we use to avoid two separate discussions?
Tomek

On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
I also created an issue for it: https://github.com/apache/airflow/issues/12282
Anyone interested in taking part - please comment there!
On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
You screamed (among many others) and I listened :). And I think the time is now to act.
I believe the scope of "Breeze 2" should be part of the design discussion, where we will hear others' opinions (especially those of first-time or fresh contributors).
For now, my vision is quite a bit different than yours Ash :). But I do not want to start a design discussion just yet, I want to make breathing space for others to chime in.
I would love to hear many voices and interests of people before we deep dive into what "Breeze 2" might look like.
What I am interested in is whether:
a) it's the right time
b) python is the right choice
c) do I have several people who would like to join and offer both - help in designing the vision for it, as well as their time to implement it.
I think it is crucial that those people who will be implementing it, will be the main people who make design decisions about it, as I would love to have a strong group of people who would like to not only take part in developing it but also in maintaining it in the future.
J.

On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <ash@apache.org> wrote:
Omg yes. I have been screaming out for this for months.
$ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
6911
That's entirely too much bash for my liking by about an order of magnitude ;)
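
For comparison, a rough Python equivalent of the one-liner above, offered only as a sketch. The function name `count_non_comment_lines` is made up for illustration, and like the `egrep -v '^#'` filter it still counts blank lines and indented comments:

```python
from pathlib import Path


def count_non_comment_lines(root: str, pattern: str = "*.sh") -> int:
    """Count lines that do not start with '#' in files matching pattern.

    Hypothetical helper: walks root recursively, much like
    `find root -name '*.sh'`, and skips only lines whose first
    character is '#', as `egrep -v '^#'` does.
    """
    total = 0
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue  # ignore directories that happen to match the pattern
        with path.open(encoding="utf-8", errors="replace") as handle:
            total += sum(1 for line in handle if not line.startswith("#"))
    return total
```

Calling `count_non_comment_lines("scripts")` from the repository root would give the same kind of figure.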
I would also say: make breeze do less. Right now it is three major things:
* A local development environment
* CI runner
* It's recently grown the ability to run airflow for developing dags.
That is too much. Yes there is overlap, but it's just too much in one tool, and too complex as a result. Some of this should just be replaced with a docker-compose file (that uses published release images, not floating master/nightly) and users told to run that.
Make it simpler, fitting a core purpose - running CI consistently should be its only goal.
-ash
On Nov 11 2020, at 9:58 am, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
Hello Everyone,
TL; DR; I was thinking for quite a while on this and I think this is the right time to raise that subject. It's been asked several times, why Breeze is not written in something else than Bash since it is "that big" or some people said "monstrous" :). I think it's the right time to start a "rewrite" project with wide community involvement and Python seems to be the best choice :).

While I was opposing this while we were focusing on Airflow 2.0, and there are some good reasons why initially I started Breeze in Bash, I think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based on Python 3.6 and with some "stability" and "good set of features" we have in Breeze and a good level of modularisation we achieved - it's the right time to think about a rewrite.
I did not raise this subject to add a distraction on top of what is already a lot of work for 2.0, but I think having Breeze rewritten in Python could be the "one more thing" that we could do - as a community to make 2.0 experience even better, and one that can make the community even closer.
I was thinking that Breeze is perfect to be split into separate smaller pieces, describe some assumptions that we will have for its use, and turn it into a true community effort where a lot of people will contribute and where we will be able to simplify some of the stuff, and - most importantly - make more people from the community know about how our CI and development environment works and be able to solve any problems there.
Breeze (and underlying bash libraries) are crucial, to get our CI working and I am mostly the single point of contact (and failure!) when it comes to that - I would love to not be one :) and I think with most of the core committers busy with 2.0, this is also an opportunity for more of the contributors to take their part in it (and eventually earn their rank to become committers!). For the core committers, this is an extra opportunity to learn how the system works, influence its design, and possibly simplify some parts of it - even if they will be mostly focused on 2.0.
I would like to do it well - write some assumptions in a design doc, plan the work and split it into separate issues, and lead the effort - but I would love if most of the work is done by others, who would then become familiar with the whole of it.
WDYT? Do you think it is a good idea? Do you think it is the right time? Are there some people in the community who would like to take part in it?
J.

Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
Surely I can defer the decisions until the release :). But I just wanted to
start back-burner discussions and getting people interested and excited.

And BTW, I thought a bit about it, and I think those discussions actually
*could* include the "user's story" - but rather than as part of Breeze,
possibly as a separate tool. Seems that there is a need there and I am
happy to lead that part of the discussion as well as long as we clearly
separate those two use cases/ user groups.

J.


On Thu, Nov 12, 2020 at 1:37 PM Kaxil Naik <ka...@gmail.com> wrote:

> My point was not only about writing it post 2.0. I am proposing to even
> start planning/discussion about it after 2.0.
>
> There is a lot going on currently. And any planning / proposal will start
> discussions and I would propose that we wait after 2.0 to even collect
> suggestions and proposals to keep our focus completely on 2.0.
>
> Regards,
> Kaxil
>
> On Thu, Nov 12, 2020 at 11:11 AM Kamil Breguła <ka...@polidea.com>
> wrote:
>
>> Hello,
>>
>> I personally tried to make various changes to Breeze many times and was
>> always afraid that I would do something wrong because I would miss
>> something. Breeze has too many global variables and tricks to be easily
>> managed through ad-hoc contributions.
>>
>> Python is a very good idea and I'm already trying to write an all-new
>> feature in Python. Luckily, Bash and Python complement each other well, so
>> it's not a problem for one Bash script to run a Python script and a Python
>> script to run a Bash script. This may allow us to migrate smoothly from
>> Bash to Python.
>>
>> Best regards,
>> Kamil Breguła
>>
>> On Thu, Nov 12, 2020 at 11:52 AM Jarek Potiuk <Ja...@polidea.com>
>> wrote:
>>
>>> My intention is not to rewrite it now, but start doing it when we get a
>>> stable 2.0 release, to know what we want to achieve and plan it, and have a
>>> team aligned on it -  so that we can actually start doing it whenever we
>>> feel 2.0 is "stable" and there is nothing of higher priority.
>>>
>>> But I will start discussion and doc on "scope", "use cases" and "users"
>>> - so that we know what we DO and what we DO NOT do with Breeze.
>>>
>>> My goal is simple: "It's a Breeze to *develop* Airflow". It's not
>>> about "using Airflow", it's not about "trying out Airflow", it's not about
>>> "writing and testing DAGs" - if there is a need for that, this should be a
>>> different tool/project.
>>>
>>> The "users" of Breeze are only contributors. Full Stop. For "Airflow
>>> users" - if they are not contributors, Breeze will be useless for them. And
>>> that's intended.
>>>
>>> I would like to clarify that goal and assumptions soon, so I am
>>> preparing a short doc where I put my assumptions about that, but in the
>>> scope of it, I want to keep the focus of "developing Airflow" only.
>>>
>>> This is my primary concern - that there are some ideas on what to do
>>> with Breeze that go far beyond that primary goal. But I would like to keep
>>> Breeze within those boundaries only.
>>>
>>> And I am happy to help with other initiatives to answer other needs, but
>>> those should be separate IMHO.
>>>
>>> J.
>>>
>>>
>>> On Thu, Nov 12, 2020 at 1:22 AM Daniel Imberman <
>>> daniel.imberman@gmail.com> wrote:
>>>
>>>> I am all for rewriting breeze, but I think waiting until after 2.0
>>>> makes the most sense. Python could work, but let’s be intentional about the
>>>> decision before we choose.
>>>>
>>>> On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd...@gmail.com>
>>>> wrote:
>>>>
>>>> I agree with Kaxil’s point (or even a bit later, say when 2.0 gets
>>>> relatively more “stable”).
>>>>
>>>> My concern is mostly about keeping development/community focus concentrated.
>>>>
>>>>
>>>> XD
>>>>
>>>> On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <ka...@gmail.com> wrote:
>>>>
>>>>> I think we should wait until 2.0 is out before discussing or even
>>>>> gathering feedback. As I am sure any feedback will trigger a discussion.
>>>>>
>>>>> On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Ja...@polidea.com>
>>>>> wrote:
>>>>>
>>>>>> Andrew,
>>>>>>
>>>>>> Thanks for chiming in - just to answer your questions and clarify the
>>>>>> scope of the discussion:
>>>>>>
>>>>>> Breeze is for developing Airflow itself; its purpose is not to
>>>>>> develop and run DAGs. It was never intended to be used by the "users" of
>>>>>> Airflow for DAG development or for testing DAGs. And while we were
>>>>>> pondering that thought recently, I think it never will be; it is simply
>>>>>> not fit for the purpose.
>>>>>>
>>>>>> Even the "start-airflow" command is there mainly for the developers
>>>>>> of Airflow, not for the users of it. For example, it can be quickly used to
>>>>>> test if a new release candidate for Apache Airflow "works" - thanks to it
>>>>>> in a few minutes I can run a released version of Airflow in several
>>>>>> combinations of python/backend and see that it generally "works".
>>>>>>
>>>>>> So for the "docker-compose user production image" - sure, it is needed
>>>>>> but this is a different issue, different users, and a completely different
>>>>>> use-case (even if "docker-compose" name is there too). Those two are
>>>>>> completely different use-cases, starting from the fact that even the docker
>>>>>> image used there is different. Maybe this is what both you and Ash are
>>>>>> talking about. In which case I fully agree it's needed, but I believe we
>>>>>> are not talking about it here.
>>>>>>
>>>>>> If you want the kind of approach you are talking about, you
>>>>>> can take a look at the issue here:
>>>>>> https://github.com/apache/airflow/issues/8605. Nobody is actively
>>>>>> working on it now, but I would love for someone to take the lead on it
>>>>>> and complete it. I am happy to help and review it as much as I can. But
>>>>>> maybe you would like to take the lead on it, Andrew, since you have some
>>>>>> experience and a real use case behind it? I think we need people there
>>>>>> who are actual users of Airflow - which, sadly, I mostly am not :)
>>>>>>
>>>>>> But let's not mix the two please :). I'd love to keep this thread
>>>>>> focused on *"Breeze, the development environment for Airflow itself"*.
>>>>>> Even the tagline of Breeze is "*It's a Breeze to develop Airflow*"
>>>>>> rather than "It's a Breeze to develop DAGs".
>>>>>>
>>>>>> J.
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <
>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>
>>>>>>> Tomek:
>>>>>>>
>>>>>>> I started the discussion here just so everyone is aware of it even
>>>>>>> if they are not watching GH issues. I have now created the GH issue
>>>>>>> https://github.com/apache/airflow/issues/12282 so that I can gather
>>>>>>> together people with some interest and I think it's best to continue the
>>>>>>> discussion there.
>>>>>>>
>>>>>>> What I plan to do within the next few days, is to start a design
>>>>>>> document and design discussion. I would like to start with defining the
>>>>>>> actual users of Breeze, the use-cases it should serve, the purpose, and the
>>>>>>> set of assumptions that it should have. And only after we hash it all out,
>>>>>>> I would like to define the scope, decide whether we want to have one or
>>>>>>> many different tools for different users, how much of it is common and
>>>>>>> whether we can remove some of it completely or simplify it.
>>>>>>>
>>>>>>> I think we've gathered enormous experience from various levels of
>>>>>>> developers while using Breeze and it's a perfect moment to discuss (with
>>>>>>> those various users) what is useful, for whom, what makes sense, and how to
>>>>>>> provide the best interface. I see the current Breeze as a learning platform
>>>>>>> on what is useful and what is not, and I would love - this time - for
>>>>>>> decisions about it to be made by the actual users (of various kinds). And I
>>>>>>> would love to lead it - not as a developer this time, but as a "product
>>>>>>> manager" - listening to various voices and trying to make the best of it,
>>>>>>> reaching some consensus and working with others to implement it. I think
>>>>>>> this is the best use of the experience we had with Breeze and the
>>>>>>> "crowd-wisdom" of the developers of Airflow of different kinds and with
>>>>>>> different experience.
>>>>>>>
>>>>>>> J.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <
>>>>>>> andrewharmonllc@gmail.com> wrote:
>>>>>>>
>>>>>>>> I would agree as an end user, I’m not really sure what Breeze does.
>>>>>>>> Is it for CI or is it a way to quickly spin up a containerized env for
>>>>>>>> local development? I do think it would be great to have something similar
>>>>>>>> to Puckel that uses official airflow images. Very easy to quickly get
>>>>>>>> started with to give airflow a try, but also a jumping off point for
>>>>>>>> organizations to customize it to their needs. If this is docker-compose or
>>>>>>>> something else, that’s fine. We use a customized version of puckel for all
>>>>>>>> the engineers to do local dag development. It would be great if this was
>>>>>>>> more “official” Airflow. I agree that python would make it easier for
>>>>>>>> others to contribute. Finally, very clear documentation on the Airflow site
>>>>>>>> would be very helpful too.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Andrew Harmon
>>>>>>>>
>>>>>>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <tu...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> +1 for using python.
>>>>>>>>
>>>>>>>> > I would also say: make breeze do less. Right now it is three
>>>>>>>> major things:
>>>>>>>> > * A local development environment
>>>>>>>> > * CI runner
>>>>>>>> > * It's recently grown the ability to run airflow for developing
>>>>>>>> dags.
>>>>>>>>
>>>>>>>> My first thought was similar - breeze does too much now. However, I
>>>>>>>> think the problem is not in plenty of functionality but in technology used
>>>>>>>> - bash. Using python or any other language will let us create a nice and
>>>>>>>> clear structure for the project that will be easy to onboard, reason about
>>>>>>>> and manage.
>>>>>>>>
>>>>>>>> Structuring breeze may allow us to leverage using separate docker
>>>>>>>> images, docker composes for different purposes (CI, DAG dev, Airflow dev).
>>>>>>>> I like the way in which breeze is a "layer over docker" and I think this
>>>>>>>> gives a nice experience. However, breeze has grown so big that I'm not
>>>>>>>> even sure I use half of the functions it has.
>>>>>>>>
>>>>>>>> *Note:* where should we continue the discussion? The official
>>>>>>>> place is devlist, but we also have a GH issue. Which one should we use to
>>>>>>>> avoid two separate discussions?
>>>>>>>>
>>>>>>>> Tomek
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <
>>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>>
>>>>>>>>> I also created an issue for it:
>>>>>>>>> https://github.com/apache/airflow/issues/12282
>>>>>>>>>
>>>>>>>>> Anyone interested in taking part - please comment there!
>>>>>>>>>
>>>>>>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <
>>>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>>>
>>>>>>>>>> You screamed (among many others) and I listened :). And I think
>>>>>>>>>> the time is now to act.
>>>>>>>>>>
>>>>>>>>>> I believe the scope of "Breeze 2" should be part of the design
>>>>>>>>>> discussion, where we will hear others' opinions (especially those of
>>>>>>>>>> first-time or fresh contributors).
>>>>>>>>>>
>>>>>>>>>> For now, my vision is quite a bit different than yours Ash :).
>>>>>>>>>> But I do not want to start a design discussion just yet, I want to make
>>>>>>>>>> breathing space for others to chime in.
>>>>>>>>>>
>>>>>>>>>> I would love to hear many voices and interests of people before
>>>>>>>>>> we deep dive into what "Breeze 2" might look like.
>>>>>>>>>>
>>>>>>>>>> What I am interested in is whether:
>>>>>>>>>>
>>>>>>>>>> a) it's the right time
>>>>>>>>>> b) python is the right choice
>>>>>>>>>> c) do I have several people who would like to join and offer both
>>>>>>>>>> - help in designing the vision for it, as well as their time to implement
>>>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> I think it is crucial that those people who will be implementing
>>>>>>>>>> it, will be the main people who make design decisions about it, as I would
>>>>>>>>>> love to have a strong group of people who would like to not only take part
>>>>>>>>>> in developing it but also in maintaining it in the future.
>>>>>>>>>>
>>>>>>>>>> J.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <
>>>>>>>>>> ash@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>>>>>>
>>>>>>>>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>>>>>>>>> 6911
>>>>>>>>>>>
>>>>>>>>>>> That's entirely too much bash for my liking by about an order of
>>>>>>>>>>> magnitude ;)
>>>>>>>>>>>
>>>>>>>>>>> I would also say: make breeze do less. Right now it is three
>>>>>>>>>>> major things:
>>>>>>>>>>>
>>>>>>>>>>> * A local development environment
>>>>>>>>>>> * CI runner
>>>>>>>>>>> * It's recently grown the ability to run airflow for developing
>>>>>>>>>>> dags.
>>>>>>>>>>>
>>>>>>>>>>> That is too much. Yes there is overlap, but it's just too much
>>>>>>>>>>> in one tool, and too complex as a result. Some of this should just be
>>>>>>>>>>> replaced with a docker-compose file (that uses published release images,
>>>>>>>>>>> not floating master/nightly) and users told to run that.
>>>>>>>>>>>
>>>>>>>>>>> Make it simpler, fitting a core purpose - running CI
>>>>>>>>>>> consistently should be its only goal.
>>>>>>>>>>>
>>>>>>>>>>> -ash
>>>>>>>>>>>
>>>>>>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <
>>>>>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello Everyone,
>>>>>>>>>>>
>>>>>>>>>>> TL; DR; I was thinking for quite a while on this and I think
>>>>>>>>>>> this is the right time to raise that subject. It's been asked several
>>>>>>>>>>> times, why Breeze is not written in something else than Bash since it is
>>>>>>>>>>> "that big" or some people said "monstrous" :). I think it's the right time
>>>>>>>>>>> to start a "rewrite" project with wide community involvement and Python
>>>>>>>>>>> seems to be the best choice :).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> While I was opposing this while we were focusing on Airflow 2.0,
>>>>>>>>>>> and there are some good reasons why initially I started Breeze in Bash, I
>>>>>>>>>>> think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully
>>>>>>>>>>> based on Python 3.6 and with some "stability" and "good set of features" we
>>>>>>>>>>> have in Breeze and a good level of modularisation we achieved - it's the
>>>>>>>>>>> right time to think about a rewrite.
>>>>>>>>>>>
>>>>>>>>>>> I did not raise this subject to add a distraction on top of what
>>>>>>>>>>> is already a lot of work for 2.0, but I think having Breeze rewritten in
>>>>>>>>>>> Python could be the "one more thing" that we could do - as a community to
>>>>>>>>>>> make 2.0 experience even better, and one that can make the community even
>>>>>>>>>>> closer.
>>>>>>>>>>>
>>>>>>>>>>> I was thinking that Breeze is perfect to be split into separate
>>>>>>>>>>> smaller pieces, describe some assumptions that we will have for its use,
>>>>>>>>>>> and turn it into a true community effort where a lot of people will
>>>>>>>>>>> contribute and where we will be able to simplify some of the stuff, and -
>>>>>>>>>>> most importantly - make more people from the community know about how our
>>>>>>>>>>> CI and development environment works and be able to solve any problems
>>>>>>>>>>> there.
>>>>>>>>>>>
>>>>>>>>>>> Breeze (and underlying bash libraries) are crucial, to get our
>>>>>>>>>>> CI working and I am mostly the single point of contact (and failure!) when
>>>>>>>>>>> it comes to that - I would love to not be one :) and I think with most of
>>>>>>>>>>> the core committers busy with 2.0, this is also an opportunity for more of
>>>>>>>>>>> the contributors to take their part in it (and eventually earn their rank
>>>>>>>>>>> to become committers!). For the core committers, this is an extra
>>>>>>>>>>> opportunity to learn how the system works, influence its design, and
>>>>>>>>>>> possibly simplify some parts of it - even if they will be mostly focused on
>>>>>>>>>>> 2.0.
>>>>>>>>>>>
>>>>>>>>>>> I would like to do it well - write some assumptions in a design
>>>>>>>>>>> doc, plan the work and split it into separate issues, and lead the effort -
>>>>>>>>>>> but I would love if most of the work is done by others, who would then
>>>>>>>>>>> become familiar with the whole of it.
>>>>>>>>>>>
>>>>>>>>>>> WDYT? Do you think it is a good idea? Do you think it is the
>>>>>>>>>>> right time? Are there some people in the community who would like to take
>>>>>>>>>>> part in it?
>>>>>>>>>>>
>>>>>>>>>>> J.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Jarek Potiuk
>>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>>>>>
>>>>>>>>>>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Rewriting Breeze in Python ?

Posted by Kaxil Naik <ka...@gmail.com>.
My point was not only about writing it after 2.0. I am proposing that we do
not even start planning or discussing it until after 2.0.

There is a lot going on currently, and any planning or proposal will start
discussions. I would propose that we wait until after 2.0 to even collect
suggestions and proposals, so that our focus stays completely on 2.0.

Regards,
Kaxil

On Thu, Nov 12, 2020 at 11:11 AM Kamil Breguła <ka...@polidea.com>
wrote:

> Hello,
>
> I personally tried to make various changes to Breeze many times and was
> always afraid that I would do something wrong because I would miss
> something. Breeze has too many global variables and tricks to be easily
> managed through ad-hoc contributions.
>
> Python is a very good idea and I'm already trying to write an all-new
> feature in Python. Luckily, Bash and Python complement each other well, so
> it's not a problem for one Bash script to run a Python script and a Python
> script to run a Bash script. This may allow us to migrate smoothly from
> Bash to Python.
>
> Best regards,
> Kamil Breguła
>
> On Thu, Nov 12, 2020 at 11:52 AM Jarek Potiuk <Ja...@polidea.com>
> wrote:
>
>> My intention is not to rewrite it now, but start doing it when we get a
>> stable 2.0 release, to know what we want to achieve and plan it, and have a
>> team aligned on it -  so that we can actually start doing it whenever we
>> feel 2.0 is "stable" and there is nothing of higher priority.
>>
>> But I will start discussion and doc on "scope", "use cases" and "users" -
>> so that we know what we DO and what we DO NOT do with Breeze.
>>
>> My goal is simple: "It's a Breeze to *develop* Airflow". It's not about
>> "using Airflow", it's not about "trying out Airflow", it's not about
>> "writing and testing DAGs" - if there is a need for that, this should be a
>> different tool/project.
>>
>> The "users" of Breeze are only contributors. Full Stop. For "Airflow
>> users" - if they are not contributors, Breeze will be useless for them. And
>> that's intended.
>>
>> I would like to clarify that goal and assumptions soon, so I am preparing
>> a short doc where I put my assumptions about that, but in the scope of it,
>> I want to keep the focus of "developing Airflow" only.
>>
>> This is my primary concern - that there are some ideas on what to do with
>> Breeze that go far beyond that primary goal. But I would like to keep
>> Breeze within those boundaries only.
>>
>> And I am happy to help with other initiatives to answer other needs, but
>> those should be separate IMHO.
>>
>> J.
>>
>>
>> On Thu, Nov 12, 2020 at 1:22 AM Daniel Imberman <
>> daniel.imberman@gmail.com> wrote:
>>
>>> I am all for rewriting breeze, but I think waiting until after 2.0 makes
>>> the most sense. Python could work, but let’s be intentional about the
>>> decision before we choose.
>>>
>>> via Newton Mail
>>> <https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.51&pv=10.15.7&source=email_footer_2>
>>>
>>> On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd...@gmail.com>
>>> wrote:
>>>
>>> I agree with Kaxil’s point (or even a bit later, say when 2.0 gets
>>> relatively more “stable”).
>>>
>>> My point is more about keeping the development/community focus concentrated.
>>>
>>>
>>> XD
>>>
>>> On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <ka...@gmail.com> wrote:
>>>
>>>> I think we should wait until 2.0 is out before discussing or even
>>>> gathering feedback. As I am sure any feedback will trigger a discussion.
>>>>
>>>> On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Ja...@polidea.com>
>>>> wrote:
>>>>
>>>>> Andrew,
>>>>>
>>>>> Thanks for chiming in - just to answer your questions and clarify the
>>>>> scope of the discussion:
>>>>>
>>>>> Breeze is for developing Airflow itself; its purpose is not to
>>>>> develop and run DAGs. It was never intended to be used by the "users" of
>>>>> Airflow for DAG development or testing the DAGs. And while we were pondering
>>>>> that thought recently, I think it never will be; it is simply not
>>>>> fit for the purpose.
>>>>>
>>>>> Even the "start-airflow" command is there mainly for the developers of
>>>>> Airflow, not for the users of it. For example, it can be quickly used to
>>>>> test if a new release candidate for Apache Airflow "works" - thanks to it
>>>>> in a few minutes I can run a released version of Airflow in several
>>>>> combinations of python/backend and see that it generally "works".
>>>>>
>>>>> As for the "docker-compose using the production image" - sure, it is needed
>>>>> but this is a different issue, different users, and a completely different
>>>>> use-case (even if "docker-compose" name is there too). Those two are
>>>>> completely different use-cases, starting from the fact that even the docker
>>>>> image used there is different. Maybe this is what both you and Ash are
>>>>> talking about. In which case I fully agree it's needed, but I believe we
>>>>> are not talking about it here.
>>>>>
>>>>> If you want to have this kind of approach you are talking about, you
>>>>> can take a look at the issue here:
>>>>> https://github.com/apache/airflow/issues/8605. Nobody works on it
>>>>> actively now, but I would love for someone to take the lead on it and complete
>>>>> it. I am happy to help and review it as much as I can. But maybe you would
>>>>> like to take the lead on it, Andrew, since you have some experience and a real
>>>>> use case behind it? I think we need people there who are actual users of
>>>>> Airflow - which, sadly, I mostly am not :)
>>>>>
>>>>> But let's not mix the two please :). I'd love to keep this thread
>>>>> focused on *"Breeze, the development environment for Airflow itself"*.
>>>>> Even the tagline of Breeze is "*It's a Breeze to develop Airflow*"
>>>>> rather than "It's a Breeze to develop DAGs".
>>>>>
>>>>> J.
>>>>>
>>>>>
>>>>> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Ja...@polidea.com>
>>>>> wrote:
>>>>>
>>>>>> Tomek:
>>>>>>
>>>>>> I started the discussion here just so that everyone is aware of it, even
>>>>>> if they are not watching GH issues. I now created the GH Issue
>>>>>> https://github.com/apache/airflow/issues/12282 so that I can gather
>>>>>> together people with some interest and I think it's best to continue the
>>>>>> discussion there.
>>>>>>
>>>>>> What I plan to do within the next few days, is to start a design
>>>>>> document and design discussion. I would like to start with defining the
>>>>>> actual users of Breeze, the use-cases it should serve, the purpose, and the
>>>>>> set of assumptions that it should have. And only after we hash it all out,
>>>>>> I would like to define the scope, decide whether we want to have one or
>>>>>> many different tools for different users, how much of it is common and
>>>>>> whether we can remove some of it completely or simplify it.
>>>>>>
>>>>>> I think we've gathered enormous experience from various levels of
>>>>>> developers while using Breeze and it's a perfect moment to discuss (with
>>>>>> those various users) what is useful, for whom, what makes sense, and how to
>>>>>> provide the best interface. I see the current Breeze as a learning platform
>>>>>> on what is useful and what is not, and I would love - this time - for
>>>>>> decisions in it to be made by the actual users (of various kinds). And I
>>>>>> would love to lead it - not as a developer this time, but as a "product
>>>>>> manager" - listening to various voices and trying to make the best of it,
>>>>>> reaching some consensus and working with others to implement it. I think
>>>>>> this is the best use of the experience we had with Breeze and the
>>>>>> "crowd-wisdom" of Airflow developers of different kinds and with
>>>>>> different experience.
>>>>>>
>>>>>> J.
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <
>>>>>> andrewharmonllc@gmail.com> wrote:
>>>>>>
>>>>>>> I would agree: as an end user, I’m not really sure what Breeze does.
>>>>>>> Is it for CI, or is it a way to quickly spin up a containerized env for
>>>>>>> local development? I do think it would be great to have something similar
>>>>>>> to Puckel that uses official airflow images. Very easy to quickly get
>>>>>>> started with to give airflow a try, but also a jumping off point for
>>>>>>> organizations to customize it to their needs. If this is docker-compose or
>>>>>>> something else, that’s fine. We use a customized version of puckel for all
>>>>>>> the engineers to do local dag development. It would be great if this was
>>>>>>> more “official” Airflow. I agree that python would make it easier for
>>>>>>> others to contribute. Finally, very clear documentation on the Airflow site
>>>>>>> would be very helpful too.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Andrew Harmon
>>>>>>>
>>>>>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <tu...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>> +1 for using python.
>>>>>>>
>>>>>>> > I would also say: make breeze do less. Right now it is three major
>>>>>>> things:
>>>>>>> > * A local development environment
>>>>>>> > * CI runner
>>>>>>> > * It's recently grown the ability to run airflow for developing
>>>>>>> dags.
>>>>>>>
>>>>>>> My first thought was similar - breeze does too much now. However, I
>>>>>>> think the problem is not in plenty of functionality but in technology used
>>>>>>> - bash. Using python or any other language will let us create a nice and
>>>>>>> clear structure for the project that will be easy to onboard, reason about
>>>>>>> and manage.
>>>>>>>
>>>>>>> Structuring breeze may allow us to leverage using separate docker
>>>>>>> images, docker composes for different purposes (CI, DAG dev, Airflow dev).
>>>>>>> I like the way in which breeze is a "layer over docker" and I think this
>>>>>>> gives a nice experience. However, breeze has grown so big that I'm not sure
>>>>>>> even if I use half of the functions it has.
>>>>>>>
>>>>>>> *Note:* where should we continue the discussion? The official place
>>>>>>> is devlist, but we have GH issue. Which one should we use to avoid two
>>>>>>> separate discussions?
>>>>>>>
>>>>>>> Tomek
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <
>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>
>>>>>>>> I also created an issue for it:
>>>>>>>> https://github.com/apache/airflow/issues/12282
>>>>>>>>
>>>>>>>> Anyone interested in taking part - please comment there!
>>>>>>>>
>>>>>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <
>>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>>
>>>>>>>>> You screamed (among many others) and I listened :). And I think
>>>>>>>>> the time is now to act.
>>>>>>>>>
>>>>>>>>> I believe the scope of "Breeze 2" should be part of the design
>>>>>>>>> discussion, where we will hear others' opinions (especially the first-time
>>>>>>>>> or fresh contributors).
>>>>>>>>>
>>>>>>>>> For now, my vision is quite a bit different than yours Ash :). But
>>>>>>>>> I do not want to start a design discussion just yet, I want to make
>>>>>>>>> breathing space for others to chime in.
>>>>>>>>>
>>>>>>>>> I would love to hear many voices and interests of people before we
>>>>>>>>> deep dive into what "Breeze 2" might look like.
>>>>>>>>>
>>>>>>>>> What I am interested in is whether:
>>>>>>>>>
>>>>>>>>> a) it's the right time
>>>>>>>>> b) python is the right choice
>>>>>>>>> c) do I have several people who would like to join and offer both
>>>>>>>>> - help in designing the vision for it, as well as their time to implement
>>>>>>>>> it.
>>>>>>>>>
>>>>>>>>> I think it is crucial that those people who will be implementing
>>>>>>>>> it, will be the main people who make design decisions about it, as I would
>>>>>>>>> love to have a strong group of people who would like to not only take part
>>>>>>>>> in developing it but also in maintaining it in the future.
>>>>>>>>>
>>>>>>>>> J.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <as...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>>>>>
>>>>>>>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>>>>>>>> 6911
>>>>>>>>>>
>>>>>>>>>> That's entirely too much bash for my liking by about an order of
>>>>>>>>>> magnitude ;)
>>>>>>>>>>
>>>>>>>>>> I would also say: make breeze do less. Right now it is three
>>>>>>>>>> major things:
>>>>>>>>>>
>>>>>>>>>> * A local development environment
>>>>>>>>>> * CI runner
>>>>>>>>>> * It's recently grown the ability to run airflow for developing
>>>>>>>>>> dags.
>>>>>>>>>>
>>>>>>>>>> That is too much. Yes there is overlap, but it's just too much in
>>>>>>>>>> one tool, and too complex as a result. Some of this should just be replaced
>>>>>>>>>> with a docker-compose file (that uses published release images, not
>>>>>>>>>> floating master/nightly) and users told to run that.
>>>>>>>>>>
>>>>>>>>>> Make it simpler, fitting a core purpose - running CI consistently
>>>>>>>>>> should be its only goal.
>>>>>>>>>>
>>>>>>>>>> -ash
>>>>>>>>>>
>>>>>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <
>>>>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello Everyone,
>>>>>>>>>>
>>>>>>>>>> TL; DR; I was thinking for quite a while on this and I think this
>>>>>>>>>> is the right time to raise that subject. It's been asked several times, why
>>>>>>>>>> Breeze is not written in something else than Bash since it is "that big" or
>>>>>>>>>> some people said "monstrous" :). I think it's the right time to start a
>>>>>>>>>> "rewrite" project with wide community involvement and Python seems to be
>>>>>>>>>> the best choice :).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> While I was opposing this while we were focusing on Airflow 2.0,
>>>>>>>>>> and there are some good reasons why initially I started Breeze in Bash, I
>>>>>>>>>> think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully
>>>>>>>>>> based on Python 3.6 and with some "stability" and "good set of features" we
>>>>>>>>>> have in Breeze and a good level of modularisation we achieved - it's the
>>>>>>>>>> right time to think about a rewrite.
>>>>>>>>>>
>>>>>>>>>> I did not raise this subject to add a distraction on top of what
>>>>>>>>>> is already a lot of work for 2.0, but I think having Breeze rewritten in
>>>>>>>>>> Python could be the "one more thing" that we could do - as a community to
>>>>>>>>>> make 2.0 experience even better, and one that can make the community even
>>>>>>>>>> closer.
>>>>>>>>>>
>>>>>>>>>> I was thinking that Breeze is perfect to be split into separate
>>>>>>>>>> smaller pieces, describe some assumptions that we will have for its use,
>>>>>>>>>> and turn it into a true community effort where a lot of people will
>>>>>>>>>> contribute and where we will be able to simplify some of the stuff, and -
>>>>>>>>>> most importantly - make more people from the community know about how our
>>>>>>>>>> CI and development environment works and be able to solve any problems
>>>>>>>>>> there.
>>>>>>>>>>
>>>>>>>>>> Breeze (and underlying bash libraries) are crucial, to get our CI
>>>>>>>>>> working and I am mostly the single point of contact (and failure!) when it
>>>>>>>>>> comes to that - I would love to not be one :) and I think with most of the
>>>>>>>>>> core committers busy with 2.0, this is also an opportunity for more of the
>>>>>>>>>> contributors to take their part in it (and eventually earn their rank to
>>>>>>>>>> become committers!). For the core committers, this is an extra opportunity
>>>>>>>>>> to learn how the system works, influence its design, and possibly simplify
>>>>>>>>>> some parts of it - even if they will be mostly focused on 2.0.
>>>>>>>>>>
>>>>>>>>>> I would like to do it well - write some assumptions in a design
>>>>>>>>>> doc, plan the work and split it into separate issues, and lead the effort -
>>>>>>>>>> but I would love if most of the work is done by others, who would then
>>>>>>>>>> become familiar with the whole of it.
>>>>>>>>>>
>>>>>>>>>> WDYT? Do you think it is a good idea? Do you think it is the
>>>>>>>>>> right time? Are there some people in the community who would like to take
>>>>>>>>>> part in it?
>>>>>>>>>>
>>>>>>>>>> J.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Jarek Potiuk
>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>>>>
>>>>>>>>>>

Re: Rewriting Breeze in Python ?

Posted by Kamil Breguła <ka...@polidea.com>.
Hello,

I personally tried to make various changes to Breeze many times and was
always afraid that I would do something wrong because I would miss
something. Breeze has too many global variables and tricks to be easily
managed through ad-hoc contributions.

Python is a very good idea and I'm already trying to write an all-new
feature in Python. Luckily, Bash and Python complement each other well, so
it's not a problem for one Bash script to run a Python script and a Python
script to run a Bash script. This may allow us to migrate smoothly from
Bash to Python.
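
As a minimal sketch of that interop (the helper name `run_bash` is
hypothetical and is not part of Breeze), a Python function can shell out to
Bash with the standard-library `subprocess.run` and capture the output. This
assumes `bash` is on the PATH and Python 3.7 or newer:

```python
import subprocess


def run_bash(snippet: str) -> str:
    """Run a Bash snippet and return its stdout.

    Hypothetical helper for illustration only; not part of Breeze.
    """
    result = subprocess.run(
        ["bash", "-c", snippet],
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Python calling Bash; the other direction is just a Bash script
    # invoking `python3 some_script.py`.
    print(run_bash('echo "hello from bash"'))
```

A bridge like this would let individual pieces be moved across one at a time
while the remaining scripts keep working.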

Best regards,
Kamil Breguła

On Thu, Nov 12, 2020 at 11:52 AM Jarek Potiuk <Ja...@polidea.com>
wrote:

> My intention is not to rewrite it now, but start doing it when we get a
> stable 2.0 release, to know what we want to achieve and plan it, and have a
> team aligned on it -  so that we can actually start doing it whenever we
> feel 2.0 is "stable" and there is nothing of higher priority.
>
> But I will start discussion and doc on "scope", "use cases" and "users" -
> so that we know what we DO and what we DO NOT do with Breeze.
>
> My goal is simple: "It's a Breeze to *develop* Airflow". It's not about
> "using Airflow", it's not about "trying out Airflow", it's not about
> "writing and testing DAGs" - if there is a need for that, this should be a
> different tool/project.
>
> The "users" of Breeze are only contributors. Full Stop. For "Airflow
> users" - if they are not contributors, Breeze will be useless for them. And
> that's intended.
>
> I would like to clarify that goal and assumptions soon, so I am preparing
> a short doc where I put my assumptions about that, but in the scope of it,
> I want to keep the focus of "developing Airflow" only.
>
> This is my primary concern - that there are some ideas on what to do with
> Breeze that go far beyond that primary goal. But I would like to keep
> Breeze within those boundaries only.
>
> And I am happy to help with other initiatives to answer other needs, but
> those should be separate IMHO.
>
> J.
>
>
> On Thu, Nov 12, 2020 at 1:22 AM Daniel Imberman <da...@gmail.com>
> wrote:
>
>> I am all for rewriting breeze, but I think waiting until after 2.0 makes
>> the most sense. Python could work, but let’s be intentional about the
>> decision before we choose.
>>
>> via Newton Mail
>> <https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.51&pv=10.15.7&source=email_footer_2>
>>
>> On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd...@gmail.com>
>> wrote:
>>
>> I agree with Kaxil’s point (or even a bit later, say when 2.0 gets
>> relatively more “stable”).
>>
>> My point is more about keeping the development/community focus concentrated.
>>
>>
>> XD
>>
>> On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <ka...@gmail.com> wrote:
>>
>>> I think we should wait until 2.0 is out before discussing or even
>>> gathering feedback. As I am sure any feedback will trigger a discussion.
>>>
>>> On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Ja...@polidea.com>
>>> wrote:
>>>
>>>> Andrew,
>>>>
>>>> Thanks for chiming in - just to answer your questions and clarify the
>>>> scope of the discussion:
>>>>
>>>> Breeze is for developing Airflow itself; its purpose is not to develop
>>>> and run DAGs. It was never intended to be used by the "users" of Airflow for
>>>> DAG development or testing the DAGs. And while we were pondering that
>>>> thought recently, I think it never will be; it is simply not fit for
>>>> the purpose.
>>>>
>>>> Even the "start-airflow" command is there mainly for the developers of
>>>> Airflow, not for the users of it. For example, it can be quickly used to
>>>> test if a new release candidate for Apache Airflow "works" - thanks to it
>>>> in a few minutes I can run a released version of Airflow in several
>>>> combinations of python/backend and see that it generally "works".
>>>>
>>>> As for the "docker-compose using the production image" - sure, it is needed
>>>> but this is a different issue, different users, and a completely different
>>>> use-case (even if "docker-compose" name is there too). Those two are
>>>> completely different use-cases, starting from the fact that even the docker
>>>> image used there is different. Maybe this is what both you and Ash are
>>>> talking about. In which case I fully agree it's needed, but I believe we
>>>> are not talking about it here.
>>>>
>>>> If you want to have this kind of approach you are talking about, you
>>>> can take a look at the issue here:
>>>> https://github.com/apache/airflow/issues/8605. Nobody works on it
>>>> actively now, but I would love for someone to take the lead on it and complete
>>>> it. I am happy to help and review it as much as I can. But maybe you would
>>>> like to take the lead on it, Andrew, since you have some experience and a real
>>>> use case behind it? I think we need people there who are actual users of
>>>> Airflow - which, sadly, I mostly am not :)
>>>>
>>>> But let's not mix the two please :). I'd love to keep this thread
>>>> focused on *"Breeze, the development environment for Airflow itself"*.
>>>> Even the tagline of Breeze is "*It's a Breeze to develop Airflow*"
>>>> rather than "It's a Breeze to develop DAGs".
>>>>
>>>> J.
>>>>
>>>>
>>>> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Ja...@polidea.com>
>>>> wrote:
>>>>
>>>>> Tomek:
>>>>>
>>>>> I started the discussion here just so that everyone is aware of it, even if
>>>>> they are not watching GH issues. I now created the GH Issue
>>>>> https://github.com/apache/airflow/issues/12282 so that I can gather
>>>>> together people with some interest and I think it's best to continue the
>>>>> discussion there.
>>>>>
>>>>> What I plan to do within the next few days, is to start a design
>>>>> document and design discussion. I would like to start with defining the
>>>>> actual users of Breeze, the use-cases it should serve, the purpose, and the
>>>>> set of assumptions that it should have. And only after we hash it all out,
>>>>> I would like to define the scope, decide whether we want to have one or
>>>>> many different tools for different users, how much of it is common and
>>>>> whether we can remove some of it completely or simplify it.
>>>>>
>>>>> I think we've gathered enormous experience from various levels of
>>>>> developers while using Breeze and it's a perfect moment to discuss (with
>>>>> those various users) what is useful, for whom, what makes sense, and how to
>>>>> provide the best interface. I see the current Breeze as a learning platform
>>>>> on what is useful and what is not, and I would love - this time - for
>>>>> decisions in it to be made by the actual users (of various kinds). And I
>>>>> would love to lead it - not as a developer this time, but as a "product
>>>>> manager" - listening to various voices and trying to make the best of it,
>>>>> reaching some consensus and working with others to implement it. I think
>>>>> this is the best use of the experience we had with Breeze and the
>>>>> "crowd-wisdom" of Airflow developers of different kinds and with
>>>>> different experience.
>>>>>
>>>>> J.
>>>>>
>>>>>
>>>>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <
>>>>> andrewharmonllc@gmail.com> wrote:
>>>>>
>>>>>> I would agree: as an end user, I’m not really sure what Breeze does.
>>>>>> Is it for CI, or is it a way to quickly spin up a containerized env for
>>>>>> local development? I do think it would be great to have something similar
>>>>>> to Puckel that uses official airflow images. Very easy to quickly get
>>>>>> started with to give airflow a try, but also a jumping off point for
>>>>>> organizations to customize it to their needs. If this is docker-compose or
>>>>>> something else, that’s fine. We use a customized version of puckel for all
>>>>>> the engineers to do local dag development. It would be great if this was
>>>>>> more “official” Airflow. I agree that python would make it easier for
>>>>>> others to contribute. Finally, very clear documentation on the Airflow site
>>>>>> would be very helpful too.
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew Harmon
>>>>>>
>>>>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <tu...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>> +1 for using python.
>>>>>>
>>>>>> > I would also say: make breeze do less. Right now it is three major things:
>>>>>> > * A local development environment
>>>>>> > * CI runner
>>>>>> > * It's recently grown the ability to run airflow for developing dags.
>>>>>>
>>>>>> My first thought was similar - breeze does too much now. However, I
>>>>>> think the problem is not the amount of functionality but the technology used
>>>>>> - bash. Using Python or any other language will let us create a nice and
>>>>>> clear structure for the project that will be easy to onboard to, reason about
>>>>>> and manage.
>>>>>>
>>>>>> Structuring breeze may allow us to use separate docker
>>>>>> images and docker-compose files for different purposes (CI, DAG dev, Airflow dev).
>>>>>> I like the way in which breeze is a "layer over docker" and I think this
>>>>>> gives a nice experience. However, breeze has grown so big that I'm not sure
>>>>>> I use even half of the functions it has.
>>>>>>
>>>>>> *Note:* where should we continue the discussion? The official place
>>>>>> is the devlist, but we also have a GH issue. Which one should we use to avoid two
>>>>>> separate discussions?
>>>>>>
>>>>>> Tomek
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <
>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>
>>>>>>> I also created an issue for it:
>>>>>>> https://github.com/apache/airflow/issues/12282
>>>>>>>
>>>>>>> Anyone interested in taking part - please comment there!
>>>>>>>
>>>>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <
>>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>>
>>>>>>>> You screamed (among many others) and I listened :). And I think the
>>>>>>>> time to act is now.
>>>>>>>>
>>>>>>>> I believe the scope of "Breeze 2" should be part of the design
>>>>>>>> discussion, where we will hear others' opinions (especially those of first-time
>>>>>>>> or fresh contributors).
>>>>>>>>
>>>>>>>> For now, my vision is quite a bit different from yours, Ash :). But
>>>>>>>> I do not want to start a design discussion just yet; I want to leave
>>>>>>>> breathing space for others to chime in.
>>>>>>>>
>>>>>>>> I would love to hear many voices and interests of people before we
>>>>>>>> dive deep into what "Breeze 2" might look like.
>>>>>>>>
>>>>>>>> What I am interested in is whether:
>>>>>>>>
>>>>>>>> a) it's the right time
>>>>>>>> b) python is the right choice
>>>>>>>> c) there are several people who would like to join and offer both
>>>>>>>> help in designing the vision for it and their time to implement it.
>>>>>>>>
>>>>>>>> I think it is crucial that the people who will be implementing
>>>>>>>> it are the main people who make design decisions about it, as I would
>>>>>>>> love to have a strong group of people who would like to not only take part
>>>>>>>> in developing it but also in maintaining it in the future.
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <as...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>>>>
>>>>>>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>>>>>>> 6911
>>>>>>>>>
>>>>>>>>> That's entirely too much bash for my liking by about an order of
>>>>>>>>> magnitude ;)
>>>>>>>>>
>>>>>>>>> I would also say: make breeze do less. Right now it is three
>>>>>>>>> major things:
>>>>>>>>>
>>>>>>>>> * A local development environment
>>>>>>>>> * CI runner
>>>>>>>>> * It's recently grown the ability to run airflow for developing
>>>>>>>>> dags.
>>>>>>>>>
>>>>>>>>> That is too much. Yes there is overlap, but it's just too much in
>>>>>>>>> one tool, and too complex as a result. Some of this should just be replaced
>>>>>>>>> with a docker-compose file (that uses published release images, not
>>>>>>>>> floating master/nightly) and users told to run that.
>>>>>>>>>
>>>>>>>>> Make it simpler, fitting a core purpose - running CI consistently
>>>>>>>>> should be its only goal.
>>>>>>>>>
>>>>>>>>> -ash
>>>>>>>>>
>>>>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <Ja...@polidea.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello Everyone,
>>>>>>>>>
>>>>>>>>> TL;DR: I have been thinking about this for quite a while and I think this
>>>>>>>>> is the right time to raise the subject. It's been asked several times why
>>>>>>>>> Breeze is not written in something other than Bash, since it is "that big" or,
>>>>>>>>> as some people said, "monstrous" :). I think it's the right time to start a
>>>>>>>>> "rewrite" project with wide community involvement and Python seems to be
>>>>>>>>> the best choice :).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> While I opposed this when we were focusing on Airflow 2.0,
>>>>>>>>> and there are some good reasons why I initially started Breeze in Bash, I
>>>>>>>>> think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully
>>>>>>>>> based on Python 3.6 and with some "stability" and "good set of features" we
>>>>>>>>> have in Breeze and a good level of modularisation we achieved - it's the
>>>>>>>>> right time to think about a rewrite.
>>>>>>>>>
>>>>>>>>> I did not raise this subject to add a distraction on top of what
>>>>>>>>> is already a lot of work for 2.0, but I think having Breeze rewritten in
>>>>>>>>> Python could be the "one more thing" that we could do as a community to
>>>>>>>>> make the 2.0 experience even better, and one that can bring the community even
>>>>>>>>> closer.
>>>>>>>>>
>>>>>>>>> I was thinking that Breeze is well suited to being split into separate
>>>>>>>>> smaller pieces, and that we could describe some assumptions for its use
>>>>>>>>> and turn it into a true community effort where a lot of people will
>>>>>>>>> contribute and where we will be able to simplify some of the stuff, and -
>>>>>>>>> most importantly - help more people from the community learn how our
>>>>>>>>> CI and development environment work and be able to solve any problems
>>>>>>>>> there.
>>>>>>>>>
>>>>>>>>> Breeze (and the underlying bash libraries) are crucial to getting our CI
>>>>>>>>> working and I am mostly the single point of contact (and failure!) when it
>>>>>>>>> comes to that - I would love to not be one :) and I think with most of the
>>>>>>>>> core committers busy with 2.0, this is also an opportunity for more of the
>>>>>>>>> contributors to take their part in it (and eventually earn their rank to
>>>>>>>>> become committers!). For the core committers, this is an extra opportunity
>>>>>>>>> to learn how the system works, influence its design, and possibly simplify
>>>>>>>>> some parts of it - even if they will be mostly focused on 2.0.
>>>>>>>>>
>>>>>>>>> I would like to do it well - write some assumptions in a design
>>>>>>>>> doc, plan the work and split it into separate issues, and lead the effort -
>>>>>>>>> but I would love it if most of the work is done by others, who would then
>>>>>>>>> become familiar with the whole of it.
>>>>>>>>>
>>>>>>>>> WDYT? Do you think it is a good idea? Do you think it is the
>>>>>>>>> right time? Are there some people in the community who would like to take
>>>>>>>>> part in it?
>>>>>>>>>
>>>>>>>>> J.
>>>>>>>>>

Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
BTW, I recently learned about this project:
https://github.com/earthly/earthly.

For those who are interested - they have a really nice description of what
motivations they have, where it fits, what problem it solves, and what
"niche" in the development process it fills.

From what I see - the needs they address are very, very close to what
Breeze does. And it might even be that we will propose Earthly as
a foundation for the new Breeze2.

We might not necessarily have to write all of it from scratch - but
rather "stand on the shoulders of giants".

I think all options are open as long as we focus on the "needs", "users"
and "use cases" that we want to address.

J.


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
My intention is not to rewrite it now, but to start doing it once we have a
stable 2.0 release. Until then, I want us to know what we want to achieve,
plan it, and have a team aligned on it - so that we can actually start doing
it whenever we feel 2.0 is "stable" and there is nothing of higher priority.

But I will start a discussion and a doc on "scope", "use cases" and "users" -
so that we know what we DO and what we DO NOT do with Breeze.

My goal is simple: "It's a Breeze to *develop* Airflow". It's not about
"using Airflow", it's not about "trying out Airflow", it's not about
"writing and testing DAGs" - if there is a need for that, this should be a
different tool/project.

The "users" of Breeze are only contributors. Full stop. For "Airflow users"
- if they are not contributors, Breeze will be useless for them. And that's
intended.

I would like to clarify that goal and the assumptions soon, so I am preparing a
short doc where I put my assumptions about that, but in the scope of it, I
want to keep the focus on "developing Airflow" only.

This is my primary concern - that there are some ideas on what to do with
Breeze that go far beyond that primary goal. But I would like to keep
Breeze within those boundaries only.

And I am happy to help with other initiatives to answer other needs, but
those should be separate IMHO.

J.


On Thu, Nov 12, 2020 at 1:22 AM Daniel Imberman <da...@gmail.com>
wrote:

> I am all for rewriting breeze, but I think waiting until after 2.0 makes
> the most sense. Python could work, but let’s be intentional about the
> decision before we choose.
>
> On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd...@gmail.com>
> wrote:
>
> I agree with Kaxil’s point (or even a bit later, say when 2.0 gets
> relatively more “stable”).
>
> My point is more about keeping the development/community focus concentrated.
>
>
> XD
>
> On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <ka...@gmail.com> wrote:
>
>> I think we should wait until 2.0 is out before discussing or even
>> gathering feedback, as I am sure any feedback will trigger a discussion.
>>
>> On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Ja...@polidea.com>
>> wrote:
>>
>>> Andrew,
>>>
>>> Thanks for chiming in - just to answer your questions and clarify the
>>> scope of the discussion:
>>>
>>> Breeze is for developing Airflow itself; its purpose is not to develop
>>> and run DAGs. It was never intended to be used by the "users" of Airflow for
>>> DAG development or testing the DAGs. And while we were pondering that
>>> thought recently, I think it never will be; it is simply not fit for
>>> that purpose.
>>>
>>> Even the "start-airflow" command is there mainly for the developers of
>>> Airflow, not for the users of it. For example, it can be quickly used to
>>> test if a new release candidate for Apache Airflow "works" - thanks to it,
>>> in a few minutes I can run a released version of Airflow in several
>>> combinations of python/backend and see that it generally "works".
>>>
>>> As for the "docker-compose user production image" - sure, it is needed,
>>> but this is a different issue, different users, and a completely different
>>> use-case (even if the "docker-compose" name is there too). Those two are
>>> completely different use-cases, starting from the fact that even the docker
>>> image used there is different. Maybe this is what both you and Ash are
>>> talking about, in which case I fully agree it's needed, but I believe we
>>> are not talking about it here.
>>>
>>> If you want to have the kind of approach you are talking about, you can
>>> take a look at the issue here:
>>> https://github.com/apache/airflow/issues/8605. Nobody works on it
>>> actively now, but I would love for someone to take the lead on it and complete
>>> it. I am happy to help and review it as much as I can. But maybe you would
>>> like to take the lead on it, Andrew, since you have some experience and a real
>>> use case behind it? I think we need people there who are actual users of
>>> Airflow - which sadly, I am mostly not :)
>>>
>>> But let's not mix the two please :). I'd love to keep this thread
>>> focused on *"Breeze, the development environment for Airflow itself"*.
>>> Even the tagline of Breeze is "*It's a Breeze to develop Airflow*" rather
>>> than "It's a Breeze to develop DAGs".
>>>
>>> J.
>>>
>>>
>>> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Ja...@polidea.com>
>>> wrote:
>>>
>>>> Tomek:
>>>>
>>>> I started the discussion here just so everyone is aware of it even if
>>>> they are not watching GH issues. I have now created the GH issue
>>>> https://github.com/apache/airflow/issues/12282 so that I can gather
>>>> together people with some interest and I think it's best to continue the
>>>> discussion there.
>>>>
>>>> What I plan to do within the next few days, is to start a design
>>>> document and design discussion. I would like to start with defining the
>>>> actual users of Breeze, the use-cases it should serve, the purpose, and the
>>>> set of assumptions that it should have. And only after we hash it all out,
>>>> I would like to define the scope, decide whether we want to have one or
>>>> many different tools for different users, how much of it is common and
>>>> whether we can remove some of it completely or simplify it.
>>>>
>>>> I think we've gathered enormous experience from various levels of
>>>> developers while using Breeze and it's a perfect moment to discuss (with
>>>> those various users) what is useful, for whom, what makes sense, and how to
>>>> provide the best interface. I see the current Breeze as a learning platform
>>>> for what is useful and what is not, and this time I would love the
>>>> decisions about it to be made by its actual users (of various kinds). And I
>>>> would love to lead it - not as a developer this time, but as a "product
>>>> manager" - listening to various voices and trying to make the best of it,
>>>> reaching some consensus and working with others to implement it. I think
>>>> this is the best use of the experience we had with Breeze and the
>>>> "crowd-wisdom" of Airflow developers of different kinds and with different
>>>> levels of experience.
>>>>
>>>> J.
>>>>
>>>>
>>>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <
>>>> andrewharmonllc@gmail.com> wrote:
>>>>
>>>>> I would agree as an end user, I’m not really sure what Breeze does. Is
>>>>> it for CI or is it a way to quickly spin up a containerized env for local
>>>>> development? I do think it would be great to have something similar to
>>>>> Puckel that uses official airflow images. Very easy to quickly get started
>>>>> with to give airflow a try, but also a jumping off point for organizations
>>>>> to customize it to their needs. If this is docker-compose or something
>>>>> else, that’s fine. We use a customized version of puckel for all the
>>>>> engineers to do local dag development. It would be great if this was more
>>>>> “official” Airflow. I agree that python would make it easier for others to
>>>>> contribute. Finally, very clear documentation on the Airflow site would be
>>>>> very helpful too.
>>>>>
>>>>> Thanks,
>>>>> Andrew Harmon
>>>>>
>>>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <tu...@apache.org>
>>>>> wrote:
>>>>>
>>>>> +1 for using python.
>>>>>
>>>>> > I would also say: make breeze do less. Right now it is three major
>>>>> things:
>>>>> > * A local development environment
>>>>> > * CI runner
>>>>> > * It's recently grown the ability to run airflow for developing dags.
>>>>>
>>>>> My first thought was similar - breeze does too much now. However, I
>>>>> think the problem is not in plenty of functionality but in technology used
>>>>> - bash. Using python or any other language will let us create a nice and
>>>>> clear structure for the project that will be easy to onboard, reason about
>>>>> and manage.
>>>>>
>>>>> Structuring breeze may allow us to leverage using separate docker
>>>>> images, docker composes for different purposes (CI, DAG dev, Airflow dev).
>>>>> I like the way in which breeze is a "layer over docker" and I think this
>>>>> gives a nice experience. However, breeze has grown so big that I'm not sure
>>>>> even if I use half of the functions it has.
>>>>>
>>>>> *Note:* where should we continue the discussion? The official place
>>>>> is devlist, but we have GH issue. Which one should we use to avoid two
>>>>> separate discussions?
>>>>>
>>>>> Tomek
>>>>>
>>>>>
>>>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <
>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>
>>>>>> I also created issue for it:
>>>>>> https://github.com/apache/airflow/issues/12282
>>>>>>
>>>>>> Anyone interested in taking part - please comment there!
>>>>>>
>>>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <
>>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>>
>>>>>>> You screamed (among many others) and I listened :). And I think the
>>>>>>> time is now to act.
>>>>>>>
>>>>>>> I believe the scope of "Breeze 2" should be part of the design
>>>>>>> discussion, where we will hear others' opinions (especially first-time
>>>>>>> or fresh contributors).
>>>>>>>
>>>>>>> For now, my vision is quite a bit different than yours Ash :). But I
>>>>>>> do not want to start a design discussion just yet, I want to make breathing
>>>>>>> space for others to chime in.
>>>>>>>
>>>>>>> I would love to hear many voices and interests of people before we
>>>>>>> deep dive into what "Breeze 2" might look like.
>>>>>>>
>>>>>>> What I am interested in is whether:
>>>>>>>
>>>>>>> a) it's the right time
>>>>>>> b) python is the right choice
>>>>>>> c) do I have several people who would like to join and offer both -
>>>>>>> help in designing the vision for it, as well as their time to implement it.
>>>>>>>
>>>>>>> I think it is crucial that those people who will be implementing it,
>>>>>>> will be the main people who make design decisions about it, as I would love
>>>>>>> to have a strong group of people who would like to not only take part in
>>>>>>> developing it but also in maintaining it in the future.
>>>>>>>
>>>>>>> J.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <as...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>>>
>>>>>>>> $ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
>>>>>>>> 6911
>>>>>>>>
>>>>>>>> That's entirely too much bash for my liking by about an order of
>>>>>>>> magnitude ;)
>>>>>>>>
>>>>>>>> I would also say: make breeze do less. Right now it is three major
>>>>>>>> things:
>>>>>>>>
>>>>>>>> * A local development environment
>>>>>>>> * CI runner
>>>>>>>> * It's recently grown the ability to run airflow for developing
>>>>>>>> dags.
>>>>>>>>
>>>>>>>> That is too much. Yes there is overlap, but it's just too much in
>>>>>>>> one tool, and too complex as a result. Some of this should just be replaced
>>>>>>>> with a docker-compose file (that uses published release images, not
>>>>>>>> floating master/nightly) and users told to run that.
>>>>>>>>
>>>>>>>> Make it simpler, fitting a core purpose - running CI consistently
>>>>>>>> should be its only goal.
>>>>>>>>
>>>>>>>> -ash
>>>>>>>>
>>>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <Ja...@polidea.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hello Everyone,
>>>>>>>>
>>>>>>>> TL; DR; I was thinking for quite a while on this and I think this
>>>>>>>> is the right time to raise that subject. It's been asked several times, why
>>>>>>>> Breeze is not written in something else than Bash since it is "that big" or
>>>>>>>> some people said "monstrous" :). I think it's the right time to start a
>>>>>>>> "rewrite" project with wide community involvement and Python seems to be
>>>>>>>> the best choice :).
>>>>>>>>
>>>>>>>>
>>>>>>>> While I was opposing this while we were focusing on Airflow 2.0,
>>>>>>>> and there are some good reasons why initially I started Breeze in Bash, I
>>>>>>>> think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully
>>>>>>>> based on Python 3.6 and with some "stability" and "good set of features" we
>>>>>>>> have in Breeze and a good level of modularisation we achieved - it's the
>>>>>>>> right time to think about a rewrite.
>>>>>>>>
>>>>>>>> I did not raise this subject to add a distraction on top of what is
>>>>>>>> already a lot of work for 2.0, but I think having Breeze rewritten in
>>>>>>>> Python could be the "one more thing" that we could do - as a community to
>>>>>>>> make 2.0 experience even better, and one that can make the community even
>>>>>>>> closer.
>>>>>>>>
>>>>>>>> I was thinking that Breeze is perfect to be split into separate
>>>>>>>> smaller pieces, describe some assumptions that we will have for its use,
>>>>>>>> and turn it into a true community effort where a lot of people will
>>>>>>>> contribute and where we will be able to simplify some of the stuff, and -
>>>>>>>> most importantly - make more people from the community know about how our
>>>>>>>> CI and development environment works and be able to solve any problems
>>>>>>>> there.
>>>>>>>>
>>>>>>>> Breeze (and underlying bash libraries) are crucial, to get our CI
>>>>>>>> working and I am mostly the single point of contact (and failure!) when it
>>>>>>>> comes to that - I would love to not be one :) and I think with most of the
>>>>>>>> core committers busy with 2.0, this is also an opportunity for more of the
>>>>>>>> contributors to take their part in it (and eventually earn their rank to
>>>>>>>> become committers!). For the core committers, this is an extra opportunity
>>>>>>>> to learn how the system works, influence its design, and possibly simplify
>>>>>>>> some parts of it - even if they will be mostly focused on 2.0.
>>>>>>>>
>>>>>>>> I would like to do it well - write some assumptions in a design
>>>>>>>> doc, plan the work and split it into separate issues, and lead the effort -
>>>>>>>> but I would love if most of the work is done by others, who would then
>>>>>>>> become familiar with the whole of it.
>>>>>>>>
>>>>>>>> WDYT? Do you think it is a good idea? Do you think it is the right
>>>>>>>> time? Are there some people in the community who would like to take part in
>>>>>>>> it?
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jarek Potiuk
>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>>
>>>>>>>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Rewriting Breeze in Python ?

Posted by Daniel Imberman <da...@gmail.com>.
I am all for rewriting breeze, but I think waiting until after 2.0 makes the most sense. Python could work, but let’s be intentional about the decision before we choose.

On Wed, Nov 11, 2020 at 3:12 PM, Deng Xiaodong <xd...@gmail.com> wrote:
I agree with Kaxil’s point (or even a bit later, say when 2.0 gets relatively more “stable”).
My point is more about keeping development/community focus concentrated.

XD
On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <kaxilnaik@gmail.com> wrote:
I think we should wait until 2.0 is out before discussing or even gathering feedback, as I am sure any feedback will trigger a discussion.
On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
Andrew,
Thanks for chiming in - just to answer your questions and clarify the scope of the discussion:
Breeze is for developing Airflow itself; its purpose is not to develop and run DAGs. It was never intended to be used by the "users" of Airflow for DAG development or for testing DAGs. And while we were pondering that thought recently, I think it never will be this; it is simply not fit for the purpose.
Even the "start-airflow" command is there mainly for the developers of Airflow, not for the users of it. For example, it can be quickly used to test if a new release candidate for Apache Airflow "works" - thanks to it in a few minutes I can run a released version of Airflow in several combinations of python/backend and see that it generally "works".
So for the "docker-compose user production image" - sure, it is needed but this is a different issue, different users, and a completely different use-case (even if the "docker-compose" name is there too). Those two are completely different use-cases, starting from the fact that even the docker image used there is different. Maybe this is what both you and Ash are talking about. In which case I fully agree it's needed, but I believe we are not talking about it here.
If you want to have this kind of approach you are talking about, you can take a look at the issue here: https://github.com/apache/airflow/issues/8605. Nobody works on it actively now, but I would love someone to take the lead on it and complete it. I am happy to help and review it as much as I can. But maybe you would like to take the lead on it, Andrew, since you have some experience and a real use case behind it? I think we need people there who are actual users of Airflow - which, sadly, I am mostly not :)
But let's not mix the two please :). I'd love to keep this thread focused on "Breeze, the development environment for Airflow itself". Even the tagline of Breeze is "It's a Breeze to develop Airflow" rather than "It's a Breeze to develop DAGs".
J.

On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
Tomek:
I started the discussion here, so just everyone is aware of it even if they are not watching GH issues. I now created the GH Issue https://github.com/apache/airflow/issues/12282 so that I can gather together people with some interest and I think it's best to continue the discussion there.
What I plan to do within the next few days, is to start a design document and design discussion. I would like to start with defining the actual users of Breeze, the use-cases it should serve, the purpose, and the set of assumptions that it should have. And only after we hash it all out, I would like to define the scope, decide whether we want to have one or many different tools for different users, how much of it is common and whether we can remove some of it completely or simplify it.
I think we've gathered enormous experience from various levels of developers while using Breeze and it's a perfect moment to discuss (with those various users) what is useful, for whom, what makes sense, and how to provide the best interface. I see the current Breeze as a learning platform on what is useful and what is not, and I would love - this time - for decisions in it to be made by the actual users (of various kinds). And I would love to lead it - not as a developer this time, but as a "product manager" - listening to various voices and trying to make the best of it, reaching some consensus and working with others to implement it. I think this is the best use of the experience we had with Breeze and the "crowd-wisdom" of the developers of Airflow of a different kind and with a different experience.

J.


On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <andrewharmonllc@gmail.com> wrote:
I would agree as an end user, I’m not really sure what Breeze does. Is it for CI or is it a way to quickly spin up a containerized env for local development? I do think it would be great to have something similar to Puckel that uses official airflow images. Very easy to quickly get started with to give airflow a try, but also a jumping off point for organizations to customize it to their needs. If this is docker-compose or something else, that’s fine. We use a customized version of puckel for all the engineers to do local dag development. It would be great if this was more “official” Airflow. I agree that python would make it easier for others to contribute. Finally, very clear documentation on the Airflow site would be very helpful too.
Thanks,
Andrew Harmon

On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <turbaszek@apache.org> wrote:
+1 for using python.

> I would also say: make breeze do less. Right now it is three major things:
> * A local development environment
> * CI runner
> * It's recently grown the ability to run airflow for developing dags.

My first thought was similar - breeze does too much now. However, I think the problem is not in plenty of functionality but in technology used - bash. Using python or any other language will let us create a nice and clear structure for the project that will be easy to onboard, reason about and manage.
Structuring breeze may allow us to leverage using separate docker images, docker composes for different purposes (CI, DAG dev, Airflow dev). I like the way in which breeze is a "layer over docker" and I think this gives a nice experience. However, breeze has grown so big that I'm not sure even if I use half of the functions it has.
Note: where should we continue the discussion? The official place is devlist, but we have GH issue. Which one should we use to avoid two separate discussions?
Tomek

On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
I also created issue for it: https://github.com/apache/airflow/issues/12282
Anyone interested in taking part - please comment there!
On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
You screamed (among many others) and I listened :). And I think the time is now to act.
I believe the scope of "Breeze 2" should be part of the design discussion, where we will hear others' opinions (especially first-time or fresh contributors).
For now, my vision is quite a bit different than yours Ash :). But I do not want to start a design discussion just yet, I want to make breathing space for others to chime in.
I would love to hear many voices and interests of people before we deep dive into what "Breeze 2" might look like.
What I am interested in is whether:
a) it's the right time
b) python is the right choice
c) do I have several people who would like to join and offer both - help in designing the vision for it, as well as their time to implement it.
I think it is crucial that those people who will be implementing it, will be the main people who make design decisions about it, as I would love to have a strong group of people who would like to not only take part in developing it but also in maintaining it in the future.
J.

On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <ash@apache.org> wrote:
Omg yes. I have been screaming out for this for months.
$ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
6911
That's entirely too much bash for my liking by about an order of magnitude ;)
I would also say: make breeze do less. Right now it is three major things:
* A local development environment
* CI runner
* It's recently grown the ability to run airflow for developing dags.
That is too much. Yes there is overlap, but it's just too much in one tool, and too complex as a result. Some of this should just be replaced with a docker-compose file (that uses published release images, not floating master/nightly) and users told to run that.
Make it simpler, fitting a core purpose - running CI consistently should be its only goal.
-ash
On Nov 11 2020, at 9:58 am, Jarek Potiuk <Jarek.Potiuk@polidea.com> wrote:
Hello Everyone,
TL; DR; I was thinking for quite a while on this and I think this is the right time to raise that subject. It's been asked several times, why Breeze is not written in something else than Bash since it is "that big" or some people said "monstrous" :). I think it's the right time to start a "rewrite" project with wide community involvement and Python seems to be the best choice :).

While I was opposing this while we were focusing on Airflow 2.0, and there are some good reasons why initially I started Breeze in Bash, I think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based on Python 3.6 and with some "stability" and "good set of features" we have in Breeze and a good level of modularisation we achieved - it's the right time to think about a rewrite.
I did not raise this subject to add a distraction on top of what is already a lot of work for 2.0, but I think having Breeze rewritten in Python could be the "one more thing" that we could do - as a community to make 2.0 experience even better, and one that can make the community even closer.
I was thinking that Breeze is perfect to be split into separate smaller pieces, describe some assumptions that we will have for its use, and turn it into a true community effort where a lot of people will contribute and where we will be able to simplify some of the stuff, and - most importantly - make more people from the community know about how our CI and development environment works and be able to solve any problems there.
Breeze (and underlying bash libraries) are crucial, to get our CI working and I am mostly the single point of contact (and failure!) when it comes to that - I would love to not be one :) and I think with most of the core committers busy with 2.0, this is also an opportunity for more of the contributors to take their part in it (and eventually earn their rank to become committers!). For the core committers, this is an extra opportunity to learn how the system works, influence its design, and possibly simplify some parts of it - even if they will be mostly focused on 2.0.
I would like to do it well - write some assumptions in a design doc, plan the work and split it into separate issues, and lead the effort - but I would love if most of the work is done by others, who would then become familiar with the whole of it.
WDYT? Do you think it is a good idea? Do you think it is the right time? Are there some people in the community who would like to take part in it?
J.
--
Jarek Potiuk
Polidea [https://www.polidea.com/] | Principal Software Engineer
M: +48 660 796 129 [tel:+48660796129]




Re: Rewriting Breeze in Python ?

Posted by Deng Xiaodong <xd...@gmail.com>.
I agree with Kaxil’s point (or even a bit later, say when 2.0 gets
relatively more “stable”).

My point is more about keeping development/community focus concentrated.


XD

On Thu, Nov 12, 2020 at 00:05 Kaxil Naik <ka...@gmail.com> wrote:

> I think we should wait until 2.0 is out before discussing or even
> gathering feedback, as I am sure any feedback will trigger a discussion.
>
> On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Ja...@polidea.com>
> wrote:
>
>> Andrew,
>>
>> Thanks for chiming in - just to answer your questions and clarify the
>> scope of the discussion:
>>
>> Breeze is for developing Airflow itself; its purpose is not to develop
>> and run DAGs. It was never intended to be used by the "users" of Airflow for
>> DAG development or for testing DAGs. And while we were pondering that
>> thought recently, I think it never will be this; it is simply not fit for
>> the purpose.
>>
>> Even the "start-airflow" command is there mainly for the developers of
>> Airflow, not for the users of it. For example, it can be quickly used to
>> test if a new release candidate for Apache Airflow "works" - thanks to it
>> in a few minutes I can run a released version of Airflow in several
>> combinations of python/backend and see that it generally "works".
>>
>> So for the "docker-compose user production image" - sure, it is needed but
>> this is a different issue, different users, and a completely different
>> use-case (even if the "docker-compose" name is there too). Those two are
>> completely different use-cases, starting from the fact that even the docker
>> image used there is different. Maybe this is what both you and Ash are
>> talking about. In which case I fully agree it's needed, but I believe we
>> are not talking about it here.
>>
>> If you want to have this kind of approach you are talking about, you can
>> take a look at the issue here:
>> https://github.com/apache/airflow/issues/8605. Nobody works on it
>> actively now, but I would love someone to take the lead on it and complete
>> it. I am happy to help and review it as much as I can. But maybe you would
>> like to take the lead on it, Andrew, since you have some experience and
>> a real use case behind it? I think we need people there who are actual users
>> of Airflow - which, sadly, I am mostly not :)
>>
>> But let's not mix the two please :). I'd love to keep this thread focused
>> on *"Breeze, the development environment for Airflow itself"*. Even the
>> tagline of Breeze is "*It's a Breeze to develop Airflow*" rather than
>> "It's a Breeze to develop DAGs".
>>
>> J.
>>
>>
>> On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Ja...@polidea.com>
>> wrote:
>>
>>> Tomek:
>>>
>>> I started the discussion here, so just everyone is aware of it even if
>>> they are not watching GH issues. I now created the GH Issue
>>> https://github.com/apache/airflow/issues/12282 so that I can gather
>>> together people with some interest and I think it's best to continue the
>>> discussion there.
>>>
>>> What I plan to do within the next few days, is to start a design
>>> document and design discussion. I would like to start with defining the
>>> actual users of Breeze, the use-cases it should serve, the purpose, and the
>>> set of assumptions that it should have. And only after we hash it all out,
>>> I would like to define the scope, decide whether we want to have one or
>>> many different tools for different users, how much of it is common and
>>> whether we can remove some of it completely or simplify it.
>>>
>>> I think we've gathered enormous experience from various levels of
>>> developers while using Breeze and it's a perfect moment to discuss (with
>>> those various users) what is useful, for whom, what makes sense, and how to
>>> provide the best interface. I see the current Breeze as a learning platform
>>> on what is useful and what is not, and I would love - this time - for
>>> decisions in it to be made by the actual users (of various kinds). And I
>>> would love to lead it - not as a developer this time, but as a "product
>>> manager" - listening to various voices and trying to make the best of
>>> it, reaching some consensus and working with others to implement it. I
>>> think this is the best use of the experience we had with Breeze and the
>>> "crowd-wisdom" of the developers of Airflow of a different kind and with a
>>> different experience.
>>>
>>> J.
>>>
>>>
>>> On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <an...@gmail.com>
>>> wrote:
>>>
>>>> I would agree as an end user, I’m not really sure what Breeze does. Is
>>>> it for CI or is it a way to quickly spin up a containerized env for local
>>>> development? I do think it would be great to have something similar to
>>>> Puckel that uses official airflow images. Very easy to quickly get started
>>>> with to give airflow a try, but also a jumping off point for organizations
>>>> to customize it to their needs. If this is docker-compose or something
>>>> else, that’s fine. We use a customized version of puckel for all the
>>>> engineers to do local dag development. It would be great if this was more
>>>> “official” Airflow. I agree that python would make it easier for others to
>>>> contribute. Finally, very clear documentation on the Airflow site would be
>>>> very helpful too.
>>>>
>>>> Thanks,
>>>> Andrew Harmon
>>>>
>>>> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <tu...@apache.org>
>>>> wrote:
>>>>
>>>> +1 for using python.
>>>>
>>>> > I would also say: make breeze do less. Right now it is three major
>>>> things:
>>>> > * A local development environment
>>>> > * CI runner
>>>> > * It's recently grown the ability to run airflow for developing dags.
>>>>
>>>> My first thought was similar - breeze does too much now. However, I
>>>> think the problem is not in plenty of functionality but in technology used
>>>> - bash. Using python or any other language will let us create a nice and
>>>> clear structure for the project that will be easy to onboard, reason about
>>>> and manage.
>>>>
>>>> Structuring breeze may allow us to leverage using separate docker
>>>> images, docker composes for different purposes (CI, DAG dev, Airflow dev).
>>>> I like the way in which breeze is a "layer over docker" and I think this
>>>> gives a nice experience. However, breeze has grown so big that I'm not sure
>>>> even if I use half of the functions it has.
>>>>
>>>> *Note:* where should we continue the discussion? The official place is
>>>> devlist, but we have GH issue. Which one should we use to avoid two
>>>> separate discussions?
>>>>
>>>> Tomek
>>>>
>>>>
>>>> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <Ja...@polidea.com>
>>>> wrote:
>>>>
>>>>> I also created issue for it:
>>>>> https://github.com/apache/airflow/issues/12282
>>>>>
>>>>> Anyone interested in taking part - please comment there!
>>>>>
>>>>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <
>>>>> Jarek.Potiuk@polidea.com> wrote:
>>>>>
>>>>>> You screamed (among many others) and I listened :). And I think the
>>>>>> time is now to act.
>>>>>>
>>>>>> I believe the scope of "Breeze 2" should be part of the design
>>>>>> discussion, where we will hear others' opinions (especially first-time
>>>>>> or fresh contributors).
>>>>>>
>>>>>> For now, my vision is quite a bit different than yours Ash :). But I
>>>>>> do not want to start a design discussion just yet, I want to make breathing
>>>>>> space for others to chime in.
>>>>>>
>>>>>> I would love to hear many voices and interests of people before we
>>>>>> deep dive into what "Breeze 2" might look like.
>>>>>>
>>>>>> What I am interested in is whether:
>>>>>>
>>>>>> a) it's the right time
>>>>>> b) python is the right choice
>>>>>> c) do I have several people who would like to join and offer both -
>>>>>> help in designing the vision for it, as well as their time to implement it.
>>>>>>
>>>>>> I think it is crucial that those people who will be implementing it,
>>>>>> will be the main people who make design decisions about it, as I would love
>>>>>> to have a strong group of people who would like to not only take part in
>>>>>> developing it but also in maintaining it in the future.
>>>>>>
>>>>>> J.
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <as...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Omg yes. I have been screaming out for this for months.
>>>>>>>
>>>>>>> $ find scripts -name '*.sh'  | xargs egrep -v '^#' | wc -l
>>>>>>> 6911
>>>>>>>
>>>>>>> That's entirely too much bash for my liking by about an order of
>>>>>>> magnitude ;)
>>>>>>>
>>>>>>> I would also say: make breeze do less. Right now it is three major
>>>>>>> things:
>>>>>>>
>>>>>>> * A local development environment
>>>>>>> * CI runner
>>>>>>> * It's recently grown the ability to run airflow for developing dags.
>>>>>>>
>>>>>>> That is too much. Yes there is overlap, but it's just too much in
>>>>>>> one tool, and too complex as a result. Some of this should just be replaced
>>>>>>> with a docker-compose file (that uses published release images, not
>>>>>>> floating master/nightly) and users told to run that.
>>>>>>>
>>>>>>> Make it simpler, fitting a core purpose - running CI consistently
>>>>>>> should be its only goal.
>>>>>>>
>>>>>>> -ash
>>>>>>>
>>>>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <Ja...@polidea.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hello Everyone,
>>>>>>>
>>>>>>> TL; DR; I was thinking for quite a while on this and I think this is
>>>>>>> the right time to raise that subject. It's been asked several times, why
>>>>>>> Breeze is not written in something else than Bash since it is "that big" or
>>>>>>> some people said "monstrous" :). I think it's the right time to start a
>>>>>>> "rewrite" project with wide community involvement and Python seems to be
>>>>>>> the best choice :).
>>>>>>>
>>>>>>>
>>>>>>> While I was opposing this while we were focusing on Airflow 2.0, and
>>>>>>> there are some good reasons why initially I started Breeze in Bash, I think
>>>>>>> with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based
>>>>>>> on Python 3.6 and with some "stability" and "good set of features" we have
>>>>>>> in Breeze and a good level of modularisation we achieved - it's the right
>>>>>>> time to think about a rewrite.
>>>>>>>
>>>>>>> I did not raise this subject to add a distraction on top of what is
>>>>>>> already a lot of work for 2.0, but I think having Breeze rewritten in
>>>>>>> Python could be the "one more thing" that we could do - as a community to
>>>>>>> make 2.0 experience even better, and one that can make the community even
>>>>>>> closer.
>>>>>>>
>>>>>>> I was thinking that Breeze is perfect to be split into separate
>>>>>>> smaller pieces, describe some assumptions that we will have for its use,
>>>>>>> and turn it into a true community effort where a lot of people will
>>>>>>> contribute and where we will be able to simplify some of the stuff, and -
>>>>>>> most importantly - make more people from the community know about how our
>>>>>>> CI and development environment works and be able to solve any problems
>>>>>>> there.
>>>>>>>
>>>>>>> Breeze (and underlying bash libraries) are crucial, to get our CI
>>>>>>> working and I am mostly the single point of contact (and failure!) when it
>>>>>>> comes to that - I would love to not be one :) and I think with most of the
>>>>>>> core committers busy with 2.0, this is also an opportunity for more of the
>>>>>>> contributors to take their part in it (and eventually earn their rank to
>>>>>>> become committers!). For the core committers, this is an extra opportunity
>>>>>>> to learn how the system works, influence its design, and possibly simplify
>>>>>>> some parts of it - even if they will be mostly focused on 2.0.
>>>>>>>
>>>>>>> I would like to do it well - write some assumptions in a design doc,
>>>>>>> plan the work and split it into separate issues, and lead the effort - but
>>>>>>> I would love if most of the work is done by others, who would then become
>>>>>>> familiar with the whole of it.
>>>>>>>
>>>>>>> WDYT? Do you think it is a good idea? Do you thin k it is the right
>>>>>>> time? Are there some people in the community who would like to take part in
>>>>>>> it?
>>>>>>>
>>>>>>> J.
>>>>>>>
>>>>>>> --
>>>>>>> Jarek Potiuk
>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jarek Potiuk
>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Jarek Potiuk
>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>> M: +48 660 796 129 <+48660796129>
>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>>
>>> Jarek Potiuk
>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>
>>> M: +48 660 796 129 <+48660796129>
>>> [image: Polidea] <https://www.polidea.com/>
>>>
>>>
>>
>> --
>>
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>
>> M: +48 660 796 129 <+48660796129>
>> [image: Polidea] <https://www.polidea.com/>
>>
>>

Re: Rewriting Breeze in Python ?

Posted by Kaxil Naik <ka...@gmail.com>.
I think we should wait until 2.0 is out before discussing or even gathering
feedback, as I am sure any feedback will trigger a discussion.

On Wed, Nov 11, 2020 at 5:52 PM Jarek Potiuk <Ja...@polidea.com>
wrote:

> Andrew,
>
> Thanks for chiming in - just to answer your questions and clarify the
> scope of the discussion:
>
> Breeze is for developing Airflow itself; its purpose is not to develop
> and run DAGs. It was never intended to be used by the "users" of Airflow,
> or for developing or testing DAGs. And while we were toying with that
> thought recently, I think it never will be; it is simply not fit for
> that purpose.
>
> Even the "start-airflow" command is there mainly for the developers of
> Airflow, not for the users of it. For example, it can be quickly used to
> test if a new release candidate for Apache Airflow "works" - thanks to it,
> in a few minutes I can run a released version of Airflow in several
> combinations of python/backend and see that it generally "works".
>
> As for the "docker-compose user production image" - sure, it is needed, but
> this is a different issue, with different users and a completely different
> use-case (even if the "docker-compose" name is there too). Those two are
> completely different use-cases, starting from the fact that even the docker
> image used there is different. Maybe this is what both you and Ash are
> talking about, in which case I fully agree it's needed, but I believe we
> are not talking about it here.
>
> If you want the kind of approach you are talking about, you can
> take a look at the issue here:
> https://github.com/apache/airflow/issues/8605. Nobody is actively working
> on it now, but I would love for someone to take the lead on it and complete
> it. I am happy to help and review it as much as I can. But maybe you would
> like to take the lead on it, Andrew, since you have some experience and a
> real use case behind it? I think we need people there who are actual users
> of Airflow - which, sadly, I mostly am not :)
>
> But let's not mix the two, please :). I'd love to keep this thread focused
> on *"Breeze, the development environment for Airflow itself"*. Even the
> tagline of Breeze is "*It's a Breeze to develop Airflow*" rather than "It's
> a Breeze to develop DAGs".
>
> J.

Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
Andrew,

Thanks for chiming in - just to answer your questions and clarify the scope
of the discussion:

Breeze is for developing Airflow itself; its purpose is not to develop and
run DAGs. It was never intended to be used by the "users" of Airflow, or for
developing or testing DAGs. And while we were toying with that thought
recently, I think it never will be; it is simply not fit for that purpose.

Even the "start-airflow" command is there mainly for the developers of
Airflow, not for the users of it. For example, it can be quickly used to
test if a new release candidate for Apache Airflow "works" - thanks to it,
in a few minutes I can run a released version of Airflow in several
combinations of python/backend and see that it generally "works".

As for the "docker-compose user production image" - sure, it is needed, but
this is a different issue, with different users and a completely different
use-case (even if the "docker-compose" name is there too). Those two are
completely different use-cases, starting from the fact that even the docker
image used there is different. Maybe this is what both you and Ash are
talking about, in which case I fully agree it's needed, but I believe we
are not talking about it here.

If you want the kind of approach you are talking about, you can
take a look at the issue here: https://github.com/apache/airflow/issues/8605.
Nobody is actively working on it now, but I would love for someone to take
the lead on it and complete it. I am happy to help and review it as much as
I can. But maybe you would like to take the lead on it, Andrew, since you
have some experience and a real use case behind it? I think we need people
there who are actual users of Airflow - which, sadly, I mostly am not :)

But let's not mix the two, please :). I'd love to keep this thread focused
on *"Breeze, the development environment for Airflow itself"*. Even the
tagline of Breeze is "*It's a Breeze to develop Airflow*" rather than "It's a
Breeze to develop DAGs".

J.


On Wed, Nov 11, 2020 at 6:48 PM Jarek Potiuk <Ja...@polidea.com>
wrote:

> Tomek:
>
> I started the discussion here just so that everyone is aware of it, even if
> they are not watching GH issues. I have now created the GH issue
> https://github.com/apache/airflow/issues/12282 so that I can gather
> people with some interest, and I think it's best to continue the
> discussion there.
>
> What I plan to do within the next few days is to start a design document
> and design discussion. I would like to start with defining the actual users
> of Breeze, the use-cases it should serve, the purpose, and the set of
> assumptions that it should have. And only after we hash it all out, I would
> like to define the scope, decide whether we want to have one or many
> different tools for different users, how much of it is common and whether
> we can remove some of it completely or simplify it.
>
> I think we've gathered enormous experience from developers at various
> levels while using Breeze, and it's a perfect moment to discuss (with
> those various users) what is useful, for whom, what makes sense, and how to
> provide the best interface. I see the current Breeze as a learning platform
> on what is useful and what is not, and I would love - this time - for the
> decisions about it to be made by the actual users (of various kinds). And I
> would love to lead it - not as a developer this time, but as a "product
> manager" - listening to various voices and trying to make the best of
> it, reaching some consensus and working with others to implement it. I
> think this is the best use of the experience we had with Breeze and the
> "crowd-wisdom" of Airflow developers of different kinds and with
> different experience.
>
> J.

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
Tomek:

I started the discussion here just so that everyone is aware of it, even if
they are not watching GH issues. I have now created the GH issue
https://github.com/apache/airflow/issues/12282 so that I can gather
people with some interest, and I think it's best to continue the
discussion there.

What I plan to do within the next few days is to start a design document
and design discussion. I would like to start with defining the actual users
of Breeze, the use-cases it should serve, the purpose, and the set of
assumptions that it should have. And only after we hash it all out, I would
like to define the scope, decide whether we want to have one or many
different tools for different users, how much of it is common and whether
we can remove some of it completely or simplify it.

I think we've gathered enormous experience from developers at various
levels while using Breeze, and it's a perfect moment to discuss (with
those various users) what is useful, for whom, what makes sense, and how to
provide the best interface. I see the current Breeze as a learning platform
on what is useful and what is not, and I would love - this time - for the
decisions about it to be made by the actual users (of various kinds). And I
would love to lead it - not as a developer this time, but as a "product
manager" - listening to various voices and trying to make the best of
it, reaching some consensus and working with others to implement it. I
think this is the best use of the experience we had with Breeze and the
"crowd-wisdom" of Airflow developers of different kinds and with
different experience.

J.


On Wed, Nov 11, 2020 at 4:09 PM Andrew Harmon <an...@gmail.com>
wrote:

> I would agree: as an end user, I’m not really sure what Breeze does. Is it
> for CI, or is it a way to quickly spin up a containerized env for local
> development? I do think it would be great to have something similar to
> Puckel that uses official airflow images: very easy to get started
> with to give airflow a try, but also a jumping-off point for organizations
> to customize to their needs. If this is docker-compose or something
> else, that’s fine. We use a customized version of puckel for all the
> engineers to do local dag development. It would be great if this was more
> “official” Airflow. I agree that python would make it easier for others to
> contribute. Finally, very clear documentation on the Airflow site would be
> very helpful too.
>
> Thanks,
> Andrew Harmon
>
> On Nov 11, 2020, at 6:58 AM, Tomasz Urbaszek <tu...@apache.org> wrote:
>
> +1 for using python.
>
> > I would also say: make breeze do less. Right now it is three major
> > things:
> > * A local development environment
> > * CI runner
> > * It's recently grown the ability to run airflow for developing dags.
>
> My first thought was similar - breeze does too much now. However, I think
> the problem is not the amount of functionality but the technology used -
> bash. Using python or any other language will let us create a nice and
> clear structure for the project that will be easy to onboard to, reason
> about, and manage.
>
> Structuring breeze may allow us to use separate docker images and
> docker-compose files for different purposes (CI, DAG dev, Airflow dev). I
> like the way in which breeze is a "layer over docker" and I think this
> gives a nice experience. However, breeze has grown so big that I'm not even
> sure I use half of the functions it has.
>
> *Note:* where should we continue the discussion? The official place is
> the devlist, but we also have a GH issue. Which one should we use to avoid
> two separate discussions?
>
> Tomek
>
>
> On Wed, Nov 11, 2020 at 12:13 PM Jarek Potiuk <Ja...@polidea.com>
> wrote:
>
>> I also created an issue for it:
>> https://github.com/apache/airflow/issues/12282
>>
>> Anyone interested in taking part - please comment there!
>>
>> On Wed, Nov 11, 2020 at 11:59 AM Jarek Potiuk <Ja...@polidea.com>
>> wrote:
>>
>>> You screamed (among many others) and I listened :). And I think the time
>>> is now to act.
>>>
>>> I believe the scope of "Breeze 2" should be part of the design
>>> discussion, where we will hear other's opinions (especially the first time
>>> or fresh contributors).
>>>
>>> For now, my vision is quite a bit different than yours Ash :). But I do
>>> not want to start a design discussion just yet, I want to make breathing
>>> space for others to chime in.
>>>
>>> I would love to hear many voices and interests of people before we deep
>>> dive into what "Breeze 2" might look like.
>>>
>>> What I am interested in is whether:
>>>
>>> a) it's the right time
>>> b) python is the right choice
>>> c) do I have several people who would like to join and offer both - help
>>> in designing the vision for it, as well as their time to implement it.
>>>
>>> I think it is crucial that those people who will be implementing it,
>>> will be the main people who make design decisions about it, as I would love
>>> to have a strong group of people who would like to not only take part in
>>> developing it but also in maintaining it in the future.
>>>
>>> J.
>>>
>>>
>>> On Wed, Nov 11, 2020 at 11:11 AM Ash Berlin-Taylor <as...@apache.org>
>>> wrote:
>>>
>>>> Omg yes. I have been screaming out for this for months.
>>>>
>>>> $ find scripts -name '*.sh'  | xargs egrep -v '^#' | wc -l
>>>> 6911
>>>>
>>>> That's entirely too much bash for my liking by about an order of
>>>> magnitude ;)
>>>>
>>>> I would also say: make breeze do less. Right now it is three major
>>>> things:
>>>>
>>>> * A local development environment
>>>> * CI runner
>>>> * It's recently grown the ability to run airflow for developing dags.
>>>>
>>>> That is too much. Yes there is overlap, but it's just too much in one
>>>> tool, and too complex as a result. Some of this should just be replaced
>>>> with a docker-compose file (that uses published release images, not
>>>> floating master/nightly) and users told to run that.
>>>>
>>>> Make it simpler, fitting a core purpose - running CI consistently
>>>> should be it's only goal.
>>>>
>>>> -ash
>>>>
>>>> On Nov 11 2020, at 9:58 am, Jarek Potiuk <Ja...@polidea.com>
>>>> wrote:
>>>>
>>>> Hello Everyone,
>>>>
>>>> TL; DR; I was thinking for quite a while on this and I think this is
>>>> the right time to raise that subject. It's been asked several times, why
>>>> Breeze is not written in something else than Bash since it is "that big" or
>>>> some people said "monstrous" :). I think it's the right time to start a
>>>> "rewrite" project with wide community involvement and Python seems to be
>>>> the best choice :).
>>>>
>>>>
>>>> While I was opposing this while we were focusing on Airflow 2.0, and
>>>> there are some good reasons why initially I started Breeze in Bash, I think
>>>> with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based
>>>> on Python 3.6 and with some "stability" and "good set of features" we have
>>>> in Breeze and a good level of modularisation we achieved - it's the right
>>>> time to think about a rewrite.
>>>>
>>>> I did not raise this subject to add a distraction on top of what is
>>>> already a lot of work for 2.0, but I think having Breeze rewritten in
>>>> Python could be the "one more thing" that we could do - as a community to
>>>> make 2.0 experience even better, and one that can make the community even
>>>> closer.
>>>>
>>>> I was thinking that Breeze is perfect to be split into separate smaller
>>>> pieces, describe some assumptions that we will have for its use, and turn
>>>> it into a true community effort where a lot of people will contribute and
>>>> where we will be able to simplify some of the stuff, and - most importantly
>>>> - make more people from the community know about how our CI and development
>>>> environment works and be able to solve any problems there.
>>>>
>>>> Breeze (and underlying bash libraries) are crucial, to get our CI
>>>> working and I am mostly the single point of contact (and failure!) when it
>>>> comes to that - I would love to not be one :) and I think with most of the
>>>> core committers busy with 2.0, this is also an opportunity for more of the
>>>> contributors to take their part in it (and eventually earn their rank to
>>>> become committers!). For the core committers, this is an extra opportunity
>>>> to learn how the system works, influence its design, and possibly simplify
>>>> some parts of it - even if they will be mostly focused on 2.0.
>>>>
>>>> I would like to do it well - write some assumptions in a design doc,
>>>> plan the work and split it into separate issues, and lead the effort - but
>>>> I would love if most of the work is done by others, who would then become
>>>> familiar with the whole of it.
>>>>
>>>> WDYT? Do you think it is a good idea? Do you thin k it is the right
>>>> time? Are there some people in the community who would like to take part in
>>>> it?
>>>>
>>>> J.
>>>>
>>>> --
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>> M: +48 660 796 129 <+48660796129>
>>>> [image: Polidea] <https://www.polidea.com/>
>>>>
>>>>
>>>
>>> --
>>> Jarek Potiuk
>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>> M: +48 660 796 129 <+48660796129>
>>> [image: Polidea] <https://www.polidea.com/>
>>>
>>>
>>
>> --
>> Jarek Potiuk
>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>> M: +48 660 796 129 <+48660796129>
>> [image: Polidea] <https://www.polidea.com/>
>>
>>
>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Rewriting Breeze in Python ?

Posted by Andrew Harmon <an...@gmail.com>.
I would agree: as an end user, I’m not really sure what Breeze does. Is it for CI, or is it a way to quickly spin up a containerized env for local development? I do think it would be great to have something similar to Puckel that uses official Airflow images: very easy to get started with to give Airflow a try, but also a jumping-off point for organizations to customize it to their needs. If this is docker-compose or something else, that’s fine. We use a customized version of Puckel for all the engineers to do local DAG development. It would be great if this was more “official” Airflow. I agree that Python would make it easier for others to contribute. Finally, very clear documentation on the Airflow site would be very helpful too.

Thanks,
Andrew Harmon



Re: Rewriting Breeze in Python ?

Posted by Tomasz Urbaszek <tu...@apache.org>.
+1 for using Python.

> I would also say: make breeze do less. Right now it is three major things:
> * A local development environment
> * CI runner
> * It's recently grown the ability to run airflow for developing dags.

My first thought was similar - Breeze does too much now. However, I think
the problem is not the amount of functionality but the technology used -
bash. Using Python or any other language will let us create a nice and
clear structure for the project that will be easy to onboard to, reason
about and manage.

Structuring Breeze may let us use separate Docker images and docker-compose
files for different purposes (CI, DAG dev, Airflow dev). I like the way in
which Breeze is a "layer over docker" and I think this gives a nice
experience. However, Breeze has grown so big that I'm not even sure I use
half of the functions it has.
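
A minimal sketch of the "layer over docker" idea with one compose file per
purpose. The file names and the compose_command helper are hypothetical and
are not part of Breeze or the Airflow repository; the only real piece is the
`docker-compose -f FILE ...` invocation form.

```python
from typing import List

# Hypothetical mapping of purpose -> compose file. These file names are
# illustrative only; they do not exist in the Airflow repository.
COMPOSE_FILES = {
    "ci": "docker-compose.ci.yml",
    "dag-dev": "docker-compose.dag-dev.yml",
    "airflow-dev": "docker-compose.airflow-dev.yml",
}


def compose_command(purpose: str, *args: str) -> List[str]:
    """Build the docker-compose argument list for one purpose.

    Only builds the command; running it is left to the caller.
    """
    try:
        compose_file = COMPOSE_FILES[purpose]
    except KeyError:
        known = ", ".join(sorted(COMPOSE_FILES))
        raise ValueError(
            f"unknown purpose {purpose!r}; expected one of: {known}"
        ) from None
    return ["docker-compose", "-f", compose_file, *args]
```

A caller could then run, for example,
subprocess.run(compose_command("dag-dev", "up", "-d"), check=True), which
keeps each purpose in its own small file instead of one large script.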

*Note:* where should we continue the discussion? The official place is the
devlist, but we also have a GH issue. Which one should we use to avoid two
separate discussions?

Tomek



Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
I also created an issue for it: https://github.com/apache/airflow/issues/12282

Anyone interested in taking part - please comment there!



Re: Rewriting Breeze in Python ?

Posted by Jarek Potiuk <Ja...@polidea.com>.
You screamed (among many others) and I listened :). And I think the time to
act is now.

I believe the scope of "Breeze 2" should be part of the design discussion,
where we will hear others' opinions (especially those of first-time or fresh
contributors).

For now, my vision is quite a bit different from yours, Ash :). But I do not
want to start a design discussion just yet; I want to leave breathing space
for others to chime in.

I would love to hear many voices and the interests of people before we dive
deep into what "Breeze 2" might look like.

What I am interested in is whether:

a) it's the right time
b) Python is the right choice
c) there are several people who would like to join and offer both help in
designing the vision for it and their time to implement it.

I think it is crucial that the people who will be implementing it will be
the main people who make design decisions about it, as I would love to have
a strong group of people who would like to take part not only in developing
it but also in maintaining it in the future.

J.




Re: Rewriting Breeze in Python ?

Posted by Ash Berlin-Taylor <as...@apache.org>.
Omg yes. I have been screaming out for this for months.

$ find scripts -name '*.sh' | xargs egrep -v '^#' | wc -l
6911

That's entirely too much bash for my liking by about an order of magnitude ;)
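
For anyone who wants to reproduce that kind of count without the shell, here is a rough Python sketch of the same idea: walk a directory, pick up the *.sh files, and count the lines that do not start with '#'. The function name count_non_comment_lines is made up for this illustration, and the result is not guaranteed to match the 6911 above on any given checkout.

```python
from pathlib import Path


def count_non_comment_lines(root: str, pattern: str = "*.sh") -> int:
    """Count lines that do not start with '#' in matching files under root.

    Like the grep pattern '^#', this only skips lines with '#' in the first
    column, so blank lines and indented comments are still counted.
    """
    total = 0
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue  # skip directories that happen to match the pattern
        with path.open(encoding="utf-8", errors="replace") as handle:
            total += sum(1 for line in handle if not line.startswith("#"))
    return total
```

Calling count_non_comment_lines("scripts") from the repository root would be the rough equivalent of the one-liner above.
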
I would also say: make breeze do less. Right now it is three major things:

* A local development environment
* CI runner
* It's recently grown the ability to run airflow for developing dags.

That is too much. Yes, there is overlap, but it's just too much in one tool, and too complex as a result. Some of this should just be replaced with a docker-compose file (that uses published release images, not floating master/nightly) and users told to run that.

Make it simpler, fitting a core purpose - running CI consistently should be its only goal.

-ash

On Nov 11 2020, at 9:58 am, Jarek Potiuk <Ja...@polidea.com> wrote:
> Hello Everyone,
>
> TL; DR; I was thinking for quite a while on this and I think this is the right time to raise that subject. It's been asked several times, why Breeze is not written in something else than Bash since it is "that big" or some people said "monstrous" :). I think it's the right time to start a "rewrite" project with wide community involvement and Python seems to be the best choice :).
>
>
> While I was opposing this while we were focusing on Airflow 2.0, and there are some good reasons why initially I started Breeze in Bash, I think with the current state of Airflow 2.0 betas, with Airflow 2.0 fully based on Python 3.6 and with some "stability" and "good set of features" we have in Breeze and a good level of modularisation we achieved - it's the right time to think about a rewrite.
>
> I did not raise this subject to add a distraction on top of what is already a lot of work for 2.0, but I think having Breeze rewritten in Python could be the "one more thing" that we could do - as a community to make 2.0 experience even better, and one that can make the community even closer.
>
> I was thinking that Breeze is perfect to be split into separate smaller pieces, describe some assumptions that we will have for its use, and turn it into a true community effort where a lot of people will contribute and where we will be able to simplify some of the stuff, and - most importantly - make more people from the community know about how our CI and development environment works and be able to solve any problems there.
>
> Breeze (and underlying bash libraries) are crucial, to get our CI working and I am mostly the single point of contact (and failure!) when it comes to that - I would love to not be one :) and I think with most of the core committers busy with 2.0, this is also an opportunity for more of the contributors to take their part in it (and eventually earn their rank to become committers!). For the core committers, this is an extra opportunity to learn how the system works, influence its design, and possibly simplify some parts of it - even if they will be mostly focused on 2.0.
>
> I would like to do it well - write some assumptions in a design doc, plan the work and split it into separate issues, and lead the effort - but I would love if most of the work is done by others, who would then become familiar with the whole of it.
>
> WDYT? Do you think it is a good idea? Do you think it is the right time? Are there some people in the community who would like to take part in it?
>
> J.
>
> --
>
>
> Jarek Potiuk
> Polidea (https://www.polidea.com/) | Principal Software Engineer
>
>
> M: +48 660 796 129 (tel:+48660796129)