Posted to dev@flink.apache.org by Stephan Ewen <se...@apache.org> on 2019/10/15 21:17:07 UTC

[DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Hi Flink folks!

After the positive reaction to the contribution proposal for Stateful
Functions, I would like to kick off the discussion for the big question: In
which form should it go into Flink?

Before jumping into the "repository" question directly, let's get some
clarity on what would be our high-level goal with this project and the
contribution.
My thinking so far was:

  - Stateful Functions is a way for Flink and stream processing to become
applicable for more general application development. That is a chance to
grow our community to a new crowd of developers.

  - While adding this to Flink gives synergies with the runtime it builds on
top of, it makes sense to offer the new developers a lightweight way to get
involved. Simple setup, easy contributions.

  - This is a new project; the API and many designs are not frozen at this
point and may still change heavily.
    To become really good, the project still needs to go through a bunch of
iterations (no pun intended) and change many things quickly.

  - The Stateful Functions project will likely try to release very
frequently in its early days, to improve quickly and gather feedback fast.
Being bound to the Flink core release cycle would hurt here.


I believe that with all those goals, adding Stateful Functions to the Flink
core repository would not make sense. Flink core has processes that make
sense for an established project that needs to guarantee stability. These
processes are simply prohibitive for new projects to develop.
In addition, the Flink main repository is gigantic, and its build system and
CI system can no longer handle the size of the project. Not the best
way to start expanding into a new community.

In some sense, Stateful Functions could make sense as an independent
project, but it is so tightly coupled to Flink right now that I think an
even better fit is a separate repository within the Flink project.
Think Hive and Hadoop in the early days. That way, we get the synergy
between the two (the same community drives them) while letting both move at
their own speed.
It would somehow mean two closely related projects shepherded by the same
community.

It might be possible at a later stage to either merge this into Flink core
(once Stateful Functions is more settled) or even spin this out as a
standalone Apache project, if that is how the community develops.

That is my main motivation. It is not driven primarily by technicalities
like code versioning and dependencies, but much rather by what is the best
setup to develop this as Flink's way to expand its community towards new
users from a different background.

Curious to hear if that makes sense to you.

Best,
Stephan

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Stephan Ewen <se...@apache.org>.

Yes, all code managed by the Flink project will be "org.apache.flink."

On Wed, Oct 16, 2019 at 9:57 AM Jark Wu <im...@gmail.com> wrote:

> I think it makes sense to keep it in a separate repo. It's a good chance to
> test the pros and cons of "splitting flink repository".
>
> Btw, I think we will change the package path from "com.ververica" to
> "org.apache.flink" even if it goes into a separate repo, right?
>
> Best,
> Jark
>
> On Wed, 16 Oct 2019 at 15:15, Aljoscha Krettek <al...@apache.org> wrote:
>
> > I would keep statefun in a separate repo in the beginning, for the reasons
> > you mentioned.
> >
> > Best,
> > Aljoscha
> >
> > > On 15. Oct 2019, at 23:40, Flavio Pompermaier <po...@okkam.it> wrote:
> > >
> > > Definitely on the same page. +1 to keep it in a separate repo (at least
> > > until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Jark Wu <im...@gmail.com>.

I think it makes sense to keep it in a separate repo. It's a good chance to
test the pros and cons of "splitting flink repository".

Btw, I think we will change the package path from "com.ververica" to
"org.apache.flink" even if it goes into a separate repo, right?

Best,
Jark

On Wed, 16 Oct 2019 at 15:15, Aljoscha Krettek <al...@apache.org> wrote:

> I would keep statefun in a separate repo in the beginning, for the reasons
> you mentioned.
>
> Best,
> Aljoscha
>
> > On 15. Oct 2019, at 23:40, Flavio Pompermaier <po...@okkam.it> wrote:
> >
> > Definitely on the same page. +1 to keep it in a separate repo (at least
> > until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Bowen Li <bo...@gmail.com>.

+1 for a separate repo right now, for all the good reasons discussed

On Wed, Nov 6, 2019 at 3:35 PM Becket Qin <be...@gmail.com> wrote:

> +1 on having a separate repository.
>
> I am always an advocate of separate repositories. All the substantial
> benefits of doing that are quite convincing. The only reason we might want
> to put Stateful Functions in the main repo is probably that it looks just
> like CEP, Gelly and other libraries that are for specific use cases. It is
> kind of philosophical. But given that Stateful Functions no longer seems to be
> a "data processing" use case, it also looks reasonable to treat it differently.
> And as others mentioned, we can always put it into the main repo later if we
> want to.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Nov 6, 2019 at 6:25 PM Stephan Ewen <se...@apache.org> wrote:
>
> > Are there still open questions here?
> >
> > Or can I treat this discussion as converged in the sense of concluding
> > that:
> >   - we start initially with a separate repository to allow for individual
> > releases in the early stages
> >   - we later revisit this discussion once the project is a bit further
> > along and more converged
> >
> > Best,
> > Stephan
> >
> >
> > On Wed, Oct 16, 2019 at 3:03 PM Stephan Ewen <se...@apache.org> wrote:
> >
> > > Whether the side project will be overlooked or not will depend a lot on
> > > how we integrate it with the current Flink website and documentation.
> > >
> > > I would think that a separate repository is not necessarily a big problem
> > > there.
> > > It might also help, because a link to that repo shows prominently that
> > > particular angle of the project (application development), rather than it
> > > being an API hidden among 100 modules.
> > >
> > > On Wed, Oct 16, 2019 at 10:02 AM Timo Walther <tw...@apache.org> wrote:
> > >
> > >> Hi Stephan,
> > >>
> > >> +1 for keeping it in a separate repository for fast release cycles and
> > >> stability until it is mature enough. But we should definitely merge it
> > >> back to the core repo also for marketing reasons.
> > >>
> > >> IMHO side projects tend to be overlooked by the outside world even
> > >> though they are great technology.
> > >>
> > >> Would we still document the code in our main documentation or on a
> > >> separate website?
> > >>
> > >> Thanks,
> > >> Timo
> > >>
> > >>
> > >> On 16.10.19 09:15, Aljoscha Krettek wrote:
> > >> > I would keep statefun in a separate repo in the beginning, for the
> > >> > reasons you mentioned.
> > >> >
> > >> > Best,
> > >> > Aljoscha
> > >> >
> > >> >> On 15. Oct 2019, at 23:40, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > >> >>
> > >> >> Definitely on the same page. +1 to keep it in a separate repo (at least
> > >> >> until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Becket Qin <be...@gmail.com>.

+1 on having a separate repository.

I am always an advocate of separate repositories. All the substantial
benefits of doing that are quite convincing. The only reason we might want
to put Stateful Functions in the main repo is probably that it looks just
like CEP, Gelly and other libraries that are for specific use cases. It is
kind of philosophical. But given that Stateful Functions no longer seems to be
a "data processing" use case, it also looks reasonable to treat it differently.
And as others mentioned, we can always put it into the main repo later if we
want to.

Thanks,

Jiangjie (Becket) Qin

On Wed, Nov 6, 2019 at 6:25 PM Stephan Ewen <se...@apache.org> wrote:

> Are there still open questions here?
>
> Or can I treat this discussion as converged in the sense of concluding
> that:
>   - we start initially with a separate repository to allow for individual
> releases in the early stages
>   - we later revisit this discussion once the project is a bit further
> along and more converged
>
> Best,
> Stephan
>
>
> On Wed, Oct 16, 2019 at 3:03 PM Stephan Ewen <se...@apache.org> wrote:
>
> > Whether the side project will be overlooked or not will depend a lot on
> > how we integrate it with the current Flink website and documentation.
> >
> > I would think that a separate repository is not necessarily a big problem
> > there.
> > It might also help, because a link to that repo shows prominently that
> > particular angle of the project (application development), rather than it
> > being an API hidden among 100 modules.
> >
> > On Wed, Oct 16, 2019 at 10:02 AM Timo Walther <tw...@apache.org> wrote:
> >
> >> Hi Stephan,
> >>
> >> +1 for keeping it in a separate repository for fast release cycles and
> >> stability until it is mature enough. But we should definitely merge it
> >> back to the core repo also for marketing reasons.
> >>
> >> IMHO side projects tend to be overlooked by the outside world even
> >> though they are great technology.
> >>
> >> Would we still document the code in our main documentation or on a
> >> separate website?
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 16.10.19 09:15, Aljoscha Krettek wrote:
> >> > I would keep statefun in a separate repo in the beginning, for the
> >> > reasons you mentioned.
> >> >
> >> > Best,
> >> > Aljoscha
> >> >
> >> >> On 15. Oct 2019, at 23:40, Flavio Pompermaier <po...@okkam.it> wrote:
> >> >>
> >> >> Definitely on the same page. +1 to keep it in a separate repo (at least
> >> >> until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Stephan Ewen <se...@apache.org>.

Are there still open questions here?

Or can I treat this discussion as converged in the sense of concluding that:
  - we start initially with a separate repository to allow for individual
releases in the early stages
  - we later revisit this discussion once the project is a bit further
along and more converged

Best,
Stephan


On Wed, Oct 16, 2019 at 3:03 PM Stephan Ewen <se...@apache.org> wrote:

> Whether the side project will be overlooked or not will depend a lot on
> how we integrate it with the current Flink website and documentation.
>
> I would think that a separate repository is not necessarily a big problem
> there.
> It might also help, because a link to that repo shows prominently that
> particular angle of the project (application development), rather than it
> being an API hidden among 100 modules.
>
> On Wed, Oct 16, 2019 at 10:02 AM Timo Walther <tw...@apache.org> wrote:
>
>> Hi Stephan,
>>
>> +1 for keeping it in a separate repository for fast release cycles and
>> stability until it is mature enough. But we should definitely merge it
>> back to the core repo also for marketing reasons.
>>
>> IMHO side projects tend to be overlooked by the outside world even
>> though they are great technology.
>>
>> Would we still document the code in our main documentation or on a
>> separate website?
>>
>> Thanks,
>> Timo
>>
>>
>> On 16.10.19 09:15, Aljoscha Krettek wrote:
>> > I would keep statefun in a separate repo in the beginning, for the
>> > reasons you mentioned.
>> >
>> > Best,
>> > Aljoscha
>> >
>> >> On 15. Oct 2019, at 23:40, Flavio Pompermaier <po...@okkam.it> wrote:
>> >>
>> >> Definitely on the same page. +1 to keep it in a separate repo (at least
>> >> until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Stephan Ewen <se...@apache.org>.

Whether the side project will be overlooked or not will depend a lot on
how we integrate it with the current Flink website and documentation.

I would think that a separate repository is not necessarily a big problem
there.
It might also help, because a link to that repo shows prominently that
particular angle of the project (application development), rather than it
being an API hidden among 100 modules.

On Wed, Oct 16, 2019 at 10:02 AM Timo Walther <tw...@apache.org> wrote:

> Hi Stephan,
>
> +1 for keeping it in a separate repository for fast release cycles and
> stability until it is mature enough. But we should definitely merge it
> back to the core repo also for marketing reasons.
>
> IMHO side projects tend to be overlooked by the outside world even
> though they are great technology.
>
> Would we still document the code in our main documentation or on a
> separate website?
>
> Thanks,
> Timo
>
>
> On 16.10.19 09:15, Aljoscha Krettek wrote:
> > I would keep statefun in a separate repo in the beginning, for the
> > reasons you mentioned.
> >
> > Best,
> > Aljoscha
> >
> >> On 15. Oct 2019, at 23:40, Flavio Pompermaier <po...@okkam.it> wrote:
> >>
> >> Definitely on the same page. +1 to keep it in a separate repo (at least
> >> until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Kostas Kloudas <kk...@gmail.com>.

Hi all,

Although in the initial thread I said that, in general, I would prefer
having one repository, I understand the arguments presented here and
I think it makes sense for such a young project to have its own
repository.

So +1 from my side, with an asterisk: I hope that the project will
eventually be merged into the main Flink repo.

For the website, I agree with Till, i.e. a separate website but with a
prominent link from the main Flink docs.

Cheers,
Kostas

On Wed, Oct 16, 2019 at 11:07 AM Till Rohrmann <tr...@apache.org> wrote:
>
> I think it makes sense to keep the stateful functions code in a separate
> repository in the beginning as described. At a later point in time we could
> revisit this topic if we see that the split codebase becomes a problem or
> if there are other benefits such as better visibility.
>
> For the website, we could keep them separate but put a prominent link from
> the Flink documentation to the stateful functions documentation.
>
> Cheers,
> Till

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Till Rohrmann <tr...@apache.org>.

I think it makes sense to keep the stateful functions code in a separate
repository in the beginning as described. At a later point in time we could
revisit this topic if we see that the split codebase becomes a problem or
if there are other benefits such as better visibility.

For the website, we could keep them separate but put a prominent link from
the Flink documentation to the stateful functions documentation.

Cheers,
Till

On Wed, Oct 16, 2019 at 10:02 AM Timo Walther <tw...@apache.org> wrote:

> Hi Stephan,
>
> +1 for keeping it in a separate repository for fast release cycles and
> stability until it is mature enough. But we should definitely merge it
> back to the core repo also for marketing reasons.
>
> IMHO side projects tend to be overlooked by the outside world even
> though they are great technology.
>
> Would we still document the code in our main documentation or on a
> separate website?
>
> Thanks,
> Timo

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Robert Metzger <rm...@apache.org>.

+1 for separate repositories.
This is also good for the community to collect some experience for a
potential repository split effort at some later point.

On Wed, Oct 16, 2019 at 12:01 PM vino yang <ya...@gmail.com> wrote:

> Hi,
>
> Fast release cycles seem like a good argument for keeping it in a
> separate repository.
>
> IMO, the placement of the documentation should be consistent with the
> repository.
>
> Best,
> Vino

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by vino yang <ya...@gmail.com>.

Hi,

Fast release cycles seem like a good argument for keeping it in a
separate repository.

IMO, the placement of the documentation should be consistent with the
repository.

Best,
Vino

On Wed, Oct 16, 2019 at 4:02 PM Timo Walther <tw...@apache.org> wrote:

> Hi Stephan,
>
> +1 for keeping it in a separate repository for fast release cycles and
> stability until it is mature enough. But we should definitely merge it
> back to the core repo also for marketing reasons.
>
> IMHO side projects tend to be overlooked by the outside world even
> though they are great technology.
>
> Would we still document the code in our main documentation or on a
> separate website?
>
> Thanks,
> Timo

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Timo Walther <tw...@apache.org>.

Hi Stephan,

+1 for keeping it in a separate repository for fast release cycles and
stability until it is mature enough. But we should definitely merge it
into the core repo eventually, also for marketing reasons.

IMHO side projects tend to be overlooked by the outside world even 
though they are great technology.

Would we still document the code in our main documentation or on a 
separate website?

Thanks,
Timo


On 16.10.19 09:15, Aljoscha Krettek wrote:
> I would keep statefun in a separate repo in the beginning, for the reasons you mentioned.
>
> Best,
> Aljoscha

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Aljoscha Krettek <al...@apache.org>.

I would keep statefun in a separate repo in the beginning, for the reasons you mentioned.

Best,
Aljoscha

> On 15. Oct 2019, at 23:40, Flavio Pompermaier <po...@okkam.it> wrote:
> 
> Definitely on the same page. +1 to keep it in a separate repo (at least
> until the code becomes "stable" and widely adopted by the community)

Re: [DISCUSS] Stateful Functions - in which form to contribute? (same or different repository)

Posted by Flavio Pompermaier <po...@okkam.it>.

Definitely on the same page. +1 to keep it in a separate repo (at least
until the code becomes "stable" and widely adopted by the community)

On Tue, 15 Oct 2019, 23:17 Stephan Ewen <se...@apache.org> wrote:

> Hi Flink folks!
>
> After the positive reaction to the contribution proposal for Stateful
> Functions, I would like to kick off the discussion for the big question: In
> which form should it go into Flink?
>
> Before jumping into the "repository" question directly, let's get some
> clarity on what would be our high-level goal with this project and the
> contribution.
> My thinking so far was:
>
>   - Stateful Functions is a way for Flink and stream processing to become
> applicable for more general application development. That is a chance to
> grow our community to a new crowd of developers.
>
>   - While adding this to Flink gives synergies with the runtime it builds on
> top of, it makes sense to offer the new developers a lightweight way to get
> involved. Simple setup, easy contributions.
>
>   - This is a new project, the API and many designs are not frozen at this
> point and may still change heavily.
>     To become really good, the project needs to still make a bunch of
> iterations (no pun intended) and change many things quickly.
>
>   - The Stateful Functions project will likely try to release very
> frequently in its early days, to improve quickly and gather feedback fast.
> Being bound to the Flink core release cycle would hurt here.
>
>
> I believe that with all those goals, adding Stateful Functions to the Flink
> core repository would not make sense. Flink core has processes that make
> sense for an established project that needs to guarantee stability. These
> processes are simply prohibitive for new projects to develop.
> In addition, the Flink main repository is gigantic, has a build system and
> CI system that cannot handle the size of the project any more. Not the best
> way to start expanding into a new community.
>
> In some sense, Stateful Functions could make sense as an independent
> project, but it is so tightly coupled to Flink right now that I think an
> even better fit is a separate repository in Flink.
> Think Hive and Hadoop in the early days. That way, we get the synergy
> between the two (the same community drives them) while letting both move at
> their own speed.
> It would somehow mean two closely related projects shepherded by the same
> community.
>
> It might be possible at a later stage to either merge this into Flink core
> (once Stateful Functions is more settled) or even spin this out as a
> standalone Apache project, if that is how the community develops.
>
> That is my main motivation. It is not driven primarily by technicalities
> like code versioning and dependencies, but much rather by what is the best
> setup to develop this as Flink's way to expand its community towards new
> users from a different background.
>
> Curious to hear if that makes sense to you.
>
> Best,
> Stephan
>