Posted to dev@beam.apache.org by Frances Perry <fj...@google.com.INVALID> on 2016/05/19 05:01:43 UTC

[DISCUSS] Developing new components -- branches, maturity, and committers

Hi Beamers --

I’m thrilled by the recent energy and activity on writing new Beam runners!
But that also means it’s probably time for us to figure out how, as a
community, we want to support this process. ;-)

Back near the beginning, we had a thread [1] discussing that feature
branches are the preferred way of doing development of features or
components that may take a while to reach maturity. I think new components
like runners and SDKs meet the bar to be started from a feature branch.
(Other features, like an IO connector or library of PTransforms, might also
qualify depending on complexity.)

We should also lay out what it takes to be considered mature enough to be
merged into master, since once that happens the component gets released to
users and failing tests become blocking issues. Here are some initial
thoughts to kick off the discussion...

In order to be merged into master, new components / major features should:

   - have at least 2 contributors interested in maintaining it, and 1
     committer interested in supporting it
   - provide both end-user and developer-facing documentation
   - have at least a basic level of unit test coverage
   - run all existing applicable integration tests with other Beam components
     and create additional tests as appropriate


In addition...

A runner should:

   - be able to handle a subset of the model that addresses a significant set
     of use cases (e.g. ‘traditional batch’ or ‘processing time streaming’)
   - update the capability matrix with the current status


An SDK* should:

   - provide the ability to construct graphs with all the basic building
     blocks of the model (ParDo, GroupByKey, Window, Trigger, etc.)
   - begin fleshing out the common composite transforms (Count, Join, etc.)
     and IO connectors (Text, Kafka, etc.)
   - have at least one runner that can execute the complete model (may be a
     direct runner)
   - provide integration tests for executing against current and future
     runners


* A note on DSLs:  I think it’s important to separate out an SDK from a
DSL, because in my mind the former is by definition equivalent to the Beam
model, while the latter may select portions of the model or change the
user-visible abstractions in order to provide a domain-specific experience.
We may want to encourage some DSLs to live separately from Beam because
they may look completely non-Beam-like to their end users. But we can
probably punt this decision until we have concrete examples to discuss.

Another fun part of this growth is that we’ll likely grow new committers.
And given the breadth of Beam, I think it would be useful to annotate our
committers [2] page with which components folks are the most knowledgeable
about.

Looking forward to your thoughts.

[1]
http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E

[2] http://beam.incubator.apache.org/team/

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Amit Sela <am...@gmail.com>.

+1 for Davor's comment on major features being developed in the main
repository.

@Frances: I think the prerequisites you describe for new components are
definitely something to aim for, but I guess the project still has some
maturing to do until we're there. We'll probably have to start by helping
new developers and new features get there (and maybe document a few
example-worthy experiences) - and I'm sure we'll do it well ;)

On Thu, May 19, 2016 at 10:37 PM Aljoscha Krettek <al...@apache.org>
wrote:

> +1 I see that for such things it would make sense.
>
> On Thu, 19 May 2016 at 20:59 Davor Bonaci <da...@google.com.invalid>
> wrote:
>
> > If anybody wants to experiment a little with a feature idea -- absolutely,
> > individual forked repositories are certainly an awesome place for such
> > attempts.
> >
> > However, for something that is a significant undertaking, like a new runner
> > or new SDK, I think feature branches in the main repository make total
> > sense. We'd avoid important disadvantages of lower visibility, harder for
> > others to jump in, comment, learn, etc., harder testing because Apache
> > Jenkins wouldn't be able to do it automatically, etc.
> >
> > In summary, I think there's a spectrum of feature complexities and
> > longevity considerations. As such, I'd support being flexible as
> > appropriate, but have a default answer of starting with a feature branch in
> > the main repository for new major components.
> >
> > On Thu, May 19, 2016 at 3:09 AM, Ismaël Mejía <ie...@gmail.com> wrote:
> >
> > > I agree with Aljoscha, about not putting the feature branches in the main
> > > repo, however how can we make people aware of the new developments?
> > >
> > > -Ismaël
> > >
> > > On Thu, May 19, 2016 at 11:56 AM, Aljoscha Krettek <aljoscha@apache.org>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > When we say feature branch, are we talking about a branch in the main
> > > > repo?
> > > > I would propose that feature branches live in the repos of the committers
> > > > who are working on a feature.
> > > >
> > > > On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net>
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > it looks good to me.
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 05/19/2016 07:01 AM, Frances Perry wrote:
> > > > > > Hi Beamers --
> > > > > >
> > > > > > I’m thrilled by the recent energy and activity on writing new
> Beam
> > > > > runners!
> > > > > > But that also means it’s probably time for us to figure out how,
> > as a
> > > > > > community, we want to support this process. ;-)
> > > > > >
> > > > > > Back near the beginning, we had a thread [1] discussing that
> > feature
> > > > > > branches are the preferred way of doing development of features
> or
> > > > > > components that may take a while to reach maturity. I think new
> > > > > components
> > > > > > like runners and SDKs meet the bar to be started from a feature
> > > branch.
> > > > > > (Other features, like an IO connector or library of PTransforms,
> > > might
> > > > > also
> > > > > > qualify depending on complexity.)
> > > > > >
> > > > > > We should also lay out what it takes to be considered mature
> enough
> > > to
> > > > be
> > > > > > merged into master, since once that happens the component gets
> > > released
> > > > > to
> > > > > > users and failing tests become blocking issues. Here are some
> > initial
> > > > > > thoughts to kick off the discussion...
> > > > > >
> > > > > > In order to be merged into master, new components / major
> features
> > > > > should:
> > > > > >
> > > > > >     -
> > > > > >
> > > > > >     have at least 2 contributors interested in maintaining it,
> and
> > 1
> > > > > >     committer interested in supporting it
> > > > > >     -
> > > > > >
> > > > > >     provide both end-user and developer-facing documentation
> > > > > >     -
> > > > > >
> > > > > >     have at least a basic level of unit test coverage
> > > > > >     -
> > > > > >
> > > > > >     run all existing applicable integration tests with other Beam
> > > > > components
> > > > > >     and create additional tests as appropriate
> > > > > >
> > > > > >
> > > > > > In addition...
> > > > > >
> > > > > > A runner should:
> > > > > >
> > > > > >     -
> > > > > >
> > > > > >     be able to handle a subset of the model that address a
> > > significant
> > > > > set
> > > > > >     of use cases (aka. ‘traditional batch’ or ‘processing time
> > > > > streaming’)
> > > > > >     -
> > > > > >
> > > > > >     update the capability matrix with the current status
> > > > > >
> > > > > >
> > > > > > An SDK* should:
> > > > > >
> > > > > >     -
> > > > > >
> > > > > >     provide the ability to construct graphs with all the basic
> > > building
> > > > > >     blocks of the model (ParDo, GroupByKey, Window, Trigger, etc)
> > > > > >     -
> > > > > >
> > > > > >     begin fleshing out the common composite transforms (Count,
> > Join,
> > > > etc)
> > > > > >     and IO connectors (Text, Kafka, etc)
> > > > > >     -
> > > > > >
> > > > > >     have at least one runner that can execute the complete model
> > (may
> > > > be
> > > > > a
> > > > > >     direct runner)
> > > > > >     -
> > > > > >
> > > > > >     provide integration tests for executing against current and
> > > future
> > > > > >     runners
> > > > > >
> > > > > >
> > > > > > * A note on DSLs:  I think it’s important to separate out an SDK
> > > from a
> > > > > > DSL, because in my mind the former is by definition equivalent to
> > the
> > > > > Beam
> > > > > > model, while the latter may select portions of the model or
> change
> > > the
> > > > > > user-visible abstractions in order to provide a domain-specific
> > > > > experience.
> > > > > > We may want to encourage some DSLs to live separately from Beam
> > > because
> > > > > > they may look completely non-Beam-like to their end users. But we
> > can
> > > > > > probably punt this decision until we have concrete examples to
> > > discuss.
> > > > > >
> > > > > > Another fun part of this growth is that we’ll likely grow new
> > > > committers.
> > > > > > And given the breadth of Beam, I think it would be useful to
> > annotate
> > > > our
> > > > > > committers [2] page with which components folks are the most
> > > > > knowledgeable
> > > > > > about.
> > > > > >
> > > > > > Looking forward to your thoughts.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E
> > > > > >
> > > > > > [2] http://beam.incubator.apache.org/team/
> > > > > >
> > > > >
> > > > > --
> > > > > Jean-Baptiste Onofré
> > > > > jbonofre@apache.org
> > > > > http://blog.nanthrax.net
> > > > > Talend - http://www.talend.com
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Aljoscha Krettek <al...@apache.org>.

+1 I see that for such things it would make sense.

On Thu, 19 May 2016 at 20:59 Davor Bonaci <da...@google.com.invalid> wrote:

> If anybody wants to experiment a little with a feature idea -- absolutely,
> individual forked repositories are certainly an awesome place for such
> attempts.
>
> However, for something that is a significant undertaking, like a new runner
> or new SDK, I think feature branches in the main repository make total
> sense. We'd avoid important disadvantages of lower visibility, harder for
> others to jump in, comment, learn, etc., harder testing because Apache
> Jenkins wouldn't be able to do it automatically, etc.
>
> In summary, I think there's a spectrum of feature complexities and
> longevity considerations. As such, I'd support being flexible as
> appropriate, but have a default answer of starting with a feature branch in
> the main repository for new major components.
>
> On Thu, May 19, 2016 at 3:09 AM, Ismaël Mejía <ie...@gmail.com> wrote:
>
> > I agree with Aljoscha, about not putting the feature branches in the main
> > repo, however how can we make people aware of the new developments?
> >
> > -Ismaël
> >
> > On Thu, May 19, 2016 at 11:56 AM, Aljoscha Krettek <al...@apache.org>
> > wrote:
> >
> > > +1
> > >
> > > When we say feature branch, are we talking about a branch in the main
> > > repo?
> > > I would propose that feature branches live in the repos of the committers
> > > who are working on a feature.
> > >
> > > On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > it looks good to me.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 05/19/2016 07:01 AM, Frances Perry wrote:
> > > > > Hi Beamers --
> > > > >
> > > > > I’m thrilled by the recent energy and activity on writing new Beam
> > > > runners!
> > > > > But that also means it’s probably time for us to figure out how,
> as a
> > > > > community, we want to support this process. ;-)
> > > > >
> > > > > Back near the beginning, we had a thread [1] discussing that
> feature
> > > > > branches are the preferred way of doing development of features or
> > > > > components that may take a while to reach maturity. I think new
> > > > components
> > > > > like runners and SDKs meet the bar to be started from a feature
> > branch.
> > > > > (Other features, like an IO connector or library of PTransforms,
> > might
> > > > also
> > > > > qualify depending on complexity.)
> > > > >
> > > > > We should also lay out what it takes to be considered mature enough
> > to
> > > be
> > > > > merged into master, since once that happens the component gets
> > released
> > > > to
> > > > > users and failing tests become blocking issues. Here are some
> initial
> > > > > thoughts to kick off the discussion...
> > > > >
> > > > > In order to be merged into master, new components / major features
> > > > should:
> > > > >
> > > > >     -
> > > > >
> > > > >     have at least 2 contributors interested in maintaining it, and
> 1
> > > > >     committer interested in supporting it
> > > > >     -
> > > > >
> > > > >     provide both end-user and developer-facing documentation
> > > > >     -
> > > > >
> > > > >     have at least a basic level of unit test coverage
> > > > >     -
> > > > >
> > > > >     run all existing applicable integration tests with other Beam
> > > > components
> > > > >     and create additional tests as appropriate
> > > > >
> > > > >
> > > > > In addition...
> > > > >
> > > > > A runner should:
> > > > >
> > > > >     -
> > > > >
> > > > >     be able to handle a subset of the model that address a
> > significant
> > > > set
> > > > >     of use cases (aka. ‘traditional batch’ or ‘processing time
> > > > streaming’)
> > > > >     -
> > > > >
> > > > >     update the capability matrix with the current status
> > > > >
> > > > >
> > > > > An SDK* should:
> > > > >
> > > > >     -
> > > > >
> > > > >     provide the ability to construct graphs with all the basic
> > building
> > > > >     blocks of the model (ParDo, GroupByKey, Window, Trigger, etc)
> > > > >     -
> > > > >
> > > > >     begin fleshing out the common composite transforms (Count,
> Join,
> > > etc)
> > > > >     and IO connectors (Text, Kafka, etc)
> > > > >     -
> > > > >
> > > > >     have at least one runner that can execute the complete model
> (may
> > > be
> > > > a
> > > > >     direct runner)
> > > > >     -
> > > > >
> > > > >     provide integration tests for executing against current and
> > future
> > > > >     runners
> > > > >
> > > > >
> > > > > * A note on DSLs:  I think it’s important to separate out an SDK
> > from a
> > > > > DSL, because in my mind the former is by definition equivalent to
> the
> > > > Beam
> > > > > model, while the latter may select portions of the model or change
> > the
> > > > > user-visible abstractions in order to provide a domain-specific
> > > > experience.
> > > > > We may want to encourage some DSLs to live separately from Beam
> > because
> > > > > they may look completely non-Beam-like to their end users. But we
> can
> > > > > probably punt this decision until we have concrete examples to
> > discuss.
> > > > >
> > > > > Another fun part of this growth is that we’ll likely grow new
> > > committers.
> > > > > And given the breadth of Beam, I think it would be useful to
> annotate
> > > our
> > > > > committers [2] page with which components folks are the most
> > > > knowledgeable
> > > > > about.
> > > > >
> > > > > Looking forward to your thoughts.
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E
> > > > >
> > > > > [2] http://beam.incubator.apache.org/team/
> > > > >
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbonofre@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > >
> >
>

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Frances Perry <fj...@google.com.INVALID>.

@Amit Fair enough, given we don't currently meet those requirements on
master ;-) But still a good thing to aim for, with lots of help and
encouragement all around!

On Thu, May 19, 2016 at 10:47 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> Fully agree with Davor for feature idea impl.
>
> Regards
> JB
>
>
> On 05/19/2016 08:59 PM, Davor Bonaci wrote:
>
>> If anybody wants to experiment a little with a feature idea -- absolutely,
>> individual forked repositories are certainly an awesome place for such
>> attempts.
>>
>> However, for something that is a significant undertaking, like a new runner
>> or new SDK, I think feature branches in the main repository make total
>> sense. We'd avoid important disadvantages of lower visibility, harder for
>> others to jump in, comment, learn, etc., harder testing because Apache
>> Jenkins wouldn't be able to do it automatically, etc.
>>
>> In summary, I think there's a spectrum of feature complexities and
>> longevity considerations. As such, I'd support being flexible as
>> appropriate, but have a default answer of starting with a feature branch in
>> the main repository for new major components.
>>
>> On Thu, May 19, 2016 at 3:09 AM, Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> I agree with Aljoscha, about not putting the feature branches in the main
>>> repo, however how can we make people aware of the new developments?
>>>
>>> -Ismaël
>>>
>>> On Thu, May 19, 2016 at 11:56 AM, Aljoscha Krettek <al...@apache.org>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> When we say feature branch, are we talking about a branch in the main
>>>> repo?
>>>> I would propose that feature branches live in the repos of the committers
>>>> who are working on a feature.
>>>>
>>>> On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> it looks good to me.
>>>>>
>>>>> Regards
>>>>> JB
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.

Fully agree with Davor for feature idea impl.

Regards
JB

On 05/19/2016 08:59 PM, Davor Bonaci wrote:
> If anybody wants to experiment a little with a feature idea -- absolutely,
> individual forked repositories are certainly an awesome place for such
> attempts.
>
> However, for something that is a significant undertaking, like a new runner
> or new SDK, I think feature branches in the main repository make total
> sense. We'd avoid important disadvantages of lower visibility, harder for
> others to jump in, comment, learn, etc., harder testing because Apache
> Jenkins wouldn't be able to do it automatically, etc.
>
> In summary, I think there's a spectrum of feature complexities and
> longevity considerations. As such, I'd support being flexible as
> appropriate, but have a default answer of starting with a feature branch in
> the main repository for new major components.
>
> On Thu, May 19, 2016 at 3:09 AM, Ismaël Mejía <ie...@gmail.com> wrote:
>
>> I agree with Aljoscha, about not putting the feature branches in the main
>> repo, however how can we make people aware of the new developments?
>>
>> -Ismaël
>>
>> On Thu, May 19, 2016 at 11:56 AM, Aljoscha Krettek <al...@apache.org>
>> wrote:
>>
>>> +1
>>>
>>> When we say feature branch, are we talking about a branch in the main
>>> repo?
>>> I would propose that feature branches live in the repos of the committers
>>> who are working on a feature.
>>>
>>> On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> it looks good to me.
>>>>
>>>> Regards
>>>> JB

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Davor Bonaci <da...@google.com.INVALID>.

If anybody wants to experiment a little with a feature idea -- absolutely,
individual forked repositories are certainly an awesome place for such
attempts.

However, for something that is a significant undertaking, like a new runner
or new SDK, I think feature branches in the main repository make total
sense. We'd avoid important disadvantages of lower visibility, harder for
others to jump in, comment, learn, etc., harder testing because Apache
Jenkins wouldn't be able to do it automatically, etc.

In summary, I think there's a spectrum of feature complexities and
longevity considerations. As such, I'd support being flexible as
appropriate, but have a default answer of starting with a feature branch in
the main repository for new major components.

On Thu, May 19, 2016 at 3:09 AM, Ismaël Mejía <ie...@gmail.com> wrote:

> I agree with Aljoscha, about not putting the feature branches in the main
> repo, however how can we make people aware of the new developments?
>
> -Ismaël
>
> On Thu, May 19, 2016 at 11:56 AM, Aljoscha Krettek <al...@apache.org>
> wrote:
>
> > +1
> >
> > When we say feature branch, are we talking about a branch in the main
> > repo?
> > I would propose that feature branches live in the repos of the committers
> > who are working on a feature.
> >
> > On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net>
> > wrote:
> >
> > > +1
> > >
> > > it looks good to me.
> > >
> > > Regards
> > > JB
> > > >
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbonofre@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Ismaël Mejía <ie...@gmail.com>.

I agree with Aljoscha about not putting the feature branches in the main
repo. However, how can we make people aware of the new developments?

-Ismaël

On Thu, May 19, 2016 at 11:56 AM, Aljoscha Krettek <al...@apache.org>
wrote:

> +1
>
> When we say feature branch, are we talking about a branch in the main repo?
> I would propose that feature branches live in the repos of the committers
> who are working on a feature.
>
> On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net> wrote:
>
> > +1
> >
> > it looks good to me.
> >
> > Regards
> > JB
> >
> > On 05/19/2016 07:01 AM, Frances Perry wrote:
> > > Hi Beamers --
> > >
> > > I’m thrilled by the recent energy and activity on writing new Beam
> > runners!
> > > But that also means it’s probably time for us to figure out how, as a
> > > community, we want to support this process. ;-)
> > >
> > > Back near the beginning, we had a thread [1] discussing that feature
> > > branches are the preferred way of doing development of features or
> > > components that may take a while to reach maturity. I think new
> > components
> > > like runners and SDKs meet the bar to be started from a feature branch.
> > > (Other features, like an IO connector or library of PTransforms, might
> > also
> > > qualify depending on complexity.)
> > >
> > > We should also lay out what it takes to be considered mature enough to
> be
> > > merged into master, since once that happens the component gets released
> > to
> > > users and failing tests become blocking issues. Here are some initial
> > > thoughts to kick off the discussion...
> > >
> > > In order to be merged into master, new components / major features
> > should:
> > >
> > >     -
> > >
> > >     have at least 2 contributors interested in maintaining it, and 1
> > >     committer interested in supporting it
> > >     -
> > >
> > >     provide both end-user and developer-facing documentation
> > >     -
> > >
> > >     have at least a basic level of unit test coverage
> > >     -
> > >
> > >     run all existing applicable integration tests with other Beam
> > components
> > >     and create additional tests as appropriate
> > >
> > >
> > > In addition...
> > >
> > > A runner should:
> > >
> > >     -
> > >
> > >     be able to handle a subset of the model that address a significant
> > set
> > >     of use cases (aka. ‘traditional batch’ or ‘processing time
> > streaming’)
> > >     -
> > >
> > >     update the capability matrix with the current status
> > >
> > >
> > > An SDK* should:
> > >
> > >     -
> > >
> > >     provide the ability to construct graphs with all the basic building
> > >     blocks of the model (ParDo, GroupByKey, Window, Trigger, etc)
> > >     -
> > >
> > >     begin fleshing out the common composite transforms (Count, Join,
> etc)
> > >     and IO connectors (Text, Kafka, etc)
> > >     -
> > >
> > >     have at least one runner that can execute the complete model (may
> be
> > a
> > >     direct runner)
> > >     -
> > >
> > >     provide integration tests for executing against current and future
> > >     runners
> > >
> > >
> > > * A note on DSLs:  I think it’s important to separate out an SDK from a
> > > DSL, because in my mind the former is by definition equivalent to the
> > Beam
> > > model, while the latter may select portions of the model or change the
> > > user-visible abstractions in order to provide a domain-specific
> > experience.
> > > We may want to encourage some DSLs to live separately from Beam because
> > > they may look completely non-Beam-like to their end users. But we can
> > > probably punt this decision until we have concrete examples to discuss.
> > >
> > > Another fun part of this growth is that we’ll likely grow new
> committers.
> > > And given the breadth of Beam, I think it would be useful to annotate
> our
> > > committers [2] page with which components folks are the most
> > knowledgeable
> > > about.
> > >
> > > Looking forward to your thoughts.
> > >
> > > [1]
> > >
> >
> http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E
> > >
> > > [2] http://beam.incubator.apache.org/team/
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Aljoscha Krettek <al...@apache.org>.

+1

When we say feature branch, are we talking about a branch in the main repo?
I would propose that feature branches live in the repos of the committers
who are working on a feature.

On Thu, 19 May 2016 at 11:54 Jean-Baptiste Onofré <jb...@nanthrax.net> wrote:

> +1
>
> it looks good to me.
>
> Regards
> JB
>
> On 05/19/2016 07:01 AM, Frances Perry wrote:
> > Hi Beamers --
> >
> > I’m thrilled by the recent energy and activity on writing new Beam
> runners!
> > But that also means it’s probably time for us to figure out how, as a
> > community, we want to support this process. ;-)
> >
> > Back near the beginning, we had a thread [1] discussing that feature
> > branches are the preferred way of doing development of features or
> > components that may take a while to reach maturity. I think new
> components
> > like runners and SDKs meet the bar to be started from a feature branch.
> > (Other features, like an IO connector or library of PTransforms, might
> also
> > qualify depending on complexity.)
> >
> > We should also lay out what it takes to be considered mature enough to be
> > merged into master, since once that happens the component gets released
> to
> > users and failing tests become blocking issues. Here are some initial
> > thoughts to kick off the discussion...
> >
> > In order to be merged into master, new components / major features
> should:
> >
> >     -
> >
> >     have at least 2 contributors interested in maintaining it, and 1
> >     committer interested in supporting it
> >     -
> >
> >     provide both end-user and developer-facing documentation
> >     -
> >
> >     have at least a basic level of unit test coverage
> >     -
> >
> >     run all existing applicable integration tests with other Beam
> components
> >     and create additional tests as appropriate
> >
> >
> > In addition...
> >
> > A runner should:
> >
> >     -
> >
> >     be able to handle a subset of the model that address a significant
> set
> >     of use cases (aka. ‘traditional batch’ or ‘processing time
> streaming’)
> >     -
> >
> >     update the capability matrix with the current status
> >
> >
> > An SDK* should:
> >
> >     -
> >
> >     provide the ability to construct graphs with all the basic building
> >     blocks of the model (ParDo, GroupByKey, Window, Trigger, etc)
> >     -
> >
> >     begin fleshing out the common composite transforms (Count, Join, etc)
> >     and IO connectors (Text, Kafka, etc)
> >     -
> >
> >     have at least one runner that can execute the complete model (may be
> a
> >     direct runner)
> >     -
> >
> >     provide integration tests for executing against current and future
> >     runners
> >
> >
> > * A note on DSLs:  I think it’s important to separate out an SDK from a
> > DSL, because in my mind the former is by definition equivalent to the
> Beam
> > model, while the latter may select portions of the model or change the
> > user-visible abstractions in order to provide a domain-specific
> experience.
> > We may want to encourage some DSLs to live separately from Beam because
> > they may look completely non-Beam-like to their end users. But we can
> > probably punt this decision until we have concrete examples to discuss.
> >
> > Another fun part of this growth is that we’ll likely grow new committers.
> > And given the breadth of Beam, I think it would be useful to annotate our
> > committers [2] page with which components folks are the most
> knowledgeable
> > about.
> >
> > Looking forward to your thoughts.
> >
> > [1]
> >
> http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E
> >
> > [2] http://beam.incubator.apache.org/team/
> >
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.

+1

it looks good to me.

Regards
JB

On 05/19/2016 07:01 AM, Frances Perry wrote:
> Hi Beamers --
>
> I’m thrilled by the recent energy and activity on writing new Beam runners!
> But that also means it’s probably time for us to figure out how, as a
> community, we want to support this process. ;-)
>
> Back near the beginning, we had a thread [1] discussing that feature
> branches are the preferred way of doing development of features or
> components that may take a while to reach maturity. I think new components
> like runners and SDKs meet the bar to be started from a feature branch.
> (Other features, like an IO connector or library of PTransforms, might also
> qualify depending on complexity.)
>
> We should also lay out what it takes to be considered mature enough to be
> merged into master, since once that happens the component gets released to
> users and failing tests become blocking issues. Here are some initial
> thoughts to kick off the discussion...
>
> In order to be merged into master, new components / major features should:
>
>     -
>
>     have at least 2 contributors interested in maintaining it, and 1
>     committer interested in supporting it
>     -
>
>     provide both end-user and developer-facing documentation
>     -
>
>     have at least a basic level of unit test coverage
>     -
>
>     run all existing applicable integration tests with other Beam components
>     and create additional tests as appropriate
>
>
> In addition...
>
> A runner should:
>
>     -
>
>     be able to handle a subset of the model that address a significant set
>     of use cases (aka. ‘traditional batch’ or ‘processing time streaming’)
>     -
>
>     update the capability matrix with the current status
>
>
> An SDK* should:
>
>     -
>
>     provide the ability to construct graphs with all the basic building
>     blocks of the model (ParDo, GroupByKey, Window, Trigger, etc)
>     -
>
>     begin fleshing out the common composite transforms (Count, Join, etc)
>     and IO connectors (Text, Kafka, etc)
>     -
>
>     have at least one runner that can execute the complete model (may be a
>     direct runner)
>     -
>
>     provide integration tests for executing against current and future
>     runners
>
>
> * A note on DSLs:  I think it’s important to separate out an SDK from a
> DSL, because in my mind the former is by definition equivalent to the Beam
> model, while the latter may select portions of the model or change the
> user-visible abstractions in order to provide a domain-specific experience.
> We may want to encourage some DSLs to live separately from Beam because
> they may look completely non-Beam-like to their end users. But we can
> probably punt this decision until we have concrete examples to discuss.
>
> Another fun part of this growth is that we’ll likely grow new committers.
> And given the breadth of Beam, I think it would be useful to annotate our
> committers [2] page with which components folks are the most knowledgeable
> about.
>
> Looking forward to your thoughts.
>
> [1]
> http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E
>
> [2] http://beam.incubator.apache.org/team/
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: [DISCUSS] Developing new components -- branches, maturity, and committers

Posted by Seetharam Venkatesh <ve...@innerzeal.com>.

+1, this is a step in the right direction.

On Wed, May 18, 2016 at 10:02 PM Frances Perry <fj...@google.com.invalid>
wrote:

> Hi Beamers --
>
> I’m thrilled by the recent energy and activity on writing new Beam runners!
> But that also means it’s probably time for us to figure out how, as a
> community, we want to support this process. ;-)
>
> Back near the beginning, we had a thread [1] discussing that feature
> branches are the preferred way of doing development of features or
> components that may take a while to reach maturity. I think new components
> like runners and SDKs meet the bar to be started from a feature branch.
> (Other features, like an IO connector or library of PTransforms, might also
> qualify depending on complexity.)
>
> We should also lay out what it takes to be considered mature enough to be
> merged into master, since once that happens the component gets released to
> users and failing tests become blocking issues. Here are some initial
> thoughts to kick off the discussion...
>
> In order to be merged into master, new components / major features should:
>
>    -
>
>    have at least 2 contributors interested in maintaining it, and 1
>    committer interested in supporting it
>    -
>
>    provide both end-user and developer-facing documentation
>    -
>
>    have at least a basic level of unit test coverage
>    -
>
>    run all existing applicable integration tests with other Beam components
>    and create additional tests as appropriate
>
>
> In addition...
>
> A runner should:
>
>    -
>
>    be able to handle a subset of the model that address a significant set
>    of use cases (aka. ‘traditional batch’ or ‘processing time streaming’)
>    -
>
>    update the capability matrix with the current status
>
>
> An SDK* should:
>
>    -
>
>    provide the ability to construct graphs with all the basic building
>    blocks of the model (ParDo, GroupByKey, Window, Trigger, etc)
>    -
>
>    begin fleshing out the common composite transforms (Count, Join, etc)
>    and IO connectors (Text, Kafka, etc)
>    -
>
>    have at least one runner that can execute the complete model (may be a
>    direct runner)
>    -
>
>    provide integration tests for executing against current and future
>    runners
>
>
> * A note on DSLs:  I think it’s important to separate out an SDK from a
> DSL, because in my mind the former is by definition equivalent to the Beam
> model, while the latter may select portions of the model or change the
> user-visible abstractions in order to provide a domain-specific experience.
> We may want to encourage some DSLs to live separately from Beam because
> they may look completely non-Beam-like to their end users. But we can
> probably punt this decision until we have concrete examples to discuss.
>
> Another fun part of this growth is that we’ll likely grow new committers.
> And given the breadth of Beam, I think it would be useful to annotate our
> committers [2] page with which components folks are the most knowledgeable
> about.
>
> Looking forward to your thoughts.
>
> [1]
>
> http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201602.mbox/%3CCAAzyFAymVNpjQgZdz2BoMknnE3H9rYRbdnUemamt9Pavw8ugsw%40mail.gmail.com%3E
>
> [2] http://beam.incubator.apache.org/team/
>