Posted to dev@beam.apache.org by Xinyu Liu <xi...@gmail.com> on 2018/06/18 18:57:14 UTC

[PROPOSAL] Merge samza-runner to master

Hi, Folks,

On behalf of the Samza team, I would like to propose merging the
samza-runner branch into master. The branch was created in January when we
first introduced the Samza Runner [1], and we've been adding features and
refining it since. The runner now satisfies the criteria outlined in
[2], and merging it to master will give it more visibility among other
contributors and users.

1. Have at least 2 contributors interested in maintaining it, and 1
committer interested in supporting it: *Both Chris and I have been making
contributions and I am going to sign up for the support. There are more
folks in the Samza team interested in contributing to it. Thanks Kenn for
all the help and reviews for the runner!*
2. Provide both end-user and developer-facing documentation: *The PR for
the samza-runner doc has a runner user guide, a capability matrix, and a
tutorial using WordCount examples.*
3. Have at least a basic level of unit test coverage: *Unit tests are here
[3].*
4. Run all existing applicable integration tests with other Beam components
and create additional tests as appropriate: *Enabled ValidatesRunner tests.*
5. Be able to handle a subset of the model that addresses a significant set
of use cases, such as ‘traditional batch’ or ‘processing time streaming’:
*We have test Beam jobs running in Yarn using event-time processing of
Kafka streams.*
6. Update the capability matrix with the current status. *Same as #2.*
7. Add a webpage under documentation/runners. *Same as #2.*

The PR for the samza-runner merge: https://github.com/apache/beam/pull/5668
The PR for the samza-runner doc: https://github.com/apache/beam-site/pull/471

Thanks,
Xinyu

[1] https://issues.apache.org/jira/browse/BEAM-3079
[2] https://beam.apache.org/contribute/
[3] https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test

Re: [PROPOSAL] Merge samza-runner to master

Posted by Xinyu Liu <xi...@gmail.com>.
Thanks a lot, Kenn! Finally we linked our runner in. I will work on the
rest of the stuff as you mentioned. Thanks again for everyone's comments,
too.

Thanks,
Xinyu

On Mon, Jun 25, 2018 at 2:48 PM, Kenneth Knowles <kl...@google.com> wrote:

> This is done. Now we need to make sure the build is running, healthy, in
> the PR template.

Re: [PROPOSAL] Merge samza-runner to master

Posted by Kenneth Knowles <kl...@google.com>.
This is done. Now we need to make sure the build is running, healthy, in
the PR template.

On Mon, Jun 25, 2018 at 9:10 AM Kenneth Knowles <kl...@google.com> wrote:

> I'll do it. I'm working with Xinyu on the PR.
>
> Kenn

Re: [PROPOSAL] Merge samza-runner to master

Posted by Kenneth Knowles <kl...@google.com>.
I'll do it. I'm working with Xinyu on the PR.

Kenn

On Mon, Jun 25, 2018, 08:24 Ismaël Mejía <ie...@gmail.com> wrote:

> +1
>
> It is important to have new runners merged (even if not 100% complete) so
> they benefit from the ongoing fixes, and so that they can easily (and
> incrementally) start to track the new portability features as they develop.
>
> What is next then? Who triggers the green button so this happens?

Re: [PROPOSAL] Merge samza-runner to master

Posted by Ismaël Mejía <ie...@gmail.com>.
+1

It is important to have new runners merged (even if not 100% complete) so
they benefit from the ongoing fixes, and so that they can easily (and
incrementally) start to track the new portability features as they develop.

What is next then? Who triggers the green button so this happens?




On Sat, Jun 23, 2018 at 6:43 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> +1
>
> As the build is fine, it makes sense to merge pretty fast.
>
> Thanks,
> Regards
> JB

Re: [PROPOSAL] Merge samza-runner to master

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
+1

As the build is fine, it makes sense to merge pretty fast.

Thanks,
Regards
JB

On 22/06/2018 00:14, Xinyu Liu wrote:
> I updated the merge PR with the gradle integration (there was some
> Jenkins Java tests failure with google cloud quota issues. It seems not
> related to this patch). Please feel free to ping me if anything else is
> needed.
> 
> Thanks,
> Xinyu
> 
> On Mon, Jun 18, 2018 at 5:44 PM, Xinyu Liu <xinyuliu.us@gmail.com
> <ma...@gmail.com>> wrote:
> 
>     @Kenn: I am going to add the build.gradle. Is there anything else?
> 
>     @Ahmet, @Robert: here are more details about the samza runner right now:
> 
>     - Missing pieces: timer support in ParDo is not there yet and I plan
>     to add it soon. SplittableParDo is missing but we don't have a use
>     case so far. We are on par with the other runners for the rest of
>     the Java features.
>     - Work in Progress: implement the portable pipeline runner logic.
>     - Future plans: support Python is our next goal. Hopefully we will
>     get a prototype working sometime next quarter :).
> 
>     Btw, thanks everyone for the comments!
> 
>     Thanks,
>     Xinyu
> 
>     On Mon, Jun 18, 2018 at 4:59 PM, Robert Burke <robert@frantil.com
>     <ma...@frantil.com>> wrote:
> 
>         This is exciting! Is it implemented as a portability framework
>         runner too?
> 
> 
>         On Mon, Jun 18, 2018, 4:36 PM Pablo Estrada <pabloem@google.com
>         <ma...@google.com>> wrote:
> 
>             It's very exciting to see a new runner making it into
>             master. : )
> 
>             Best
>             -P.
> 
>             On Mon, Jun 18, 2018 at 3:38 PM Rafael Fernandez
>             <rfernand@google.com <ma...@google.com>> wrote:
> 
>                 I've just read this and wanted to share my excitement :D 
> 
> 
> 
>                 On Mon, Jun 18, 2018 at 3:10 PM Kenneth Knowles
>                 <klk@google.com <ma...@google.com>> wrote:
> 
>                     One thing that will be necessary is porting the
>                     build to Gradle.
> 
>                     Kenn
> 
>                     On Mon, Jun 18, 2018 at 11:57 AM Xinyu Liu
>                     <xinyuliu.us@gmail.com> wrote:
> 
>                         Hi, Folks,
> 
>                         On behalf of the Samza team, I would like to
>                         propose to merge the samza-runner branch into
>                         master. The branch was created in January when we
>                         first introduced the Samza Runner [1], and we've
>                         been adding features and refining it afterwards.
>                         Now the runner satisfies the criteria outlined
>                         in [2], and merging it to master will give more
>                         visibility to other contributors and users.
> 
>                         1. Have at least 2 contributors interested in
>                         maintaining it, and 1 committer interested in
>                         supporting it: *Both Chris and I have been
>                         making contributions and I am going to sign up
>                         for the support. There are more folks in the
>                         Samza team interested in contributing to it.
>                         Thanks Kenn for all the help and reviews for the
>                         runner!*
>                         2. Provide both end-user and developer-facing
>                         documentation: *The PR for the samza-runner doc
>                         has runner user guide, capability matrix, and
>                         tutorial using WordCount examples.*
>                         3. Have at least a basic level of unit test
>                         coverage: *Unit tests are here [3].*
>                         4. Run all existing applicable integration tests
>                         with other Beam components and create additional
>                         tests as appropriate: *Enabled ValidatesRunner
>                         tests.*
>                         5. Be able to handle a subset of the model that
>                         addresses a significant set of use cases, such
>                         as ‘traditional batch’ or ‘processing time
>                         streaming’: *We have test Beam jobs running in
>                         Yarn using event-time processing of Kafka streams.*
>                         6. Update the capability matrix with the current
>                         status. *Same as #2.*
>                         7. Add a webpage under documentation/runners.
>                         *Same as #2.*
> 
>                         The PR for the samza-runner
>                         merge: https://github.com/apache/beam/pull/5668
>                         The PR for the samza-runner
>                         doc: https://github.com/apache/beam-site/pull/471
> 
>                         Thanks,
>                         Xinyu
> 
>                         [1] https://issues.apache.org/jira/browse/BEAM-3079
>                         [2] https://beam.apache.org/contribute/
>                         [3] https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test
> 
>             -- 
>             Got feedback? go/pabloem-feedback
> 
> 
> 

Re: [PROPOSAL] Merge samza-runner to master

Posted by Lukasz Cwik <lc...@google.com>.
+1 for merging Samza.
Portability is making great progress and it will be great to see Samza join
in the effort to be able to run non-Java pipelines.

On Fri, Jun 22, 2018 at 1:53 PM Robert Bradshaw <ro...@google.com> wrote:

> Thanks for the clarification on contributors, that makes me much more
> comfortable. I agree that portability isn't ready enough to require
> it, and am encouraged by the plans to focus on this next quarter.
>
> These are my only reservations, and I see a lot of benefits in making
> Samza an official runner. I am definitely in favor of merging it in.
>
> On Fri, Jun 22, 2018 at 9:55 AM Xinyu Liu <xi...@gmail.com> wrote:
> >
> > A little clarification on the contributors: Chris Pettitt and I are the
> main contributors so far. Chris wrote the initial prototype but his commits
> got squashed into the giant initial commit, and he's been reviewing all
> incremental changes afterwards. Two more team members (Boris Shkolnik and
> Hai Lu) are starting to work on it. In the next quarter, our focus is
> portability, particularly Python. I will keep you guys updated with our
> status and plan, and maybe more questions and ideas down the road :).
> >
> > Thanks,
> > Xinyu
> >
> > On Fri, Jun 22, 2018 at 7:23 AM, Rafael Fernandez <rf...@google.com>
> wrote:
> >>
> >> I think it's great to go ahead and merge it, so it can continue
> evolving. As with all things, it'll adopt new stuff as it becomes ready (in
> fact, it may even prove to be a great example of how to port an existing
> "legacy" runner to the portability stuff when ready).
> >>
> >> It seems the immediate blocker (gradle) was addressed, and there is
> great future work planned. Exciting!
> >>
> >> On Thu, Jun 21, 2018 at 8:00 PM Kenneth Knowles <kl...@google.com> wrote:
> >>>
> >>> *Contributors*
> >>> Agree with Robert's concern. But this is a nice opportunity for Beam
> to connect. It is a different sort of backend and a different sort of
> community that we are linking in.
> >>>
> >>> Consider the Gearpump and Apex runners: both had resumes that met the
> requirements, but might not today. But they haven't been a burden. I have
> some hope the Samza runner might have a better chance recruiting users and
> contributors, since the value add for Samza users is unique among Beam
> runners, and likewise the Samza community is unique among backend
> communities.
> >>>
> >>> *Portability*
> >>> My take is that we shouldn't _start_ any runner down the legacy path.
> But this runner predates portability. I don't think the Java SDK is
> ready to provide feature parity, much less adequate performance, so it
> doesn't seem reasonable to require using it. Community > code as well.
> >>>
> >>> Kenn
> >>>
> >>> On Thu, Jun 21, 2018 at 3:34 PM Robert Bradshaw <ro...@google.com>
> wrote:
> >>>>
> >>>> Neat to see a new runner on board!
> >>>>
> >>>> I would like to make it a requirement for all new runners to support
> >>>> the portability API, but given that it's still somewhat of a moving
> >>>> target, and you have ongoing work in this direction, that may not be a
> >>>> hard requirement.
> >>>>
> >>>> I'm a bit concerned that there are only two contributors (by the
> >>>> git logs): you and Kenn. But you do indicate there are others
> >>>> interested in working on this.
> >>>>
> >>>> Other than that, this looks great.
> >>>>
> >>>> - Robert
> >>>>
> >>>>
> >>>> On Thu, Jun 21, 2018 at 3:14 PM Xinyu Liu <xi...@gmail.com>
> wrote:
> >>>> >
> >>>> > I updated the merge PR with the Gradle integration. (There were some
> Jenkins Java test failures due to Google Cloud quota issues, which seem
> unrelated to this patch.) Please feel free to ping me if anything else is
> needed.
> >>>> >
> >>>> > Thanks,
> >>>> > Xinyu
> >>>> >
> >>>> > On Mon, Jun 18, 2018 at 5:44 PM, Xinyu Liu <xi...@gmail.com>
> wrote:
> >>>> >>
> >>>> >> @Kenn: I am going to add the build.gradle. Is there anything else?
> >>>> >>
> >>>> >> @Ahmet, @Robert: here are more details about the samza runner
> right now:
> >>>> >>
> >>>> >> - Missing pieces: timer support in ParDo is not there yet and I
> plan to add it soon. SplittableParDo is missing but we don't have a use
> case so far. We are on par with the other runners for the rest of the Java
> features.
> >>>> >> - Work in Progress: implement the portable pipeline runner logic.
> >>>> >> - Future plans: Python support is our next goal. Hopefully we will
> get a prototype working sometime next quarter :).
> >>>> >>
> >>>> >> Btw, thanks everyone for the comments!
> >>>> >>
> >>>> >> Thanks,
> >>>> >> Xinyu
> >>>> >>
> >>>> >> On Mon, Jun 18, 2018 at 4:59 PM, Robert Burke <ro...@frantil.com>
> wrote:
> >>>> >>>
> >>>> >>> This is exciting! Is it implemented as a portability framework
> runner too?
> >>>> >>>
> >>>> >>>
> >>>> >>> On Mon, Jun 18, 2018, 4:36 PM Pablo Estrada <pa...@google.com>
> wrote:
> >>>> >>>>
> >>>> >>>> It's very exciting to see a new runner making it into master. : )
> >>>> >>>>
> >>>> >>>> Best
> >>>> >>>> -P.
> >>>> >>>>
> >>>> >>>> On Mon, Jun 18, 2018 at 3:38 PM Rafael Fernandez <
> rfernand@google.com> wrote:
> >>>> >>>>>
> >>>> >>>>> I've just read this and wanted to share my excitement :D
> >>>> >>>>>
> >>>> >>>>>
> >>>> >>>>>
> >>>> >>>>> On Mon, Jun 18, 2018 at 3:10 PM Kenneth Knowles <kl...@google.com>
> wrote:
> >>>> >>>>>>
> >>>> >>>>>> One thing that will be necessary is porting the build to
> Gradle.
> >>>> >>>>>>
> >>>> >>>>>> Kenn
> >>>> >>>>>>
> >>>> >>>>>> On Mon, Jun 18, 2018 at 11:57 AM Xinyu Liu <
> xinyuliu.us@gmail.com> wrote:
> >>>> >>>>>>>
> >>>> >>>>>>> Hi, Folks,
> >>>> >>>>>>>
> >>>> >>>>>>> On behalf of the Samza team, I would like to propose to merge
> the samza-runner branch into master. The branch was created in January when we
> first introduced the Samza Runner [1], and we've been adding features and
> refining it afterwards. Now the runner satisfies the criteria outlined in
> [2], and merging it to master will give more visibility to other
> contributors and users.
> >>>> >>>>>>>
> >>>> >>>>>>> 1. Have at least 2 contributors interested in maintaining it,
> and 1 committer interested in supporting it: *Both Chris and I have been
> making contributions and I am going to sign up for the support. There are
> more folks in the Samza team interested in contributing to it. Thanks Kenn
> for all the help and reviews for the runner!*
> >>>> >>>>>>> 2. Provide both end-user and developer-facing documentation:
> *The PR for the samza-runner doc has runner user guide, capability matrix,
> and tutorial using WordCount examples.*
> >>>> >>>>>>> 3. Have at least a basic level of unit test coverage: *Unit
> tests are here [3].*
> >>>> >>>>>>> 4. Run all existing applicable integration tests with other
> Beam components and create additional tests as appropriate: *Enabled
> ValidatesRunner tests.*
> >>>> >>>>>>> 5. Be able to handle a subset of the model that addresses a
> significant set of use cases, such as ‘traditional batch’ or ‘processing
> time streaming’: *We have test Beam jobs running in Yarn using event-time
> processing of Kafka streams.*
> >>>> >>>>>>> 6. Update the capability matrix with the current status.
> *Same as #2.*
> >>>> >>>>>>> 7. Add a webpage under documentation/runners. *Same as #2.*
> >>>> >>>>>>>
> >>>> >>>>>>> The PR for the samza-runner merge:
> https://github.com/apache/beam/pull/5668
> >>>> >>>>>>> The PR for the samza-runner doc:
> https://github.com/apache/beam-site/pull/471
> >>>> >>>>>>>
> >>>> >>>>>>> Thanks,
> >>>> >>>>>>> Xinyu
> >>>> >>>>>>>
> >>>> >>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-3079
> >>>> >>>>>>> [2] https://beam.apache.org/contribute/
> >>>> >>>>>>> [3]
> https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test
> >>>> >>>>
> >>>> >>>> --
> >>>> >>>> Got feedback? go/pabloem-feedback
> <https://goto.google.com/pabloem-feedback>
> >>>> >>
> >>>> >>
> >>>> >
> >
> >
>

Re: [PROPOSAL] Merge samza-runner to master

Posted by Robert Bradshaw <ro...@google.com>.
Thanks for the clarification on contributors, that makes me much more
comfortable. I agree that portability isn't ready enough to require
it, and am encouraged by the plans to focus on this next quarter.

These are my only reservations, and I see a lot of benefits in making
Samza an official runner. I am definitely in favor of merging it in.

On Fri, Jun 22, 2018 at 9:55 AM Xinyu Liu <xi...@gmail.com> wrote:
>
> A little clarification on the contributors: Chris Pettitt and I are the main contributors so far. Chris wrote the initial prototype but his commits got squashed into the giant initial commit, and he's been reviewing all incremental changes afterwards. Two more team members (Boris Shkolnik and Hai Lu) are starting to work on it. In the next quarter, our focus is portability, particularly Python. I will keep you guys updated with our status and plan, and maybe more questions and ideas down the road :).
>
> Thanks,
> Xinyu
>
> On Fri, Jun 22, 2018 at 7:23 AM, Rafael Fernandez <rf...@google.com> wrote:
>>
>> I think it's great to go ahead and merge it, so it can continue evolving. As with all things, it'll adopt new stuff as it becomes ready (in fact, it may even prove to be a great example of how to port an existing "legacy" runner to the portability stuff when ready).
>>
>> It seems the immediate blocker (gradle) was addressed, and there is great future work planned. Exciting!
>>
>> On Thu, Jun 21, 2018 at 8:00 PM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>> *Contributors*
>>> Agree with Robert's concern. But this is a nice opportunity for Beam to connect. It is a different sort of backend and a different sort of community that we are linking in.
>>>
>>> Consider the Gearpump and Apex runners: both had resumes that met the requirements, but might not today. But they haven't been a burden. I have some hope the Samza runner might have a better chance recruiting users and contributors, since the value add for Samza users is unique among Beam runners, and likewise the Samza community is unique among backend communities.
>>>
>>> *Portability*
>>> My take is that we shouldn't _start_ any runner down the legacy path. But this runner predates portability. I don't think the Java SDK is ready to provide feature parity, much less adequate performance, so it doesn't seem reasonable to require using it. Community > code as well.
>>>
>>> Kenn
>>>
>>> On Thu, Jun 21, 2018 at 3:34 PM Robert Bradshaw <ro...@google.com> wrote:
>>>>
>>>> Neat to see a new runner on board!
>>>>
>>>> I would like to make it a requirement for all new runners to support
>>>> the portability API, but given that it's still somewhat of a moving
>>>> target, and you have ongoing work in this direction, that may not be a
>>>> hard requirement.
>>>>
>>>> I'm a bit concerned that there are only two contributors (by the
>>>> git logs): you and Kenn. But you do indicate there are others
>>>> interested in working on this.
>>>>
>>>> Other than that, this looks great.
>>>>
>>>> - Robert
>>>>
>>>>
>>>> On Thu, Jun 21, 2018 at 3:14 PM Xinyu Liu <xi...@gmail.com> wrote:
>>>> >
>>>> > I updated the merge PR with the Gradle integration. (There were some Jenkins Java test failures due to Google Cloud quota issues, which seem unrelated to this patch.) Please feel free to ping me if anything else is needed.
>>>> >
>>>> > Thanks,
>>>> > Xinyu
>>>> >
>>>> > On Mon, Jun 18, 2018 at 5:44 PM, Xinyu Liu <xi...@gmail.com> wrote:
>>>> >>
>>>> >> @Kenn: I am going to add the build.gradle. Is there anything else?
>>>> >>
>>>> >> @Ahmet, @Robert: here are more details about the samza runner right now:
>>>> >>
>>>> >> - Missing pieces: timer support in ParDo is not there yet and I plan to add it soon. SplittableParDo is missing but we don't have a use case so far. We are on par with the other runners for the rest of the Java features.
>>>> >> - Work in Progress: implement the portable pipeline runner logic.
>>>> >> - Future plans: Python support is our next goal. Hopefully we will get a prototype working sometime next quarter :).
>>>> >>
>>>> >> Btw, thanks everyone for the comments!
>>>> >>
>>>> >> Thanks,
>>>> >> Xinyu
>>>> >>
>>>> >> On Mon, Jun 18, 2018 at 4:59 PM, Robert Burke <ro...@frantil.com> wrote:
>>>> >>>
>>>> >>> This is exciting! Is it implemented as a portability framework runner too?
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Jun 18, 2018, 4:36 PM Pablo Estrada <pa...@google.com> wrote:
>>>> >>>>
>>>> >>>> It's very exciting to see a new runner making it into master. : )
>>>> >>>>
>>>> >>>> Best
>>>> >>>> -P.
>>>> >>>>
>>>> >>>> On Mon, Jun 18, 2018 at 3:38 PM Rafael Fernandez <rf...@google.com> wrote:
>>>> >>>>>
>>>> >>>>> I've just read this and wanted to share my excitement :D
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> On Mon, Jun 18, 2018 at 3:10 PM Kenneth Knowles <kl...@google.com> wrote:
>>>> >>>>>>
>>>> >>>>>> One thing that will be necessary is porting the build to Gradle.
>>>> >>>>>>
>>>> >>>>>> Kenn
>>>> >>>>>>
>>>> >>>>>> On Mon, Jun 18, 2018 at 11:57 AM Xinyu Liu <xi...@gmail.com> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Hi, Folks,
>>>> >>>>>>>
>>>> >>>>>>> On behalf of the Samza team, I would like to propose to merge the samza-runner branch into master. The branch was created in January when we first introduced the Samza Runner [1], and we've been adding features and refining it afterwards. Now the runner satisfies the criteria outlined in [2], and merging it to master will give more visibility to other contributors and users.
>>>> >>>>>>>
>>>> >>>>>>> 1. Have at least 2 contributors interested in maintaining it, and 1 committer interested in supporting it: *Both Chris and I have been making contributions and I am going to sign up for the support. There are more folks in the Samza team interested in contributing to it. Thanks Kenn for all the help and reviews for the runner!*
>>>> >>>>>>> 2. Provide both end-user and developer-facing documentation: *The PR for the samza-runner doc has runner user guide, capability matrix, and tutorial using WordCount examples.*
>>>> >>>>>>> 3. Have at least a basic level of unit test coverage: *Unit tests are here [3].*
>>>> >>>>>>> 4. Run all existing applicable integration tests with other Beam components and create additional tests as appropriate: *Enabled ValidatesRunner tests.*
>>>> >>>>>>> 5. Be able to handle a subset of the model that addresses a significant set of use cases, such as ‘traditional batch’ or ‘processing time streaming’: *We have test Beam jobs running in Yarn using event-time processing of Kafka streams.*
>>>> >>>>>>> 6. Update the capability matrix with the current status. *Same as #2.*
>>>> >>>>>>> 7. Add a webpage under documentation/runners. *Same as #2.*
>>>> >>>>>>>
>>>> >>>>>>> The PR for the samza-runner merge: https://github.com/apache/beam/pull/5668
>>>> >>>>>>> The PR for the samza-runner doc: https://github.com/apache/beam-site/pull/471
>>>> >>>>>>>
>>>> >>>>>>> Thanks,
>>>> >>>>>>> Xinyu
>>>> >>>>>>>
>>>> >>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-3079
>>>> >>>>>>> [2] https://beam.apache.org/contribute/
>>>> >>>>>>> [3] https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test
>>>> >>>>
>>>> >>>> --
>>>> >>>> Got feedback? go/pabloem-feedback
>>>> >>
>>>> >>
>>>> >
>
>

Re: [PROPOSAL] Merge samza-runner to master

Posted by Xinyu Liu <xi...@gmail.com>.
A little clarification on the contributors: Chris Pettitt and I are the
main contributors so far. Chris wrote the initial prototype but his commits
got squashed into the giant initial commit, and he's been reviewing all
incremental changes afterwards. Two more team members (Boris Shkolnik and
Hai Lu) are starting to work on it. In the next quarter, our focus is
portability, particularly Python. I will keep you guys updated with our
status and plan, and maybe more questions and ideas down the road :).

Thanks,
Xinyu

On Fri, Jun 22, 2018 at 7:23 AM, Rafael Fernandez <rf...@google.com>
wrote:

> I think it's great to go ahead and merge it, so it can continue evolving.
> As with all things, it'll adopt new stuff as it becomes ready (in fact, it
> may even prove to be a great example of how to port an existing "legacy"
> runner to the portability stuff when ready).
>
> It seems the immediate blocker (gradle) was addressed, and there is great
> future work planned. Exciting!
>
> On Thu, Jun 21, 2018 at 8:00 PM Kenneth Knowles <kl...@google.com> wrote:
>
>> *Contributors*
>> Agree with Robert's concern. But this is a nice opportunity for Beam to
>> connect. It is a different sort of backend and a different sort of
>> community that we are linking in.
>>
>> Consider the Gearpump and Apex runners: both had resumes that met the
>> requirements, but might not today. But they haven't been a burden. I have
>> some hope the Samza runner might have a better chance recruiting users and
>> contributors, since the value add for Samza users is unique among Beam
>> runners, and likewise the Samza community is unique among backend
>> communities.
>>
>> *Portability*
>> My take is that we shouldn't _start_ any runner down the legacy path. But
>> this runner predates portability. I don't think the Java SDK is ready to
>> provide feature parity, much less adequate performance, so it doesn't seem
>> reasonable to require using it. Community > code as well.
>>
>> Kenn
>>
>> On Thu, Jun 21, 2018 at 3:34 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> Neat to see a new runner on board!
>>>
>>> I would like to make it a requirement for all new runners to support
>>> the portability API, but given that it's still somewhat of a moving
>>> target, and you have ongoing work in this direction, that may not be a
>>> hard requirement.
>>>
>>> I'm a bit concerned that there are only two contributors (by the
>>> git logs): you and Kenn. But you do indicate there are others
>>> interested in working on this.
>>>
>>> Other than that, this looks great.
>>>
>>> - Robert
>>>
>>>
>>> On Thu, Jun 21, 2018 at 3:14 PM Xinyu Liu <xi...@gmail.com> wrote:
>>> >
>>> > I updated the merge PR with the Gradle integration. (There were some
>>> Jenkins Java test failures due to Google Cloud quota issues, which seem
>>> unrelated to this patch.) Please feel free to ping me if anything else is
>>> needed.
>>> >
>>> > Thanks,
>>> > Xinyu
>>> >
>>> > On Mon, Jun 18, 2018 at 5:44 PM, Xinyu Liu <xi...@gmail.com>
>>> wrote:
>>> >>
>>> >> @Kenn: I am going to add the build.gradle. Is there anything else?
>>> >>
>>> >> @Ahmet, @Robert: here are more details about the samza runner right
>>> now:
>>> >>
>>> >> - Missing pieces: timer support in ParDo is not there yet and I plan
>>> to add it soon. SplittableParDo is missing but we don't have a use case so
>>> far. We are on par with the other runners for the rest of the Java features.
>>> >> - Work in Progress: implement the portable pipeline runner logic.
>>> >> - Future plans: Python support is our next goal. Hopefully we will
>>> get a prototype working sometime next quarter :).
>>> >>
>>> >> Btw, thanks everyone for the comments!
>>> >>
>>> >> Thanks,
>>> >> Xinyu
>>> >>
>>> >> On Mon, Jun 18, 2018 at 4:59 PM, Robert Burke <ro...@frantil.com>
>>> wrote:
>>> >>>
>>> >>> This is exciting! Is it implemented as a portability framework
>>> runner too?
>>> >>>
>>> >>>
>>> >>> On Mon, Jun 18, 2018, 4:36 PM Pablo Estrada <pa...@google.com>
>>> wrote:
>>> >>>>
>>> >>>> It's very exciting to see a new runner making it into master. : )
>>> >>>>
>>> >>>> Best
>>> >>>> -P.
>>> >>>>
>>> >>>> On Mon, Jun 18, 2018 at 3:38 PM Rafael Fernandez <
>>> rfernand@google.com> wrote:
>>> >>>>>
>>> >>>>> I've just read this and wanted to share my excitement :D
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> On Mon, Jun 18, 2018 at 3:10 PM Kenneth Knowles <kl...@google.com>
>>> wrote:
>>> >>>>>>
>>> >>>>>> One thing that will be necessary is porting the build to Gradle.
>>> >>>>>>
>>> >>>>>> Kenn
>>> >>>>>>
>>> >>>>>> On Mon, Jun 18, 2018 at 11:57 AM Xinyu Liu <xi...@gmail.com>
>>> wrote:
>>> >>>>>>>
>>> >>>>>>> Hi, Folks,
>>> >>>>>>>
>>> >>>>>>> On behalf of the Samza team, I would like to propose to merge
>>> the samza-runner branch into master. The branch was created in January when we
>>> first introduced the Samza Runner [1], and we've been adding features and
>>> refining it afterwards. Now the runner satisfies the criteria outlined in
>>> [2], and merging it to master will give more visibility to other
>>> contributors and users.
>>> >>>>>>>
>>> >>>>>>> 1. Have at least 2 contributors interested in maintaining it,
>>> and 1 committer interested in supporting it: *Both Chris and I have been
>>> making contributions and I am going to sign up for the support. There are
>>> more folks in the Samza team interested in contributing to it. Thanks Kenn
>>> for all the help and reviews for the runner!*
>>> >>>>>>> 2. Provide both end-user and developer-facing documentation:
>>> *The PR for the samza-runner doc has runner user guide, capability matrix,
>>> and tutorial using WordCount examples.*
>>> >>>>>>> 3. Have at least a basic level of unit test coverage: *Unit
>>> tests are here [3].*
>>> >>>>>>> 4. Run all existing applicable integration tests with other Beam
>>> components and create additional tests as appropriate: *Enabled
>>> ValidatesRunner tests.*
>>> >>>>>>> 5. Be able to handle a subset of the model that addresses a
>>> significant set of use cases, such as ‘traditional batch’ or ‘processing
>>> time streaming’: *We have test Beam jobs running in Yarn using event-time
>>> processing of Kafka streams.*
>>> >>>>>>> 6. Update the capability matrix with the current status. *Same
>>> as #2.*
>>> >>>>>>> 7. Add a webpage under documentation/runners. *Same as #2.*
>>> >>>>>>>
>>> >>>>>>> The PR for the samza-runner merge: https://github.com/apache/beam/pull/5668
>>> >>>>>>> The PR for the samza-runner doc: https://github.com/apache/beam-site/pull/471
>>> >>>>>>>
>>> >>>>>>> Thanks,
>>> >>>>>>> Xinyu
>>> >>>>>>>
>>> >>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-3079
>>> >>>>>>> [2] https://beam.apache.org/contribute/
>>> >>>>>>> [3] https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test
>>> >>>>
>>> >>>> --
>>> >>>> Got feedback? go/pabloem-feedback
>>> <https://goto.google.com/pabloem-feedback>
>>> >>
>>> >>
>>> >
>>>
>>

Re: [PROPOSAL] Merge samza-runner to master

Posted by Rafael Fernandez <rf...@google.com>.
I think it's great to go ahead and merge it, so it can continue evolving.
As with all things, it'll adopt new stuff as it becomes ready (in fact, it
may even prove to be a great example of how to port an existing "legacy"
runner to the portability stuff when ready).

It seems the immediate blocker (gradle) was addressed, and there is great
future work planned. Exciting!

On Thu, Jun 21, 2018 at 8:00 PM Kenneth Knowles <kl...@google.com> wrote:

> *Contributors*
> Agree with Robert's concern. But this is a nice opportunity for Beam to
> connect. It is a different sort of backend and a different sort of
> community that we are linking in.
>
> Consider the Gearpump and Apex runners: both had resumes that met the
> requirements, but might not today. But they haven't been a burden. I have
> some hope the Samza runner might have a better chance recruiting users and
> contributors, since the value add for Samza users is unique among Beam
> runners, and likewise the Samza community is unique among backend
> communities.
>
> *Portability*
> My take is that we shouldn't _start_ any runner down the legacy path. But
> this runner predates portability. I don't think the Java SDK is ready to
> provide feature parity, much less adequate performance, so it doesn't seem
> reasonable to require using it. Community > code as well.
>
> Kenn
>
> On Thu, Jun 21, 2018 at 3:34 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> Neat to see a new runner on board!
>>
>> I would like to make it a requirement for all new runners to support
>> the portability API, but given that it's still somewhat of a moving
>> target, and you have ongoing work in this direction, that may not be a
>> hard requirement.
>>
>> I'm a bit concerned that there are only two contributors (by the
>> git logs): you and Kenn. But you do indicate there are others
>> interested in working on this.
>>
>> Other than that, this looks great.
>>
>> - Robert
>>
>>
>> On Thu, Jun 21, 2018 at 3:14 PM Xinyu Liu <xi...@gmail.com> wrote:
>> >
>> > I updated the merge PR with the Gradle integration. (There were some
>> Jenkins Java test failures due to Google Cloud quota issues, which seem
>> unrelated to this patch.) Please feel free to ping me if anything else is
>> needed.
>> >
>> > Thanks,
>> > Xinyu
>> >
>> > On Mon, Jun 18, 2018 at 5:44 PM, Xinyu Liu <xi...@gmail.com>
>> wrote:
>> >>
>> >> @Kenn: I am going to add the build.gradle. Is there anything else?
>> >>
>> >> @Ahmet, @Robert: here are more details about the samza runner right
>> now:
>> >>
>> >> - Missing pieces: timer support in ParDo is not there yet and I plan
>> to add it soon. SplittableParDo is missing but we don't have a use case so
>> far. We are on par with the other runners for the rest of the Java features.
>> >> - Work in Progress: implement the portable pipeline runner logic.
>> >> - Future plans: Python support is our next goal. Hopefully we will get
>> a prototype working sometime next quarter :).
>> >>
>> >> Btw, thanks everyone for the comments!
>> >>
>> >> Thanks,
>> >> Xinyu
>> >>
>> >> On Mon, Jun 18, 2018 at 4:59 PM, Robert Burke <ro...@frantil.com>
>> wrote:
>> >>>
>> >>> This is exciting! Is it implemented as a portability framework runner
>> too?
>> >>>
>> >>>
>> >>> On Mon, Jun 18, 2018, 4:36 PM Pablo Estrada <pa...@google.com>
>> wrote:
>> >>>>
>> >>>> It's very exciting to see a new runner making it into master. : )
>> >>>>
>> >>>> Best
>> >>>> -P.
>> >>>>
>> >>>> On Mon, Jun 18, 2018 at 3:38 PM Rafael Fernandez <
>> rfernand@google.com> wrote:
>> >>>>>
>> >>>>> I've just read this and wanted to share my excitement :D
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Mon, Jun 18, 2018 at 3:10 PM Kenneth Knowles <kl...@google.com>
>> wrote:
>> >>>>>>
>> >>>>>> One thing that will be necessary is porting the build to Gradle.
>> >>>>>>
>> >>>>>> Kenn
>> >>>>>>
>> >>>>>> On Mon, Jun 18, 2018 at 11:57 AM Xinyu Liu <xi...@gmail.com>
>> wrote:
>> >>>>>>>
>> >>>>>>> Hi, Folks,
>> >>>>>>>
>> >>>>>>> On behalf of the Samza team, I would like to propose to merge the
>> samza-runner branch into master. The branch was created in January when we
>> first introduced the Samza Runner [1], and we've been adding features and
>> refining it afterwards. Now the runner satisfies the criteria outlined in
>> [2], and merging it to master will give more visibility to other
>> contributors and users.
>> >>>>>>>
>> >>>>>>> 1. Have at least 2 contributors interested in maintaining it, and
>> 1 committer interested in supporting it: *Both Chris and I have been
>> making contributions and I am going to sign up for the support. There are
>> more folks in the Samza team interested in contributing to it. Thanks Kenn
>> for all the help and reviews for the runner!*
>> >>>>>>> 2. Provide both end-user and developer-facing documentation: *The
>> PR for the samza-runner doc has runner user guide, capability matrix, and
>> tutorial using WordCount examples.*
>> >>>>>>> 3. Have at least a basic level of unit test coverage: *Unit tests
>> are here [3].*
>> >>>>>>> 4. Run all existing applicable integration tests with other Beam
>> components and create additional tests as appropriate: *Enabled
>> ValidatesRunner tests.*
>> >>>>>>> 5. Be able to handle a subset of the model that addresses a
>> significant set of use cases, such as ‘traditional batch’ or ‘processing
>> time streaming’: *We have test Beam jobs running in Yarn using event-time
>> processing of Kafka streams.*
>> >>>>>>> 6. Update the capability matrix with the current status. *Same as
>> #2.*
>> >>>>>>> 7. Add a webpage under documentation/runners. *Same as #2.*
>> >>>>>>>
>> >>>>>>> The PR for the samza-runner merge:
>> https://github.com/apache/beam/pull/5668
>> >>>>>>> The PR for the samza-runner doc:
>> https://github.com/apache/beam-site/pull/471
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> Xinyu
>> >>>>>>>
>> >>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-3079
>> >>>>>>> [2] https://beam.apache.org/contribute/
>> >>>>>>> [3]
>> https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test
>> >>>>
>> >>>> --
>> >>>> Got feedback? go/pabloem-feedback
>> <https://goto.google.com/pabloem-feedback>
>> >>
>> >>
>> >
>>
>

Re: [PROPOSAL] Merge samza-runner to master

Posted by Kenneth Knowles <kl...@google.com>.
*Contributors*
Agree with Robert's concern. But this is a nice opportunity for Beam to
connect. It is a different sort of backend and a different sort of
community that we are linking in.

Consider the Gearpump and Apex runners: both had resumes that met the
requirements, but might not today. But they haven't been a burden. I have
some hope the Samza runner might have a better chance recruiting users and
contributors, since the value add for Samza users is unique among Beam
runners, and likewise the Samza community is unique among backend
communities.

*Portability*
My take is that we shouldn't _start_ any runner down the legacy path. But
this runner predates portability. I don't think the Java SDK is ready to
provide feature parity, much less adequate performance, so it doesn't seem
reasonable to require using it. Community > code as well.

Kenn

On Thu, Jun 21, 2018 at 3:34 PM Robert Bradshaw <ro...@google.com> wrote:

> Neat to see a new runner on board!
>
> I would like to make it a requirement for all new runners to support
> the portability API, but given that it's still somewhat of a moving
> target, and you have ongoing work in this direction, that may not be a
> hard requirement.
>
> I'm a bit concerned that there are only two contributors (by the
> git logs): you and Kenn. But you do indicate there are others
> interested in working on this.
>
> Other than that, this looks great.
>
> - Robert

Re: [PROPOSAL] Merge samza-runner to master

Posted by Robert Bradshaw <ro...@google.com>.
Neat to see a new runner on board!

I would like to make it a requirement for all new runners to support
the portability API, but given that it's still somewhat of a moving
target, and you have ongoing work in this direction, that may not be a
hard requirement.

I'm a bit concerned that there are only two contributors (by the
git logs): you and Kenn. But you do indicate there are others
interested in working on this.

Other than that, this looks great.

- Robert


On Thu, Jun 21, 2018 at 3:14 PM Xinyu Liu <xi...@gmail.com> wrote:
>
> I updated the merge PR with the Gradle integration. (There were some Jenkins Java test failures caused by Google Cloud quota issues; they seem unrelated to this patch.) Please feel free to ping me if anything else is needed.
>
> Thanks,
> Xinyu

Re: [PROPOSAL] Merge samza-runner to master

Posted by Xinyu Liu <xi...@gmail.com>.
I updated the merge PR with the Gradle integration. (There were some Jenkins
Java test failures caused by Google Cloud quota issues; they seem unrelated
to this patch.) Please feel free to ping me if anything else is needed.

Thanks,
Xinyu

On Mon, Jun 18, 2018 at 5:44 PM, Xinyu Liu <xi...@gmail.com> wrote:

> @Kenn: I am going to add the build.gradle. Is there anything else?
>
> @Ahmet, @Robert: here are more details about the samza runner right now:
>
> - Missing pieces: timer support in ParDo is not there yet and I plan to
> add it soon. SplittableParDo is missing but we don't have a use case so
> far. We are on par with the other runners for the rest of the Java features.
> - Work in Progress: implement the portable pipeline runner logic.
> - Future plans: supporting Python is our next goal. Hopefully we will get a
> prototype working sometime next quarter :).
>
> Btw, thanks everyone for the comments!
>
> Thanks,
> Xinyu

Re: [PROPOSAL] Merge samza-runner to master

Posted by Xinyu Liu <xi...@gmail.com>.
@Kenn: I am going to add the build.gradle. Is there anything else?

@Ahmet, @Robert: here are more details about the samza runner right now:

- Missing pieces: timer support in ParDo is not there yet and I plan to add
it soon. SplittableParDo is missing but we don't have a use case so far. We
are on par with the other runners for the rest of the Java features.
- Work in Progress: implement the portable pipeline runner logic.
- Future plans: supporting Python is our next goal. Hopefully we will get a
prototype working sometime next quarter :).

Btw, thanks everyone for the comments!

Thanks,
Xinyu

On Mon, Jun 18, 2018 at 4:59 PM, Robert Burke <ro...@frantil.com> wrote:

> This is exciting! Is it implemented as a portability framework runner too?

Re: [PROPOSAL] Merge samza-runner to master

Posted by Robert Burke <ro...@frantil.com>.
This is exciting! Is it implemented as a portability framework runner too?

On Mon, Jun 18, 2018, 4:36 PM Pablo Estrada <pa...@google.com> wrote:

> It's very exciting to see a new runner making it into master. : )
>
> Best
> -P.

Re: [PROPOSAL] Merge samza-runner to master

Posted by Pablo Estrada <pa...@google.com>.
It's very exciting to see a new runner making it into master. : )

Best
-P.

On Mon, Jun 18, 2018 at 3:38 PM Rafael Fernandez <rf...@google.com>
wrote:

> I've just read this and wanted to share my excitement :D

Re: [PROPOSAL] Merge samza-runner to master

Posted by Ahmet Altay <al...@google.com>.
Thank you to everyone who contributed to this runner. It is really great
to see this.

Xinyu, for people like myself who were not following the development
closely, could you talk about missing pieces, work in progress, and future
plans?

On Mon, Jun 18, 2018 at 3:37 PM, Rafael Fernandez <rf...@google.com>
wrote:

> I've just read this and wanted to share my excitement :D

Re: [PROPOSAL] Merge samza-runner to master

Posted by Rafael Fernandez <rf...@google.com>.
I've just read this and wanted to share my excitement :D



On Mon, Jun 18, 2018 at 3:10 PM Kenneth Knowles <kl...@google.com> wrote:

> One thing that will be necessary is porting the build to Gradle.
>
> Kenn

Re: [PROPOSAL] Merge samza-runner to master

Posted by Kenneth Knowles <kl...@google.com>.
One thing that will be necessary is porting the build to Gradle.

Kenn

On Mon, Jun 18, 2018 at 11:57 AM Xinyu Liu <xi...@gmail.com> wrote:

> Hi, Folks,
>
> On behalf of the Samza team, I would like to propose to merge the
> samza-runner branch into master. The branch was created in January when we
> first introduced the Samza Runner [1], and we've been adding features and
> refining it afterwards. Now the runner satisfies the criteria outlined in
> [2], and merging it to master will give more visibility to other
> contributors and users.
>
> 1. Have at least 2 contributors interested in maintaining it, and 1
> committer interested in supporting it: *Both Chris and I have been making
> contributions and I am going to sign up for the support. There are more
> folks in the Samza team interested in contributing to it. Thanks Kenn for
> all the help and reviews for the runner!*
> 2. Provide both end-user and developer-facing documentation: *The PR for
> the samza-runner doc has runner user guide, capability matrix, and tutorial
> using WordCount examples.*
> 3. Have at least a basic level of unit test coverage: *Unit tests are here
> [3].*
> 4. Run all existing applicable integration tests with other Beam components
> and create additional tests as appropriate: *Enabled ValidatesRunner tests.*
> 5. Be able to handle a subset of the model that addresses a significant
> set of use cases, such as ‘traditional batch’ or ‘processing time
> streaming’: *We have test Beam jobs running in Yarn using event-time
> processing of Kafka streams.*
> 6. Update the capability matrix with the current status. *Same as #2.*
> 7. Add a webpage under documentation/runners. *Same as #2.*
>
> The PR for the samza-runner merge:
> https://github.com/apache/beam/pull/5668
> The PR for the samza-runner doc:
> https://github.com/apache/beam-site/pull/471
>
> Thanks,
> Xinyu
>
> [1] https://issues.apache.org/jira/browse/BEAM-3079
> [2] https://beam.apache.org/contribute/
> [3]
> https://github.com/apache/beam/tree/samza-runner/runners/samza/src/test
>