Posted to dev@beam.apache.org by David Cavazos <dc...@google.com> on 2022/01/05 20:47:36 UTC

Re: Beam Java starter project template

I personally like the idea of a separate repo since we can see what a truly
minimal project looks like. Having it in the main repo would mean inheriting
build file configurations and other settings that differ from a clean
project, so it could be non-trivial to adapt. Also, as its own repo, it's
easier to clone and modify, or create an instance of the template.

Dependabot
<https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
can take care of updating the Beam version and other dependencies
automatically. Testing is already set up via GitHub Actions for every pull
request, so each update would be tested automatically as soon as a new
dependency version is available.

Being such minimal examples, I don't expect them to break commonly, but I
think it would be good to make sure tests aren't failing when a release is
published.

I'm okay with having one repo per language, with all the build systems we
want to support for it, as long as we document which files are for which
build system. That way there are fewer repos to maintain.

On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:

> The GitHub repo is definitely more flexible than the archetypes, but the
> archetypes have a few conveniences since they are integrated with the
> apache/beam repo. For example, updates/testing are done at the same time a
> corresponding change to the main repo is done (like library version
> updates), and they are released when the SDK is released.
>
> Should these be part of the main repo, or a single starter repo containing
> all the starters, or one per language, or one per build system?
>
> When should updates to the starter happen?
> How as a community do we get them to happen (e.g. release manager owns it)?
>
>
> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com> wrote:
>
>> We could do the Maven archetype, but that wouldn't work very well for
>> Gradle and SBT users. I think a GitHub template might be the more flexible
>> option, and we could have something similar for other languages as well.
>> Having said that, we could still create a Maven archetype. If someone is
>> familiar with that process, please let me know since I'm not too familiar
>> with Maven and its ecosystem.
>>
>> @Ahmet Altay <al...@google.com> I think right now we only need to pin
>> down the name of the repo, create it, and move the code there. I was
>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>> What do you think?
>>
>> What would be the next steps on creating the repo?
>>
>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>>
>>> This is great, David. Was there any progress on this? Do you need help?
>>>
>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>> wrote:
>>>
>>>> This is cool, thanks!
>>>>
>>>> We do have a template in apache/beam already, built with Maven
>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>> co-locate the archetype with the GitHub template)?
>>>>
>>>> As far as creating an Apache repo, would we put this somewhere like
>>>> apache/beam-java-template? I think apache repositories like beam-* are
>>>> allowed.
>>>>
>>>> Brian
>>>>
>>>> [1] https://maven.apache.org/archetype/index.html
>>>> [2]
>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>
>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> +Ahmet Altay <al...@google.com>
>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>
>>>>> Please feel free to include anyone else!
>>>>>
>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Beam community!
>>>>>>
>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>> people to start with.
>>>>>>
>>>>>> *Link to the GitHub template*:
>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>
>>>>>> So far, here's what the template contains:
>>>>>>
>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>    - Minimal test file
>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>    - Continuous integration via GitHub Actions
>>>>>>    <https://github.com/features/actions> (around 1-2 minutes to run)
>>>>>>    - README with instructions on how to build, run, test, and add
>>>>>>    other runners
>>>>>>
>>>>>> It's easy to create a new GitHub repo from a template
>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>.
>>>>>>
>>>>>> *Next steps*
>>>>>>
>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>    - Right now it lives in my personal GitHub account, so we need to
>>>>>>    create an Apache repo to host it
>>>>>>    - Update/create docs with instructions on how to create a new
>>>>>>    Beam Java pipeline
>>>>>>
>>>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
+1 to starters for Kotlin and Scala too. Scala at least has sbt as the
idiomatic build tool (as of like 10 years ago...), so a fully idiomatic repo
would look different and be helpful.

On Tue, Jan 25, 2022 at 10:54 AM David Cavazos <dc...@google.com> wrote:

> We could also have Kotlin and Scala starter projects. I was already
> experimenting with those. It's low-hanging fruit since they use the same
> Java SDK, and many people are already using and liking Kotlin (from
> Android) and Scala (from Spark).
>
> On Tue, Jan 25, 2022 at 10:52 AM David Cavazos <dc...@google.com>
> wrote:
>
>> I can change the license to ASL2, no problem. I might have copied and
>> pasted it from somewhere else.
>>
>> As for Python and Go, I think one advantage would be having testing
>> infrastructure set up, including GitHub Actions configuration. Other than
>> that, I agree that the actual Beam setup would be pretty minimal. But I
>> think the main advantage is that the quickstarts for all languages would be
>> more consistent: git clone and then run.
>>
>> On Tue, Jan 18, 2022 at 5:25 PM Robert Burke <re...@google.com> wrote:
>>
>>> Can confirm that Go would be very minimal as well. But I agree there's
>>> value in users not needing to start entirely from scratch. Some users will
>>> find it easier to mutate and expand an example into what they want than to
>>> write it from a blank page, even if the boilerplate is negligible.
>>>
>>> Most Go tooling is pretty good at doing package lookups and similar, but
>>> only after a module has been loaded into your cache.
>>>
>>> On Tue, Jan 18, 2022 at 6:20 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> I want to clarify one thing: I am not certain the requirement of ASL2
>>>> applies to example code snippets. I am also not sure if it makes a material
>>>> difference to users. I _am_ sure we would need to deal with some process to
>>>> use something other than ASL2, so I'd rather not.
>>>>
>>>> Kenn
>>>>
>>>> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>>
>>>>> Agree with Luke here. "Just git clone and go" is a big part of it.
>>>>>
>>>>> But also, the question "I simply don't know what one would put in a
>>>>> Python repo, other than a bare setup.py that lists a dependency on
>>>>> apache_beam" is answered by David's initial email and his repo, namely:
>>>>>
>>>>>  - GitHub Actions configuration
>>>>>  - README.md
>>>>>  - example that already runs
>>>>>  - LICENSE (notably you've got it as MIT but to be part of Apache
>>>>> software it needs to be ASL2)
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> I think for consistency it makes sense for users to be told to check
>>>>>> out the git repo for the language of their choice and run it. Some repos
>>>>>> will have more or less than others when it comes to the setup necessary.
>>>>>>
>>>>>> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 for doing this for Java, as setting up a project there is quite
>>>>>>> complicated. I simply don't know what one would put in a Python repo,
>>>>>>> other than a bare setup.py that lists a dependency on
>>>>>>> apache_beam. We don't have recommendations on file layout, etc. beyond
>>>>>>> that (though there's plenty of generic advice to be found out
>>>>>>> there on the topic). I have a hunch Go is similar, and JavaScript
>>>>>>> would be as well (npm install apache-beam and your package.json file
>>>>>>> gets updated).
>>>>>>>
>>>>>>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com> wrote:
>>>>>>> >
>>>>>>> > There are several examples already within the Beam repo found in:
>>>>>>> > https://github.com/apache/beam/tree/master/examples
>>>>>>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>> > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>> >
>>>>>>> >
>>>>>>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>> sachinag@google.com> wrote:
>>>>>>> >>
>>>>>>> >> I'd love to do something other than Wordcount just for
>>>>>>> novelty/freshness, but I agree with the suggestion that having an example in
>>>>>>> each quickstart would be ideal.
>>>>>>> >>
>>>>>>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>> >>>
>>>>>>> >>> +1 to a separate repo for each language.
>>>>>>> >>>
>>>>>>> >>> Would it make sense to include the Wordcount example in each
>>>>>>> repo? I know that makes the repos less minimal, but we could rewrite the
>>>>>>> quickstarts around these repos instead of the current Wordcount examples.
>>>>>>> Or maybe we don't need to use the Wordcount example in the quickstarts...
>>>>>>> >>>
>>>>>>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>> dcavazos@google.com> wrote:
>>>>>>> >>>>
>>>>>>> >>>> I agree with dropping the archetypes. Less maintenance is
>>>>>>> preferable, and the GitHub repos are more flexible and maintainable.
>>>>>>> >>>>
>>>>>>> >>>> How about we create:
>>>>>>> >>>>
>>>>>>> >>>> apache/beam-starter-java
>>>>>>> >>>> apache/beam-starter-python
>>>>>>> >>>> apache/beam-starter-go
>>>>>>> >>>>
>>>>>>> >>>> During our OKR planning, +Keith Malvetti would prefer having
>>>>>>> repos for all languages. It makes sense for consistency as well.
>>>>>>> >>>>
>>>>>>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> As long as we have tags so that people can pull out a specific
>>>>>>> version of the examples that coincides with a specific SDK version, then we
>>>>>>> could drop the archetypes.
>>>>>>> >>>>>
>>>>>>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>> bhulette@google.com> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> > Being such minimal examples, I don't expect them to break
>>>>>>> commonly, but I think it would be good to make sure tests aren't failing
>>>>>>> when a release is published.
>>>>>>> >>>>>>
>>>>>>> >>>>>> Yeah it would be very unfortunate if we discovered a breakage
>>>>>>> after the release. Agree we should verify RCs (document as part of the
>>>>>>> release process), or even better, add automation to verify the repo against
>>>>>>> snapshots. The automation could be nice to have anyway since it provides an
>>>>>>> example for users to follow if they want to test against snapshots and
>>>>>>> report issues to us sooner.
>>>>>>> >>>>>>
>>>>>>> >>>>>>
>>>>>>> >>>>>> If we move forward with this can we drop the archetype?
>>>>>>> >>>>>>
>>>>>>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> Sounds reasonable.

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
We could also have Kotlin and Scala starter projects. I was already
experimenting with those. It's low-hanging fruit since they use the same
Java SDK, and many people are already using and liking Kotlin (from
Android) and Scala (from Spark).


Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
I can change the license to ASL2, no problem. I might have copied and
pasted it from somewhere else.

As for Python and Go, I think one advantage would be having testing
infrastructure set up, including GitHub Actions configuration. Other than
that, I agree that the actual Beam setup would be pretty minimal. But I
think the main advantage is that the quickstarts for all languages would be
more consistent: git clone and then run.

On Tue, Jan 18, 2022 at 5:25 PM Robert Burke <re...@google.com> wrote:

> Can confirm that Go would be very minimal as well. But I agree there's
> value in users not needing to start entirely from scratch. Some users will
> find it easier to mutate and expand to what you want vs writing it from a
> blank page, even if the boiler plate is negligible.
>
> Most Go tooling is pretty good at doing package lookups and similar, but
> only after the a module has been loaded into your cache.
>
> On Tue, Jan 18, 2022 at 6:20 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> I want to clarify one thing: I am not certain the requirement of ASL2
>> applies to example code snippets. I am also not sure if it makes a material
>> difference to users. I _am_ sure we would need to deal with some process to
>> use something other than ASL2, so I'd rather not.
>>
>> Kenn
>>
>> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> Agree with Luke here. "Just git clone and go" is a big part of it.
>>>
>>> But also the answer to "I simply don't know what one would put in a
>>> Python repo, other than a bare setup.py that lists a dependency on
>>> apache_beam" is given by David's initial email and his repo, namely:
>>>
>>>  - GitHub Actions configuration
>>>  - README.md
>>>  - example that already runs
>>>  - LICENSE (notably you've got it as MIT but to be part of Apache
>>> software it needs to be ASL2)
>>>
>>> Kenn
>>>
>>> On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> I think for consistency it makes sense for users to be told to check out
>>>> the git repo for the language of their choice and run it. Some repos will
>>>> need more or less setup than others.
>>>>
>>>> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> +1 for doing this for Java, as setting up a project there is quite
>>>>> complicated. I simply don't know what one would put in a Python repo,
>>>>> other than a bare setup.py that lists a dependency on apache_beam. We
>>>>> don't have recommendations on file layout, etc. beyond that (though
>>>>> there's plenty of generic advice to be found out there on the topic).
>>>>> I have a hunch Go is similar, and JavaScript would be as well (npm
>>>>> install apache-beam and your package.json file gets updated).
>>>>>
>>>>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com> wrote:
>>>>> >
>>>>> > There are several examples already within the Beam repo found in:
>>>>> > https://github.com/apache/beam/tree/master/examples
>>>>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>> > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>> >
>>>>> >
>>>>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <sa...@google.com>
>>>>> wrote:
>>>>> >>
>>>>> >> I'd love to do something other than Wordcount just for
>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>> each quickstart would be ideal.
>>>>> >>
>>>>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>> dhuntsperger@google.com> wrote:
>>>>> >>>
>>>>> >>> + 1 to a separate repo for each language.
>>>>> >>>
>>>>> >>> Would it make sense to include the Wordcount example in each repo?
>>>>> I know that makes the repos less minimal, but we could rewrite the
>>>>> quickstarts around these repos instead of the current Wordcount examples.
>>>>> Or maybe we don't need to use the Wordcount example in the quickstarts...
>>>>> >>>
>>>>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>> >>>>
>>>>> >>>> I agree with dropping the archetypes. Less maintenance is
>>>>> preferable, and the github repos are more flexible and maintainable.
>>>>> >>>>
>>>>> >>>> How about we create:
>>>>> >>>>
>>>>> >>>> apache/beam-starter-java
>>>>> >>>> apache/beam-starter-python
>>>>> >>>> apache/beam-starter-go
>>>>> >>>>
>>>>> >>>> During our OKR planning, +Keith Malvetti would prefer having
>>>>> repos for all languages. It makes sense for consistency as well.
>>>>> >>>>
>>>>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com>
>>>>> wrote:
>>>>> >>>>>
>>>>> >>>>> As long as we have tags so that people can pull out a specific
>>>>> version of the examples that coincides with a specific SDK version then we
>>>>> could drop the archetypes.
>>>>> >>>>>
>>>>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>> bhulette@google.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> > Being such minimal examples, I don't expect them to break
>>>>> commonly, but I think it would be good to make sure tests aren't failing
>>>>> when a release is published.
>>>>> >>>>>>
>>>>> >>>>>> Yeah it would be very unfortunate if we discovered a breakage
>>>>> after the release. Agree we should verify RCs (document as part of the
>>>>> release process), or even better, add automation to verify the repo against
>>>>> snapshots. The automation could be nice to have anyway since it provides an
>>>>> example for users to follow if they want to test against snapshots and
>>>>> report issues to us sooner.
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>> If we move forward with this can we drop the archetype?
>>>>> >>>>>>
>>>>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com>
>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> Sounds reasonable.
>>>>> >>>>>>>
>>>>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> I personally like the idea of a separate repo since we can
>>>>> >>>>>>>> see what a truly minimal project looks like. Having it in the
>>>>> >>>>>>>> main repo would inherit build file configurations and other
>>>>> >>>>>>>> settings that would be different from a clean project, so it
>>>>> >>>>>>>> could be non-trivial to adapt. Also, as its own repo, it's
>>>>> >>>>>>>> easier to clone and modify, or create an instance of the
>>>>> >>>>>>>> template.
>>>>> >>>>>>>>
>>>>> >>>>>>>> Dependabot can take care of updating the Beam version and
>>>>> other dependencies automatically. Testing is already set up via GitHub
>>>>> actions for every pull request, so it would automatically be tested as soon
>>>>> as there is a new dependency version available.
>>>>> >>>>>>>>
>>>>> >>>>>>>> Being such minimal examples, I don't expect them to break
>>>>> commonly, but I think it would be good to make sure tests aren't failing
>>>>> when a release is published.
>>>>> >>>>>>>>
>>>>> >>>>>>>> I'm okay with having one repo per language, and having all
>>>>> >>>>>>>> the build systems we want to support for them, as long as we
>>>>> >>>>>>>> document which files are for which build system. That way
>>>>> >>>>>>>> there are fewer repos to maintain.
>>>>> >>>>>>>>
>>>>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com>
>>>>> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> The GitHub repo is definitely more flexible than the
>>>>> >>>>>>>>> archetypes, but the archetypes have a few conveniences since
>>>>> >>>>>>>>> they are integrated with the apache/beam repo. For example,
>>>>> >>>>>>>>> updates/testing are done at the same time a corresponding
>>>>> >>>>>>>>> change to the main repo is done (like library version
>>>>> >>>>>>>>> updates), and they are released when the SDK is released.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Should these be part of the main repo, or a single starter
>>>>> repo containing all the starters or one per language or one per build
>>>>> system?
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> When should updates to the starter happen?
>>>>> >>>>>>>>> How as a community do we get them to happen (e.g. release
>>>>> manager owns it)?
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't work
>>>>> very well for Gradle and SBT users. I think a GitHub template might be the
>>>>> more flexible option, and we could have something similar for other
>>>>> languages as well. Having said that, we could still create a Maven
>>>>> archetype. If someone is familiar with that process, please let me know
>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin down the
>>>>> name of the repo, create it, and move the code there. I was thinking either
>>>>> `apache/beam-java-template` or `apache/beam-java-starter`. What do you
>>>>> think?
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> What would be the next steps on creating the repo?
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>> altay@google.com> wrote:
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> This is great David. Was there any progress on this? Do
>>>>> you need help?
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>> bhulette@google.com> wrote:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> This is cool, thanks!
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> We do have a template in apache/beam already, built with
>>>>> Maven Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>> co-locate the archetype with the GitHub template)?
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> As far as creating an Apache repo, would we put this
>>>>> somewhere like apache/beam-java-template? I think apache repositories like
>>>>> beam-* are allowed.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Brian
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>> >>>>>>>>>>>> [2]
>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> +Ahmet Altay
>>>>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Hi Beam community!
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java project,
>>>>> I've been working on a GitHub template containing a minimal Beam Java
>>>>> pipeline for people to start with.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>> https://github.com/davidcavazos/beam-java
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>> >>>>>>>>>>>>>> Minimal test file
>>>>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct runner)
>>>>> >>>>>>>>>>>>>> Continuous integration via GitHub actions (around 1-2
>>>>> minutes to run)
>>>>> >>>>>>>>>>>>>> README with instructions on how to build, run, test,
>>>>> and add other runners
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a template.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Next steps
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy with it 🙂
>>>>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub account, so we
>>>>> need to create an Apache repo to host it
>>>>> >>>>>>>>>>>>>> Update/create docs with instructions on how to create a
>>>>> new Beam Java pipeline
>>>>>
>>>>

Re: Beam Java starter project template

Posted by Robert Burke <re...@google.com>.
Can confirm that Go would be very minimal as well. But I agree there's
value in users not needing to start entirely from scratch. Some users will
find it easier to modify and expand an existing project than to write one
from a blank page, even if the boilerplate is negligible.

Most Go tooling is pretty good at doing package lookups and similar, but
only after a module has been loaded into your cache.

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
I want to clarify one thing: I am not certain the requirement of ASL2
applies to example code snippets. I am also not sure if it makes a material
difference to users. I _am_ sure we would need to deal with some process to
use something other than ASL2, so I'd rather not.

Kenn

Re: Beam Java starter project template

Posted by Ahmet Altay <al...@google.com>.
Nice. Thank you David.

On Mon, Jun 27, 2022 at 8:55 AM David Cavazos <dc...@google.com> wrote:

> I think they were enabled but it was a matter of configuring it to run on
> the pull_request event instead of the push event. It looks like tests are
> running correctly now (haven't tested Go on an additional PR yet, but I
> think it should work).
>
> On Fri, Jun 24, 2022 at 5:03 PM Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Thu, Jun 9, 2022 at 2:14 PM Ahmet Altay <al...@google.com> wrote:
>>
>>>
>>>
>>> On Thu, Jun 9, 2022 at 1:10 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Sorry, I was OOO.
>>>>
>>>> @Ahmet Altay <al...@google.com> Yes, GitHub Actions have been set up for
>>>> the Java project, but not for other projects like Python.
>>>>
>>>
>>> Nice. Let's enable it for other projects as well?
>>>
>>
>> @David Cavazos <dc...@google.com> - Were you able to enable tests for
>> other projects?
>>
>>
>>>
>>>
>>>>
>>>> @Damon Douglas <da...@google.com> Our plan for these starter
>>>> repos is providing the minimal viable product for a pre-configured Apache
>>>> Beam project. The plan is having one repo for every supported language. I
>>>> think having Terraform integration would be great, so feel free to open a
>>>> PR.
>>>>
>>>> FYI to everyone on the thread, the PR for the Python starter project is
>>>> out for review. It doesn't look like the tests are running. I think we need
>>>> to enable GitHub Actions on this repo (and all others as well). Please help
>>>> us review/approve this so we can update the quickstarts.
>>>> https://github.com/apache/beam-starter-python/pull/1
>>>>
>>>> -David
>>>>
>>>> On Thu, Jun 9, 2022 at 12:59 PM Damon Douglas <da...@google.com>
>>>> wrote:
>>>>
>>>>> Hello Ahmet,
>>>>>
>>>>> Thank you so much for checking in.  I never got an answer to my
>>>>> question.  However, a colleague of mine and I are putting together a quick
>>>>> and easy standalone repo demonstration using terraform with Apache Beam
>>>>> that we hope will benefit those in the community that target using Dataflow
>>>>> on Google Cloud as the execution engine.  I'll report if/when it gets
>>>>> approved to open source.
>>>>>
>>>>
>>> Nice and thank you.
>>>
>>> Could this go to the starter repo now that David clarified? Or are you
>>> planning to share it in some other form?
>>>
>>>
>>>>
>>>>> Best,
>>>>>
>>>>> Damon
>>>>>
>>>>> On Wed, Jun 8, 2022 at 4:37 PM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> just checking:
>>>>>>
>>>>>> @David Cavazos <dc...@google.com> - were you able to enable GH
>>>>>> actions on the new repos?
>>>>>> @Damon Douglas <da...@google.com> - Did you get an answer to
>>>>>> your question?
>>>>>>
>>>>>> Thank you!
>>>>>> Ahmet
>>>>>>
>>>>>>
>>>>>> On Thu, May 12, 2022 at 11:53 AM Damon Douglas <
>>>>>> damondouglas@google.com> wrote:
>>>>>>
>>>>>>> Good day, @David Cavazos <dc...@google.com> I was recently able
>>>>>>> to solve using terraform to create a Cloud Build trigger for provisioning
>>>>>>> Dataflow custom templates.  I wanted to check in first before initiating a
>>>>>>> pull request on https://github.com/davidcavazos/beam-java.
>>>>>>>
>>>>>>> I was considering the PR to add a directory called
>>>>>>> infrastructure/google with all the terraform someone would need to
>>>>>>> provision a service account, custom network, IAM permissions, etc as well
>>>>>>> as the Cloud Build integration.  Would this be helpful?  The reason for
>>>>>>> infrastructure/google instead of just infrastructure is that I wanted to
>>>>>>> leave room for others to potentially add their own cloud variants i.e.
>>>>>>> infrastructure/aws.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Damon
>>>>>>>
>>>>>>> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>>
>>>>>>>> @David Cavazos <dc...@google.com> - Were you able to resolve
>>>>>>>> this? And what exactly does a person need to do to enable GH actions?
>>>>>>>>
>>>>>>>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni
>>>>>>>>> <re...@google.com>, @Robert Bradshaw <ro...@google.com> can
>>>>>>>>> any of you help us enable GitHub actions on all the starter repositories?
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Good news! The Java starter repo has been merged! 🎉
>>>>>>>>>>
>>>>>>>>>> However, Ahmet noticed that the tests are not running
>>>>>>>>>> automatically. I tested them in my personal repo and they work, but I think
>>>>>>>>>> GitHub actions have to be enabled in the new starter repos. I don't have
>>>>>>>>>> permission to do so, can someone help us enable GitHub actions on all
>>>>>>>>>> starter repos?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for taking a look. We actually considered supporting more
>>>>>>>>>>> runners in them, but the complexity and maintenance burden of setting up
>>>>>>>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>>>>>>>> high. We didn't want to *only* support the Dataflow runner
>>>>>>>>>>> either, so we simply linked to the runners documentation from the README.
>>>>>>>>>>> It could be nice to support that at some point, but I think a better
>>>>>>>>>>> solution is to improve the documentation on the runners page.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>>>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>>>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>>>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>>>>>>>> Go). Thanks for putting it together David!
>>>>>>>>>>>>
>>>>>>>>>>>> My only substantial feedback is that it was tricky to move from
>>>>>>>>>>>> the Direct runner to a different runner (in my case I was targeting
>>>>>>>>>>>> Dataflow) - it might be helpful to have instructions on doing that linked
>>>>>>>>>>>> from the Readme since I imagine starting on Direct then moving to a
>>>>>>>>>>>> different runner is a pretty common path; I don't think that should block
>>>>>>>>>>>> getting this initial version in though, just a future improvement
>>>>>>>>>>>> suggestion :)
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Danny
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please review the PR since the Python and Go starter projects
>>>>>>>>>>>>> are blocked until this one merges (so we get all the legal files right).
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OK. Bringing an important update on licensing to this thread
>>>>>>>>>>>>>>> for consideration. Discussion on
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has
>>>>>>>>>>>>>>> concluded with key takeaways. These are things that were already true and
>>>>>>>>>>>>>>> people who are good at this stuff already may know, but I'm just going to
>>>>>>>>>>>>>>> say them again as I understand them:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we"
>>>>>>>>>>>>>>> give "users" the permissions of both licenses - they can take their pick
>>>>>>>>>>>>>>> so they can treat it as MIT-0 licensed.
>>>>>>>>>>>>>>>  - BUT the copyright holders are the contributors to the
>>>>>>>>>>>>>>> project. They must agree that their contributions can be licensed like
>>>>>>>>>>>>>>> this. The ASF ICLA only agrees to ASL2 so we need to let them know. I
>>>>>>>>>>>>>>> suggest a CONTRIBUTING.md that mentions it and maybe a
>>>>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md with a checkbox*.
>>>>>>>>>>>>>>>  - If we want, we can include a README that explains this
>>>>>>>>>>>>>>> and tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>>>>>>>> sure.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My (likely unsurprising) take is that this is worth it
>>>>>>>>>>>>>> (though I also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Exactly how formal we need to get here is a matter of some
>>>>>>>>>>>>>>> debate and risk tolerance. For these repos I think there is very little
>>>>>>>>>>>>>>> risk. One could even argue the contents are so unoriginal as to be
>>>>>>>>>>>>>>> uncopyrightable, but the bar in the US for i.p. is comically low so that's
>>>>>>>>>>>>>>> not a good argument to depend on.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can we create an empty file on each directory so I can
>>>>>>>>>>>>>>>>> fork the repo? It doesn't look like there is a workaround to cloning empty
>>>>>>>>>>>>>>>>> repos in GitHub. Then I can send a pull request.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice
>>>>>>>>>>>>>>>>>>> and there's some step by step at
>>>>>>>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>>>>>>>     The Apache Software Foundation (
>>>>>>>>>>>>>>>>>>> http://www.apache.org/).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE
>>>>>>>>>>>>>>>>>>>>> file should look like? I'm not familiar with it and would like to get it
>>>>>>>>>>>>>>>>>>>>> right.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and
>>>>>>>>>>>>>>>>>>>>>> go", but I'd like to keep them as minimal as possible. We could have
>>>>>>>>>>>>>>>>>>>>>> another repo like `beam-working-examples` for more complete examples where
>>>>>>>>>>>>>>>>>>>>>> each subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I like the goal: for things where the build has
>>>>>>>>>>>>>>>>>>>>>>> extra setup, have an example that is fully functional on its own. There is
>>>>>>>>>>>>>>>>>>>>>>> of course the problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some
>>>>>>>>>>>>>>>>>>>>>>> of these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> What do folks think about also having a less
>>>>>>>>>>>>>>>>>>>>>>>> minimal set of starters? For Java I am thinking about protobuf / autovalue.
>>>>>>>>>>>>>>>>>>>>>>>> For Python maybe an opinionated setup with tox etc... Again this would just
>>>>>>>>>>>>>>>>>>>>>>>> contain 'hello' world samples to get folks going.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>>>> I think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF
>>>>>>>>>>>>>>>>>>>>>>>>>> project, so it is "least surprise"
>>>>>>>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files
>>>>>>>>>>>>>>>>>>>>>>>>>> to carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>>>>>>>> apply to the user's code and I would guess they simply won't bother with
>>>>>>>>>>>>>>>>>>>>>>>>>> it for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file,
>>>>>>>>>>>>>>>>>>>>>>>>>> you have to include the attributions from it. The NOTICE file is required
>>>>>>>>>>>>>>>>>>>>>>>>>> by ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead
>>>>>>>>>>>>>>>>>>>>>>>>>> with ASL2 and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setup (and/or with the Go template). I will say though, the Actions config
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> actions issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Donny-Clark <ke...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> actions, and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> motivation was to keep it simple. But of course we should keep it simple
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for users, not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> license and requesting the repos be created. Not sure if it needs my level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't have any problems changing it to Apache license. In any case, how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about we create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> licensing requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We want these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> good argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it's best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> question for apache legal?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which is the most pressing one and the one that is ready, but the rest
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> minimal starter projects for every language. Once we have Java, Python and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Go, it might be a good idea to change the quickstarts to use these instead
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of the word count. There is already a dedicated word count walkthrough so I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can help us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> know what one would put in a Python repo other than a bare setup.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that lists a dependency on apache_beam" is answered by David's initial
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> email and his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> MIT but to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tricky because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> being a derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sense to users to be told to check out this git repo for the language of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> your choice and run. Some repos will have more/less than others when it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> comes to setup necessary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Robert Bradshaw <ro...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setting up a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> what one would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommendations on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> go is similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache-beam and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sachin Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> than Wordcount just for novelty/freshness but agreed with the suggestion
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Huntsperger <dh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the Wordcount example in each repo? I know that makes the repos less
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> minimal, but we could rewrite the quickstarts around these repos instead of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetypes. Less maintenance is preferable, and the github repos are more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> flexible and maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that people can pull out a specific version of the examples that coincides
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> with a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM Brian Hulette <bh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> examples, I don't expect them to break commonly, but I think it would be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> good to make sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unfortunate if we discovered a breakage after the release. Agree we should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> verify RCs (document as part of the release process), or even better, add
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automation to verify the repo against snapshots. The automation could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> nice to have anyway since it provides an example for users to follow if
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they want to test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can we drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of a separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> examples, I don't expect them to break commonly, but I think it would be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> good to make sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> repo per language, and having all the build systems we want to support for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> them. As long as we document which files are for which build system. That
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> way there are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> definitely more flexible than the archetypes, but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with the apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time as a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo (like library version updates), and they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the main repo, or a single starter repo containing all the starters or one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> per language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> get them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4:06 PM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> right now we only need to pin down the name of the repo, create it, and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> move the code there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> steps on creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:09 AM Ahmet Altay <al...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in apache/beam already, built with Maven Archetype [1]. It's what powers
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Apache repo, would we put this somewhere like apache/beam-java-template? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> include anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at 11:31 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> create a new Beam Java project, I've been working on a GitHub template
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> containing a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Beam pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> integration via GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> make sure everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> my personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> with instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
I think GitHub Actions was already enabled on the repos; it was a matter of
configuring the workflows to run on the pull_request event instead of the push
event. It looks like tests are running correctly now (I haven't tested Go on an
additional PR yet, but I think it should work).

On Fri, Jun 24, 2022 at 5:03 PM Ahmet Altay <al...@google.com> wrote:

>
>
> On Thu, Jun 9, 2022 at 2:14 PM Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Thu, Jun 9, 2022 at 1:10 PM David Cavazos <dc...@google.com> wrote:
>>
>>> Sorry, I was OOO.
>>>
>>> @Ahmet Altay <al...@google.com> Yes, GitHub Actions has been set up for
>>> the Java project, but not for other projects like Python.
>>>
>>
>> Nice. Let's enable it for other projects as well?
>>
>
> @David Cavazos <dc...@google.com> - Were you able to enable tests for
> other projects?
>
>
>>
>>
>>>
>>> @Damon Douglas <da...@google.com> Our plan for these starter
>>> repos is providing the minimal viable product for a pre-configured Apache
>>> Beam project. The plan is having one repo for every supported language. I
>>> think having Terraform integration would be great, so feel free to open a
>>> PR.
>>>
>>> FYI to everyone on the thread, the PR for the Python starter project is
>>> out for review. It doesn't look like the tests are running. I think we need
>>> to enable GitHub Actions on this repo (and all others as well). Please help
>>> us review/approve this so we can update the quickstarts.
>>> https://github.com/apache/beam-starter-python/pull/1
>>>
>>> -David
>>>
>>> On Thu, Jun 9, 2022 at 12:59 PM Damon Douglas <da...@google.com>
>>> wrote:
>>>
>>>> Hello Ahmet,
>>>>
>>>> Thank you so much for checking in.  I never got an answer to my
>>>> question.  However, a colleague of mine and I are putting together a quick
>>>> and easy standalone repo demonstration using terraform with Apache Beam
>>>> that we hope will benefit those in the community that target using Dataflow
>>>> on Google Cloud as the execution engine.  I'll report if/when it gets
>>>> approved to open source.
>>>>
>>>
>> Nice and thank you.
>>
>> Could this go to the starter repo now that David clarified? Or are you
>> planning to share it in some other form?
>>
>>
>>>
>>>> Best,
>>>>
>>>> Damon
>>>>
>>>> On Wed, Jun 8, 2022 at 4:37 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> just checking:
>>>>>
>>>>> @David Cavazos <dc...@google.com> - were you able to enable GH
>>>>> actions on the new repos?
>>>>> @Damon Douglas <da...@google.com> - Did you get an answer to
>>>>> your question?
>>>>>
>>>>> Thank you!
>>>>> Ahmet
>>>>>
>>>>>
>>>>> On Thu, May 12, 2022 at 11:53 AM Damon Douglas <
>>>>> damondouglas@google.com> wrote:
>>>>>
>>>>>> Good day, @David Cavazos <dc...@google.com>. I recently worked out how
>>>>>> to use Terraform to create a Cloud Build trigger for provisioning
>>>>>> Dataflow custom templates.  I wanted to check in first before opening a
>>>>>> pull request on https://github.com/davidcavazos/beam-java.
>>>>>>
>>>>>> I was considering the PR to add a directory called
>>>>>> infrastructure/google with all the terraform someone would need to
>>>>>> provision a service account, custom network, IAM permissions, etc as well
>>>>>> as the Cloud Build integration.  Would this be helpful?  The reason for
>>>>>> infrastructure/google instead of just infrastructure is that I wanted to
>>>>>> leave room for others to potentially add their own cloud variants i.e.
>>>>>> infrastructure/aws.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Damon
>>>>>>
>>>>>> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> @David Cavazos <dc...@google.com> - Were you able to resolve
>>>>>>> this? And what exactly does a person need to do to enable GH actions?
>>>>>>>
>>>>>>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni
>>>>>>>> <re...@google.com>, @Robert Bradshaw <ro...@google.com> can
>>>>>>>> any of you help us enable GitHub actions on all the starter repositories?
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Good news! The Java starter repo has been merged! 🎉
>>>>>>>>>
>>>>>>>>> However, Ahmet noticed that the tests are not running
>>>>>>>>> automatically. I tested them in my personal repo and they work, but I think
>>>>>>>>> GitHub actions have to be enabled in the new starter repos. I don't have
>>>>>>>>> permission to do so, can someone help us enable GitHub actions on all
>>>>>>>>> starter repos?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for taking a look. We actually considered supporting more
>>>>>>>>>> runners in them, but the complexity and maintenance burden of setting up
>>>>>>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>>>>>>> high. We didn't want to *only* support the Dataflow runner
>>>>>>>>>> either, so we simply linked to the runners documentation from the README.
>>>>>>>>>> It could be nice to support that at some point, but I think a better
>>>>>>>>>> solution is to improve the documentation on the runners page.
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>>>>>>> Go). Thanks for putting it together David!
>>>>>>>>>>>
>>>>>>>>>>> My only substantial feedback is that it was tricky to move from
>>>>>>>>>>> the Direct runner to a different runner (in my case I was targeting
>>>>>>>>>>> Dataflow) - it might be helpful to have instructions on doing that linked
>>>>>>>>>>> from the Readme since I imagine starting on Direct then moving to a
>>>>>>>>>>> different runner is a pretty common path; I don't think that should block
>>>>>>>>>>> getting this initial version in though, just a future improvement
>>>>>>>>>>> suggestion :)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Danny
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>>>>>>
>>>>>>>>>>>> Please review the PR since the Python and Go starter projects
>>>>>>>>>>>> are blocked until this one merges (so we get all the legal files right).
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> OK. Bringing an important update on licensing to this thread
>>>>>>>>>>>>>> for consideration. Discussion on
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has
>>>>>>>>>>>>>> concluded with key takeaways. These are things that were already true and
>>>>>>>>>>>>>> people who are good at this stuff already may know, but I'm just going to
>>>>>>>>>>>>>> say them again as I understand them:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>>>>>>>  - BUT the copyright holders are the contributors to the
>>>>>>>>>>>>>> project. They must agree that their contributions can be licensed like
>>>>>>>>>>>>>> this. The ASF ICLA only agrees to ASL2 so we need to let them know. I
>>>>>>>>>>>>>> suggest a CONTRIBUTING.md that mentions it and maybe a
>>>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md with a checkbox*.
>>>>>>>>>>>>>>  - If we want, we can include a README that explains this and
>>>>>>>>>>>>>> tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>>>>>>> sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> My (likely unsurprising) take is that this is worth it (though
>>>>>>>>>>>>> I also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Exactly how formal we need to get here is a matter of some
>>>>>>>>>>>>>> debate and risk tolerance. For these repos I think there is very little
>>>>>>>>>>>>>> risk. One could even argue the contents are so unoriginal as to be
>>>>>>>>>>>>>> uncopyrightable, but the bar in the US for i.p. is comically low so that's
>>>>>>>>>>>>>> not a good argument to depend on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can we create an empty file on each directory so I can fork
>>>>>>>>>>>>>>>> the repo? It doesn't look like there is a workaround to cloning empty repos
>>>>>>>>>>>>>>>> in GitHub. Then I can send a pull request.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>>>>>>     The Apache Software Foundation (
>>>>>>>>>>>>>>>>>> http://www.apache.org/).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE
>>>>>>>>>>>>>>>>>>>> file should look like? I'm not familiar with it and would like to get it
>>>>>>>>>>>>>>>>>>>> right.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and
>>>>>>>>>>>>>>>>>>>>> go", but I'd like to keep them as minimal as possible. We could have
>>>>>>>>>>>>>>>>>>>>> another repo like `beam-working-examples` for more complete examples where
>>>>>>>>>>>>>>>>>>>>> each subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>>>>>>>> course the problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some
>>>>>>>>>>>>>>>>>>>>>> of these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal
>>>>>>>>>>>>>>>>>>>>>>> set of starters? For Java I am thinking about protobuf / autovalue. For
>>>>>>>>>>>>>>>>>>>>>>> Python maybe an opinionated setup with tox etc... Again this would just
>>>>>>>>>>>>>>>>>>>>>>> contain 'hello world' samples to get folks going.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I
>>>>>>>>>>>>>>>>>>>>>>>>> think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project,
>>>>>>>>>>>>>>>>>>>>>>>>> so it is "least surprise"
>>>>>>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to
>>>>>>>>>>>>>>>>>>>>>>>>> carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>>>>>>> apply to the user's code and I would guess they simply won't bother with
>>>>>>>>>>>>>>>>>>>>>>>>> it for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file,
>>>>>>>>>>>>>>>>>>>>>>>>> you have to include the attributions from it. The NOTICE file is required
>>>>>>>>>>>>>>>>>>>>>>>>> by ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with
>>>>>>>>>>>>>>>>>>>>>>>>> ASL2 and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions
>>>>>>>>>>>>>>>>>>>>>>>>>>>> setup (and/or with the Go template). I will say though, the Actions config
>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>>>>>>> seems right. For each language configuration we're targeting, we basically
>>>>>>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any
>>>>>>>>>>>>>>>>>>>>>>>>>>>> actions issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Donny-Clark <ke...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> actions, and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> motivation was to keep it simple. But of course we should keep it simple
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for users, not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> license and requesting the repos be created. Not sure if it needs my level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't have any problems changing it to Apache license. In any case, how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about we create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> licensing requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> want these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it's best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> question for apache legal?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which is the most pressing one and the one that is ready, but the rest
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> minimal starter projects for every language. Once we have Java, Python and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Go, it might be a good idea to change the quickstarts to use these instead
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of the word count. There is already a dedicated word count walkthrough so I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> help us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> know what one would put in a Python repo, other than a bare setup.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that lists a dependency on apache_beam" is answered by David's initial
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> email and his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> MIT but to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tricky because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> being a derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to users to be told to check out this git repo for the language of your
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> choice and run. Some repos will have more/less than others when it comes to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setup necessary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setting up a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> one would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommendations on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> go is similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache-beam and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sachin Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> than Wordcount just for novelty/freshness but agreed with the suggestion
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Huntsperger <dh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the Wordcount example in each repo? I know that makes the repos less
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> minimal, but we could rewrite the quickstarts around these repos instead of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetypes. Less maintenance is preferable, and the github repos are more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> flexible and maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unfortunate if we discovered a breakage after the release. Agree we should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> verify RCs (document as part of the release process), or even better, add
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automation to verify the repo against snapshots. The automation could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> nice to have anyway since it provides an example for users to follow if
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they want to test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can we drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of a separate repo since we can see what a true minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> repo per language, and having all the build systems we want to support for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> them. As long as we document which files are for which build system. That
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> way there are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> AM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> definitely more flexible than the archetypes but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> get them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4:06 PM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now we only need to pin down the name of the repo, create it, and move the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> code there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> steps on creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:09 AM Ahmet Altay <al...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Apache repo, would we put this somewhere like apache/beam-java-template? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> include anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> create a new Beam Java project, I've been working on a GitHub template
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> containing a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Beam pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> via GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> my personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> with instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Ahmet Altay <al...@google.com>.
On Thu, Jun 9, 2022 at 2:14 PM Ahmet Altay <al...@google.com> wrote:

>
>
> On Thu, Jun 9, 2022 at 1:10 PM David Cavazos <dc...@google.com> wrote:
>
>> Sorry, I was OOO.
>>
>> @Ahmet Altay <al...@google.com> Yes, GitHub actions have been set up for
>> the Java project, but not for other projects like Python.
>>
>
> Nice. Let's enable it for other projects as well?
>

@David Cavazos <dc...@google.com> - Were you able to enable tests for
other projects?


>
>
>>
>> @Damon Douglas <da...@google.com> Our plan for these starter
>> repos is providing the minimal viable product for a pre-configured Apache
>> Beam project. The plan is having one repo for every supported language. I
>> think having Terraform integration would be great, so feel free to open a
>> PR.
>>
>> FYI to everyone on the thread, the PR for the Python starter project is
>> out for review. It doesn't look like the tests are running. I think we need
>> to enable GitHub Actions on this repo (and all others as well). Please help
>> us review/approve this so we can update the quickstarts.
>> https://github.com/apache/beam-starter-python/pull/1
>>
>> -David
>>
>> On Thu, Jun 9, 2022 at 12:59 PM Damon Douglas <da...@google.com>
>> wrote:
>>
>>> Hello Ahmet,
>>>
>>> Thank you so much for checking in.  I never got an answer to my
>>> question.  However, a colleague of mine and I are putting together a quick
>>> and easy standalone repo demonstration using terraform with Apache Beam
>>> that we hope will benefit those in the community that target using Dataflow
>>> on Google Cloud as the execution engine.  I'll report if/when it gets
>>> approved to open source.
>>>
>>
> Nice and thank you.
>
> Could this go to the starter repo now that David clarified? Or are you
> planning to share it in some other form?
>
>
>>
>>> Best,
>>>
>>> Damon
>>>
>>> On Wed, Jun 8, 2022 at 4:37 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Hello all,
>>>>
>>>> just checking:
>>>>
>>>> @David Cavazos <dc...@google.com> - were you able to enable GH
>>>> actions on the new repos?
>>>> @Damon Douglas <da...@google.com> - Did you get an answer to
>>>> your question?
>>>>
>>>> Thank you!
>>>> Ahmet
>>>>
>>>>
>>>> On Thu, May 12, 2022 at 11:53 AM Damon Douglas <da...@google.com>
>>>> wrote:
>>>>
>>>>> Good day, @David Cavazos <dc...@google.com>. I recently worked out how
>>>>> to use terraform to create a Cloud Build trigger for provisioning
>>>>> Dataflow custom templates.  I wanted to check in first before opening a
>>>>> pull request on https://github.com/davidcavazos/beam-java.
>>>>>
>>>>> I was considering the PR to add a directory called
>>>>> infrastructure/google with all the terraform someone would need to
>>>>> provision a service account, custom network, IAM permissions, etc as well
>>>>> as the Cloud Build integration.  Would this be helpful?  The reason for
>>>>> infrastructure/google instead of just infrastructure is that I wanted to
>>>>> leave room for others to potentially add their own cloud variants, e.g.
>>>>> infrastructure/aws.
>>>>>
>>>>> Best,
>>>>>
>>>>> Damon
>>>>>
>>>>> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> @David Cavazos <dc...@google.com> - Were you able to resolve
>>>>>> this? And what exactly does a person need to do to enable GH actions?
>>>>>>
>>>>>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni
>>>>>>> <re...@google.com>, @Robert Bradshaw <ro...@google.com> can
>>>>>>> any of you help us enable GitHub actions on all the starter repositories?
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Good news! The Java starter repo has been merged! 🎉
>>>>>>>>
>>>>>>>> However, Ahmet noticed that the tests are not running
>>>>>>>> automatically. I tested them in my personal repo and they work, but I think
>>>>>>>> GitHub actions have to be enabled in the new starter repos. I don't have
>>>>>>>> permission to do so, can someone help us enable GitHub actions on all
>>>>>>>> starter repos?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for taking a look. We actually considered supporting more
>>>>>>>>> runners in them, but the complexity and maintenance burden of setting up
>>>>>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>>>>>> high. We didn't want to *only* support the Dataflow runner
>>>>>>>>> either, so we simply linked to the runners documentation from the README.
>>>>>>>>> It could be nice to support that at some point, but I think a better
>>>>>>>>> solution is to improve the documentation on the runners page.
>>>>>>>>>
>>>>>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>>>>>> Go). Thanks for putting it together David!
>>>>>>>>>>
>>>>>>>>>> My only substantial feedback is that it was tricky to move from
>>>>>>>>>> the Direct runner to a different runner (in my case I was targeting
>>>>>>>>>> Dataflow) - it might be helpful to have instructions on doing that linked
>>>>>>>>>> from the Readme since I imagine starting on Direct then moving to a
>>>>>>>>>> different runner is a pretty common path; I don't think that should block
>>>>>>>>>> getting this initial version in though, just a future improvement
>>>>>>>>>> suggestion :)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Danny
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>>>>>
>>>>>>>>>>> Please review the PR since the Python and Go starter projects
>>>>>>>>>>> are blocked until this one merges (so we get all the legal files right).
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> OK. Bringing an important update on licensing to this thread
>>>>>>>>>>>>> for consideration. Discussion on
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded
>>>>>>>>>>>>> with key takeaways. These are things that were already true and people who
>>>>>>>>>>>>> are good at this stuff already may know, but I'm just going to say them
>>>>>>>>>>>>> again as I understand them:
>>>>>>>>>>>>>
>>>>>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>>>>>>  - BUT the copyright holders are the contributors to the
>>>>>>>>>>>>> project. They must agree that their contributions can be licensed like
>>>>>>>>>>>>> this. The ASF ICLA only agrees to ASL2 so we need to let them know. I
>>>>>>>>>>>>> suggest a CONTRIBUTING.md that mentions it and maybe a
>>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md with a checkbox*.
>>>>>>>>>>>>>  - If we want, we can include a README that explains this and
>>>>>>>>>>>>> tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>>>>>> sure.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> My (likely unsurprising) take is that this is worth it (though
>>>>>>>>>>>> I also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> *Exactly how formal we need to get here is a matter of some
>>>>>>>>>>>>> debate and risk tolerance. For these repos I think there is very little
>>>>>>>>>>>>> risk. One could even argue the contents are so unoriginal as to be
>>>>>>>>>>>>> uncopyrightable, but the bar in the US for i.p. is comically low so that's
>>>>>>>>>>>>> not a good argument to depend on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can we create an empty file on each directory so I can fork
>>>>>>>>>>>>>>> the repo? It doesn't look like there is a workaround to cloning empty repos
>>>>>>>>>>>>>>> in GitHub. Then I can send a pull request.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE
>>>>>>>>>>>>>>>>>>> file should look like? I'm not familiar with it and would like to get it
>>>>>>>>>>>>>>>>>>> right.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and
>>>>>>>>>>>>>>>>>>>> go", but I'd like to keep them as minimal as possible. We could have
>>>>>>>>>>>>>>>>>>>> another repo like `beam-working-examples` for more complete examples where
>>>>>>>>>>>>>>>>>>>> each subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>>>>>>> course the problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of
>>>>>>>>>>>>>>>>>>>>> these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal
>>>>>>>>>>>>>>>>>>>>>> set of starters? For Java I am thinking about protobuf / autovalue. For
>>>>>>>>>>>>>>>>>>>>>> Python maybe an opinionated setup with tox etc... Again this would just
>>>>>>>>>>>>>>>>>>>>>> contain 'hello world' samples to get folks going.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I
>>>>>>>>>>>>>>>>>>>>>>>> think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project,
>>>>>>>>>>>>>>>>>>>>>>>> so it is "least surprise"
>>>>>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to
>>>>>>>>>>>>>>>>>>>>>>>> carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>>>>>> apply to the user's code and I would guess they simply won't bother with
>>>>>>>>>>>>>>>>>>>>>>>> it for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file,
>>>>>>>>>>>>>>>>>>>>>>>> you have to include the attributions from it. The NOTICE file is required
>>>>>>>>>>>>>>>>>>>>>>>> by ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with
>>>>>>>>>>>>>>>>>>>>>>>> ASL2 and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions
>>>>>>>>>>>>>>>>>>>>>>>>>>> setup (and/or with the Go template). I will say though, the Actions config
>>>>>>>>>>>>>>>>>>>>>>>>>>> should be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any
>>>>>>>>>>>>>>>>>>>>>>>>>>> actions issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>> Donny-Clark <ke...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>> actions, and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> motivation was to keep it simple. But of course we should keep it simple
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for users, not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> license and requesting the repos be created. Not sure if it needs my level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't have any problems changing it to Apache license. In any case, how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about we create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> want these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it's best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for apache legal?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which is the most pressing one and the one that is ready, but the rest
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, TL;DR: these are minimal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> help us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> know what one would put in a Python repo than, other than a bare setup.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that lists a dependency on apache_beam" is answered by David's initial
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> email and his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to users to be told to checkout this git repo for the language of your
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> choice and run. Some repos will have more/less than others when it comes to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setup necessary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> up a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> one would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommendations on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache-beam and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sachin Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Huntsperger <dh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetypes. Less maintenance is preferable, and the github repos are more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> flexible and maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unfortunate if we discovered a breakage after the release. Agree we should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> verify RCs (document as part of the release process), or even better, add
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automation to verify the repo against snapshots. The automation could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> nice to have anyway since it provides an example for users to follow if
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they want to test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can we drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> per language, and having all the build systems we want to support for them.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> As long as we document which files are for which build system. That way
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> AM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> definitely more flexible than the archetypes but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), and they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now we only need to pin down the name of the repo, create it, and move the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> code there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> steps on creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:09 AM Ahmet Altay <al...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Apache repo, would we put this somewhere like apache/beam-java-template? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> include anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> create a new Beam Java project, I've been working on a GitHub template
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> containing a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Beam pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> via GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> my personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Ahmet Altay <al...@google.com>.
On Thu, Jun 9, 2022 at 1:10 PM David Cavazos <dc...@google.com> wrote:

> Sorry, I was OOO.
>
> @Ahmet Altay <al...@google.com> Yes, GitHub Actions has been set up for the
> Java project, but not for other projects like Python.
>

Nice. Let's enable it for other projects as well?


>
> @Damon Douglas <da...@google.com> Our plan for these starter repos
> is to provide a minimal, pre-configured Apache Beam project. The plan is
> to have one repo for every supported language. I think
> having Terraform integration would be great, so feel free to open a PR.
>
> FYI to everyone on the thread, the PR for the Python starter project is
> out for review. It doesn't look like the tests are running. I think we need
> to enable GitHub Actions on this repo (and all others as well). Please help
> us review/approve this so we can update the quickstarts.
> https://github.com/apache/beam-starter-python/pull/1
>
> -David
>
> On Thu, Jun 9, 2022 at 12:59 PM Damon Douglas <da...@google.com>
> wrote:
>
>> Hello Ahmet,
>>
>> Thank you so much for checking in.  I never got an answer to my
>> question.  However, a colleague of mine and I are putting together a quick
>> and easy standalone repo demonstration using terraform with Apache Beam
>> that we hope will benefit those in the community that target using Dataflow
>> on Google Cloud as the execution engine.  I'll report if/when it gets
>> approved to open source.
>>
>
Nice and thank you.

Could this go to the starter repo now that David clarified? Or are you
planning to share it in some other form?


>
>> Best,
>>
>> Damon
>>
>> On Wed, Jun 8, 2022 at 4:37 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> Hello all,
>>>
>>> just checking:
>>>
>>> @David Cavazos <dc...@google.com> - were you able to enable GH
>>> actions on the new repos?
>>> @Damon Douglas <da...@google.com> - Did you get an answer to
>>> your question?
>>>
>>> Thank you!
>>> Ahmet
>>>
>>>
>>> On Thu, May 12, 2022 at 11:53 AM Damon Douglas <da...@google.com>
>>> wrote:
>>>
>>>> Good day, @David Cavazos <dc...@google.com> I was recently able to
>>>> solve using terraform to create a Cloud Build trigger for provisioning
>>>> Dataflow custom templates.  I wanted to check in first before initiating a
>>>> pull request on https://github.com/davidcavazos/beam-java.
>>>>
>>>> I was considering the PR to add a directory called
>>>> infrastructure/google with all the terraform someone would need to
>>>> provision a service account, custom network, IAM permissions, etc as well
>>>> as the Cloud Build integration.  Would this be helpful?  The reason for
>>>> infrastructure/google instead of just infrastructure is that I wanted to
>>>> leave room for others to potentially add their own cloud variants i.e.
>>>> infrastructure/aws.
>>>>
>>>> Best,
>>>>
>>>> Damon
>>>>
>>>> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> @David Cavazos <dc...@google.com> - Were you able to resolve this?
>>>>> And what exactly does a person need to do to enable GH actions?
>>>>>
>>>>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni
>>>>>> <re...@google.com>, @Robert Bradshaw <ro...@google.com> can
>>>>>> any of you help us enable GitHub actions on all the starter repositories?
>>>>>> Thanks!
>>>>>>
>>>>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Good news! The Java starter repo has been merged! 🎉
>>>>>>>
>>>>>>> However, Ahmet noticed that the tests are not running automatically.
>>>>>>> I tested them in my personal repo and they work, but I think GitHub actions
>>>>>>> have to be enabled in the new starter repos. I don't have permission to do
>>>>>>> so, can someone help us enable GitHub actions on all starter repos?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for taking a look. We actually considered supporting more
>>>>>>>> runners in them, but the complexity and maintenance burden on setting up
>>>>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>>>>> high. We didn't want to *only* support the Dataflow runner either,
>>>>>>>> so we simply linked to the runners documentation from the README. It could
>>>>>>>> be nice to support that at some point, but I think a better solution is to
>>>>>>>> improve the documentation on the runners page.
>>>>>>>>
>>>>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>
>>>>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>>>>> Go). Thanks for putting it together David!
>>>>>>>>>
>>>>>>>>> My only substantial feedback is that it was tricky to move from
>>>>>>>>> the Direct runner to a different runner (in my case I was targeting
>>>>>>>>> Dataflow) - it might be helpful to have instructions on doing that linked
>>>>>>>>> from the Readme since I imagine starting on Direct then moving to a
>>>>>>>>> different runner is a pretty common path; I don't think that should block
>>>>>>>>> getting this initial version in though, just a future improvement
>>>>>>>>> suggestion :)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danny
>>>>>>>>>
>>>>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>>>>
>>>>>>>>>> Please review the PR since the Python and Go starter projects are
>>>>>>>>>> blocked until this one merges (so we get all the legal files right).
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> OK. Bringing an important update on licensing to this thread
>>>>>>>>>>>> for consideration. Discussion on
>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded
>>>>>>>>>>>> with key takeaways. These are things that were already true and people who
>>>>>>>>>>>> are good at this stuff already may know, but I'm just going to say them
>>>>>>>>>>>> again as I understand them:
>>>>>>>>>>>>
>>>>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>>>>>  - BUT the copyright holders are the contributors to the
>>>>>>>>>>>> project. They must agree that their contributions can be licensed like
>>>>>>>>>>>> this. The ASF ICLA only agrees to ASL2 so we need to let them know. I
>>>>>>>>>>>> suggest a CONTRIBUTING.md that mentions it and maybe a
>>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md with a checkbox*.
>>>>>>>>>>>>  - If we want, we can include a README that explains this and
>>>>>>>>>>>> tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>>>>
>>>>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>>>>> sure.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> My (likely unsurprising) take is that this is worth it (though I
>>>>>>>>>>> also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> *Exactly how formal we need to get here is a matter of some
>>>>>>>>>>>> debate and risk tolerance. For these repos I think there is very little
>>>>>>>>>>>> risk. One could even argue the contents are so unoriginal as to be
>>>>>>>>>>>> uncopyrightable, but the bar in the US for i.p. is comically low so that's
>>>>>>>>>>>> not a good argument to depend on.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can we create an empty file on each directory so I can fork
>>>>>>>>>>>>>> the repo? It doesn't look like there is a workaround to cloning empty repos
>>>>>>>>>>>>>> in GitHub. Then I can send a pull request.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and
>>>>>>>>>>>>>>>>>>> go", but I'd like to keep them as minimal as possible. We could have
>>>>>>>>>>>>>>>>>>> another repo like `beam-working-examples` for more complete examples where
>>>>>>>>>>>>>>>>>>> each subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>>>>>> course the problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of
>>>>>>>>>>>>>>>>>>>> these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal
>>>>>>>>>>>>>>>>>>>>> set of starters? For Java I am thinking about protobuf / autovalue. For
>>>>>>>>>>>>>>>>>>>>> Python maybe an opinionated setup with tox etc... Again this would just
>>>>>>>>>>>>>>>>>>>>> contain 'hello' world samples to get folks going.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I
>>>>>>>>>>>>>>>>>>>>>>> think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project,
>>>>>>>>>>>>>>>>>>>>>>> so it is "least surprise"
>>>>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to
>>>>>>>>>>>>>>>>>>>>>>> carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>>>>> apply to the user's code and I would guess they simply won't bother with it
>>>>>>>>>>>>>>>>>>>>>>> for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you
>>>>>>>>>>>>>>>>>>>>>>> have to include the attributions from it. The NOTICE file is required by
>>>>>>>>>>>>>>>>>>>>>>> ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with
>>>>>>>>>>>>>>>>>>>>>>> ASL2 and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions
>>>>>>>>>>>>>>>>>>>>>>>>>> setup (and/or with the Go template). I will say though, the Actions config
>>>>>>>>>>>>>>>>>>>>>>>>>> should be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any
>>>>>>>>>>>>>>>>>>>>>>>>>> actions issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark
>>>>>>>>>>>>>>>>>>>>>>>>>> <ke...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>> actions, and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation
>>>>>>>>>>>>>>>>>>>>>>>>>>>> was to keep it simple. But of course we should keep it simple for users,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>> license and requesting the repos be created. Not sure if it needs my level
>>>>>>>>>>>>>>>>>>>>>>>>>>>> of privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't have any problems changing it to Apache license. In any case, how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about we create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> want these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it's best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for apache legal?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> help us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> know what one would put in a Python repo, other than a bare setup.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that lists a dependency on apache_beam" is answered by David's initial
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> email and his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to users to be told to checkout this git repo for the language of your
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> choice and run. Some repos will have more/less than others when it comes to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setup necessary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> up a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> one would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommendations on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache-beam and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sachin Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Huntsperger <dh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetypes. Less maintenance is preferable, and the github repos are more
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> flexible and maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unfortunate if we discovered a breakage after the release. Agree we should
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> verify RCs (document as part of the release process), or even better, add
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automation to verify the repo against snapshots. The automation could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> nice to have anyway since it provides an example for users to follow if
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they want to test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> we drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> per language, and having all the build systems we want to support for them.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> As long as we document which files are for which build system. That way
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> AM Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> more flexible than the archetypes but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), and they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now we only need to pin down the name of the repo, create it, and move the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> code there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> there any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Apache repo, would we put this somewhere like apache/beam-java-template? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> include anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> create a new Beam Java project, I've been working on a GitHub template
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> containing a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Beam pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> via GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> new GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Sorry, I was OOO.

@Ahmet Altay <al...@google.com> Yes, GitHub Actions has been set up for the
Java project, but not for other projects like Python.

@Damon Douglas <da...@google.com> Our plan for these starter repos
is providing the minimal viable product for a pre-configured Apache Beam
project. The plan is having one repo for every supported language. I think
having Terraform integration would be great, so feel free to open a PR.

FYI to everyone on the thread, the PR for the Python starter project is out
for review. It doesn't look like the tests are running. I think we need to
enable GitHub Actions on this repo (and all others as well). Please help us
review/approve this so we can update the quickstarts.
https://github.com/apache/beam-starter-python/pull/1

-David

On Thu, Jun 9, 2022 at 12:59 PM Damon Douglas <da...@google.com>
wrote:

> Hello Ahmet,
>
> Thank you so much for checking in.  I never got an answer to my question.
> However, a colleague of mine and I are putting together a quick and easy
> standalone repo demonstration using terraform with Apache Beam that we hope
> will benefit those in the community that target using Dataflow on Google
> Cloud as the execution engine.  I'll report if/when it gets approved to
> open source.
>
> Best,
>
> Damon
>
> On Wed, Jun 8, 2022 at 4:37 PM Ahmet Altay <al...@google.com> wrote:
>
>> Hello all,
>>
>> just checking:
>>
>> @David Cavazos <dc...@google.com> - were you able to enable GH
>> actions on the new repos?
>> @Damon Douglas <da...@google.com> - Did you get an answer to your
>> question?
>>
>> Thank you!
>> Ahmet
>>
>>
>> On Thu, May 12, 2022 at 11:53 AM Damon Douglas <da...@google.com>
>> wrote:
>>
>>> Good day, @David Cavazos <dc...@google.com>. I recently worked out how to
>>> use Terraform to create a Cloud Build trigger for provisioning
>>> Dataflow custom templates.  I wanted to check in first before initiating a
>>> pull request on https://github.com/davidcavazos/beam-java.
>>>
>>> I was considering the PR to add a directory called infrastructure/google
>>> with all the terraform someone would need to provision a service account,
>>> custom network, IAM permissions, etc as well as the Cloud Build
>>> integration.  Would this be helpful?  The reason for infrastructure/google
>>> instead of just infrastructure is that I wanted to leave room for others to
>>> potentially add their own cloud variants i.e. infrastructure/aws.
>>>
>>> Best,
>>>
>>> Damon
>>>
>>> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> @David Cavazos <dc...@google.com> - Were you able to resolve this?
>>>> And what exactly does a person need to do to enable GH actions?
>>>>
>>>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni <re...@google.com>
>>>>> , @Robert Bradshaw <ro...@google.com> can any of you help us
>>>>> enable GitHub actions on all the starter repositories? Thanks!
>>>>>
>>>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Good news! The Java starter repo has been merged! 🎉
>>>>>>
>>>>>> However, Ahmet noticed that the tests are not running automatically.
>>>>>> I tested them in my personal repo and they work, but I think GitHub actions
>>>>>> have to be enabled in the new starter repos. I don't have permission to do
>>>>>> so, can someone help us enable GitHub actions on all starter repos?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for taking a look. We actually considered supporting more
>>>>>>> runners in them, but the complexity and maintenance burden of setting up
>>>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>>>> high. We didn't want to *only* support the Dataflow runner either,
>>>>>>> so we simply linked to the runners documentation from the README. It could
>>>>>>> be nice to support that at some point, but I think a better solution is to
>>>>>>> improve the documentation on the runners page.
>>>>>>>
>>>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>
>>>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>>>> Go). Thanks for putting it together David!
>>>>>>>>
>>>>>>>> My only substantial feedback is that it was tricky to move from the
>>>>>>>> Direct runner to a different runner (in my case I was targeting Dataflow) -
>>>>>>>> it might be helpful to have instructions on doing that linked from the
>>>>>>>> Readme since I imagine starting on Direct then moving to a different runner
>>>>>>>> is a pretty common path; I don't think that should block getting this
>>>>>>>> initial version in though, just a future improvement suggestion :)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Danny
>>>>>>>>
>>>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>>>
>>>>>>>>> Please review the PR since the Python and Go starter projects are
>>>>>>>>> blocked until this one merges (so we get all the legal files right).
>>>>>>>>>
>>>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <
>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> OK. Bringing an important update on licensing to this thread for
>>>>>>>>>>> consideration. Discussion on
>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded
>>>>>>>>>>> with key takeaways. These are things that were already true and people who
>>>>>>>>>>> are good at this stuff already may know, but I'm just going to say them
>>>>>>>>>>> again as I understand them:
>>>>>>>>>>>
>>>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>>>>  - BUT the copyright holders are the contributors to the
>>>>>>>>>>> project. They must agree that their contributions can be licensed like
>>>>>>>>>>> this. The ASF ICLA only agrees to ASL2 so we need to let them know. I
>>>>>>>>>>> suggest a CONTRIBUTING.md that mentions it and maybe a
>>>>>>>>>>> PULL_REQUEST_TEMPLATE.md with a checkbox*.
>>>>>>>>>>>  - If we want, we can include a README that explains this and
>>>>>>>>>>> tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>>>
>>>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>>>> sure.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> My (likely unsurprising) take is that this is worth it (though I
>>>>>>>>>> also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> *Exactly how formal we need to get here is a matter of some
>>>>>>>>>>> debate and risk tolerance. For these repos I think there is very little
>>>>>>>>>>> risk. One could even argue the contents are so unoriginal as to be
>>>>>>>>>>> uncopyrightable, but the bar in the US for i.p. is comically low so that's
>>>>>>>>>>> not a good argument to depend on.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Can we create an empty file on each directory so I can fork
>>>>>>>>>>>>> the repo? It doesn't look like there is a workaround to cloning empty repos
>>>>>>>>>>>>> in GitHub. Then I can send a pull request.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and
>>>>>>>>>>>>>>>>>> go", but I'd like to keep them as minimal as possible. We could have
>>>>>>>>>>>>>>>>>> another repo like `beam-working-examples` for more complete examples where
>>>>>>>>>>>>>>>>>> each subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>>>>> course the problem of "where does it end?" since there are infinitely many such things.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of
>>>>>>>>>>>>>>>>>>> these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal
>>>>>>>>>>>>>>>>>>>> set of starters? For Java I am thinking about protobuf / autovalue. For
>>>>>>>>>>>>>>>>>>>> Python maybe an opinionated setup with tox etc... Again this would just
>>>>>>>>>>>>>>>>>>>> contain 'hello' world samples to get folks going.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I
>>>>>>>>>>>>>>>>>>>>>> think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so
>>>>>>>>>>>>>>>>>>>>>> it is "least surprise"
>>>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to
>>>>>>>>>>>>>>>>>>>>>> carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>>>> apply to the user's code and I would guess they simply won't bother with it
>>>>>>>>>>>>>>>>>>>>>> for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you
>>>>>>>>>>>>>>>>>>>>>> have to include the attributions from it. The NOTICE file is required by
>>>>>>>>>>>>>>>>>>>>>> ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with
>>>>>>>>>>>>>>>>>>>>>> ASL2 and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions
>>>>>>>>>>>>>>>>>>>>>>>>> setup (and/or with the Go template). I will say though, the Actions config
>>>>>>>>>>>>>>>>>>>>>>>>> should be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>>>
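As a rough illustration of the three steps listed above (checkout, setup-<language>, inlined test script), a job for the Java starter might look like the sketch below. This is not the contents of the linked test.yaml. The workflow name, the action versions, the Temurin / Java 11 choice, and the `./gradlew test` command (which assumes a Gradle wrapper is checked into the repo) are all assumptions made for illustration.

```yaml
# Hypothetical sketch of .github/workflows/test.yaml for a Java starter repo.
# All names, versions, and commands below are illustrative assumptions.
name: Test

on:
  push:
    branches: [main]
  pull_request:

jobs:
  gradle:
    runs-on: ubuntu-latest
    steps:
      # 1. checkout
      - uses: actions/checkout@v4
      # 2. setup-<language>
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "11"
      # 3. inlined script to run tests (assumes ./gradlew exists in the repo)
      - run: ./gradlew test
```

Supporting another build system would presumably mean adding a sibling job with the same three steps and a different final command. A workflow file like this only runs once GitHub Actions is enabled for the repository.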
>>>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any
>>>>>>>>>>>>>>>>>>>>>>>>> actions issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>> actions, and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation
>>>>>>>>>>>>>>>>>>>>>>>>>>> was to keep it simple. But of course we should keep it simple for users,
>>>>>>>>>>>>>>>>>>>>>>>>>>> not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>> license and requesting the repos be created. Not sure if it needs my level
>>>>>>>>>>>>>>>>>>>>>>>>>>> of privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>> have any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We
>>>>>>>>>>>>>>>>>>>>>>>>>>>> want these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's
>>>>>>>>>>>>>>>>>>>>>>>>>>>> best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question
>>>>>>>>>>>>>>>>>>>>>>>>>>>> for apache legal?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which
>>>>>>>>>>>>>>>>>>>>>>>>>>>> is the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal
>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can
>>>>>>>>>>>>>>>>>>>>>>>>>>>> help us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>> go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>> know what one would put in a Python repo other than a bare setup.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>> that lists a dependency on apache_beam" is answered by David's initial
>>>>>>>>>>>>>>>>>>>>>>>>>>>> email and his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>>> but to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> users to be told to check out this git repo for the language of your choice
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting
>>>>>>>>>>>>>>>>>>>>>>>>>>>> up a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what
>>>>>>>>>>>>>>>>>>>>>>>>>>>> one would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have
>>>>>>>>>>>>>>>>>>>>>>>>>>>> recommendations on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go
>>>>>>>>>>>>>>>>>>>>>>>>>>>> is similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install
>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache-beam and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sachin Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Huntsperger <dh...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetypes. Less maintenance is preferable, and the github repos are more
>>>>>>>>>>>>>>>>>>>>>>>>>>>> flexible and maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate
>>>>>>>>>>>>>>>>>>>>>>>>>>>> if we discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can
>>>>>>>>>>>>>>>>>>>>>>>>>>>> we drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo
>>>>>>>>>>>>>>>>>>>>>>>>>>>> per language, and having all the build systems we want to support for them,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> as long as we document which files are for which build system. That way
>>>>>>>>>>>>>>>>>>>>>>>>>>>> there are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The GitHub repo is definitely
>>>>>>>>>>>>>>>>>>>>>>>>>>>> more flexible than the archetypes, but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with the apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), and they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get
>>>>>>>>>>>>>>>>>>>>>>>>>>>> them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06
>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right
>>>>>>>>>>>>>>>>>>>>>>>>>>>> now we only need to pin down the name of the repo, create it, and move the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> code there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>> on creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09
>>>>>>>>>>>>>>>>>>>>>>>>>>>> AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was
>>>>>>>>>>>>>>>>>>>>>>>>>>>> there any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54
>>>>>>>>>>>>>>>>>>>>>>>>>>>> PM Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Apache repo, would we put this somewhere like apache/beam-java-template? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>> think apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> include anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> create a new Beam Java project, I've been working on a GitHub template
>>>>>>>>>>>>>>>>>>>>>>>>>>>> containing a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World"
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Beam pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration
>>>>>>>>>>>>>>>>>>>>>>>>>>>> via GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions
>>>>>>>>>>>>>>>>>>>>>>>>>>>> on how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>> sure everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Damon Douglas <da...@google.com>.
Hello Ahmet,

Thank you so much for checking in.  I never got an answer to my question.
However, a colleague of mine and I are putting together a quick and easy
standalone repo demonstrating Terraform with Apache Beam, which we hope
will benefit those in the community who target Dataflow on Google Cloud
as the execution engine.  I'll report back if/when it gets approved for
open-sourcing.

Best,

Damon

On Wed, Jun 8, 2022 at 4:37 PM Ahmet Altay <al...@google.com> wrote:

> Hello all,
>
> just checking:
>
> @David Cavazos <dc...@google.com> - were you able to enable GH actions
> on the new repos?
> @Damon Douglas <da...@google.com> - Did you get an answer to your
> question?
>
> Thank you!
> Ahmet
>
>
> On Thu, May 12, 2022 at 11:53 AM Damon Douglas <da...@google.com>
> wrote:
>
>> Good day, @David Cavazos <dc...@google.com>. I recently worked out how
>> to use Terraform to create a Cloud Build trigger for provisioning
>> Dataflow custom templates.  I wanted to check in first before opening a
>> pull request on https://github.com/davidcavazos/beam-java.
>>
>> I was considering having the PR add a directory called infrastructure/google
>> with all the Terraform configuration someone would need to provision a
>> service account, custom network, IAM permissions, etc., as well as the
>> Cloud Build integration.  Would this be helpful?  The reason for
>> infrastructure/google instead of just infrastructure is that I wanted to
>> leave room for others to add their own cloud variants, e.g.
>> infrastructure/aws.
>>
>> Best,
>>
>> Damon
>>
>> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> @David Cavazos <dc...@google.com> - Were you able to resolve this?
>>> And what exactly does a person need to do to enable GH actions?
>>>
>>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni <re...@google.com>
>>>> , @Robert Bradshaw <ro...@google.com> can any of you help us enable
>>>> GitHub actions on all the starter repositories? Thanks!
>>>>
>>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> Good news! The Java starter repo has been merged! 🎉
>>>>>
>>>>> However, Ahmet noticed that the tests are not running automatically. I
>>>>> tested them in my personal repo and they work, but I think GitHub actions
>>>>> have to be enabled in the new starter repos. I don't have permission to do
>>>>> so, can someone help us enable GitHub actions on all starter repos?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks for taking a look. We actually considered supporting more
>>>>>> runners in them, but the complexity and maintenance burden of setting up
>>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>>> high. We didn't want to *only* support the Dataflow runner either,
>>>>>> so we simply linked to the runners documentation from the README. It could
>>>>>> be nice to support that at some point, but I think a better solution is to
>>>>>> improve the documentation on the runners page.
>>>>>>
>>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>>> dannymccormick@google.com> wrote:
>>>>>>
>>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>>> Go). Thanks for putting it together David!
>>>>>>>
>>>>>>> My only substantial feedback is that it was tricky to move from the
>>>>>>> Direct runner to a different runner (in my case I was targeting Dataflow) -
>>>>>>> it might be helpful to have instructions on doing that linked from the
>>>>>>> Readme since I imagine starting on Direct then moving to a different runner
>>>>>>> is a pretty common path; I don't think that should block getting this
>>>>>>> initial version in though, just a future improvement suggestion :)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Danny
>>>>>>>
>>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>>
>>>>>>>> Please review the PR since the Python and Go starter projects are
>>>>>>>> blocked until this one merges (so we get all the legal files right).
>>>>>>>>
>>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>
>>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> OK. Bringing an important update on licensing to this thread for
>>>>>>>>>> consideration. Discussion on
>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded
>>>>>>>>>> with key takeaways. These are things that were already true and people who
>>>>>>>>>> are good at this stuff already may know, but I'm just going to say them
>>>>>>>>>> again as I understand them:
>>>>>>>>>>
>>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>>>  - BUT the copyright holders are the contributors to the project.
>>>>>>>>>> They must agree that their contributions can be licensed like this. The ASF
>>>>>>>>>> ICLA only agrees to ASL2 so we need to let them know. I suggest a
>>>>>>>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>>>>>>>> a checkbox*.
>>>>>>>>>>  - If we want, we can include a README that explains this and
>>>>>>>>>> tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>>
>>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>>> sure.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> My (likely unsurprising) take is that this is worth it (though I
>>>>>>>>> also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> *Exactly how formal we need to get here is a matter of some
>>>>>>>>>> debate and risk tolerance. For these repos I think there is very little
>>>>>>>>>> risk. One could even argue the contents are so unoriginal as to be
>>>>>>>>>> uncopyrightable, but the bar in the US for IP is comically low, so that's
>>>>>>>>>> not a good argument to depend on.
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Can we create an empty file in each directory so I can fork the
>>>>>>>>>>>> repo? It doesn't look like there is a workaround to cloning empty repos in
>>>>>>>>>>>> GitHub. Then I can send a pull request.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and go",
>>>>>>>>>>>>>>>>> but I'd like to keep them as minimal as possible. We could have another
>>>>>>>>>>>>>>>>> repo like `beam-working-examples` for more complete examples where each
>>>>>>>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>>>> course the problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of
>>>>>>>>>>>>>>>>>> these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal set
>>>>>>>>>>>>>>>>>>> of starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I
>>>>>>>>>>>>>>>>>>>>> think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so
>>>>>>>>>>>>>>>>>>>>> it is "least surprise"
>>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to
>>>>>>>>>>>>>>>>>>>>> carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>>> apply to the user's code and I would guess they simply won't bother with
>>>>>>>>>>>>>>>>>>>>> it for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you
>>>>>>>>>>>>>>>>>>>>> have to include the attributions from it. The NOTICE file is required by
>>>>>>>>>>>>>>>>>>>>> ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with
>>>>>>>>>>>>>>>>>>>>> ASL2 and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub
>>>>>>>>>>>>>>>>>>>>>>>>> actions, and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation
>>>>>>>>>>>>>>>>>>>>>>>>>> was to keep it simple. But of course we should keep it simple for users,
>>>>>>>>>>>>>>>>>>>>>>>>>> not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT
>>>>>>>>>>>>>>>>>>>>>>>>>> license and requesting the repos be created. Not sure if it needs my level
>>>>>>>>>>>>>>>>>>>>>>>>>> of privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't
>>>>>>>>>>>>>>>>>>>>>>>>>>> have any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We
>>>>>>>>>>>>>>>>>>>>>>>>>>> want these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's
>>>>>>>>>>>>>>>>>>>>>>>>>>> best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question
>>>>>>>>>>>>>>>>>>>>>>>>>>> for Apache Legal?)
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which
>>>>>>>>>>>>>>>>>>>>>>>>>>> is the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal
>>>>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can
>>>>>>>>>>>>>>>>>>>>>>>>>>> help us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>> go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know
>>>>>>>>>>>>>>>>>>>>>>>>>>> what one would put in a Python repo then, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on apache_beam" is answered by David's initial email and
>>>>>>>>>>>>>>>>>>>>>>>>>>> his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT
>>>>>>>>>>>>>>>>>>>>>>>>>>> but to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>> users to be told to check out the git repo for the language of their choice
>>>>>>>>>>>>>>>>>>>>>>>>>>> and run it. Some repos will need more or less setup than others.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up
>>>>>>>>>>>>>>>>>>>>>>>>>>> a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one
>>>>>>>>>>>>>>>>>>>>>>>>>>> would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> then, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have
>>>>>>>>>>>>>>>>>>>>>>>>>>> recommendations on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam
>>>>>>>>>>>>>>>>>>>>>>>>>>> and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik
>>>>>>>>>>>>>>>>>>>>>>>>>>> <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the
>>>>>>>>>>>>>>>>>>>>>>>>>>> archetypes. Less maintenance is preferable, and the github repos are more
>>>>>>>>>>>>>>>>>>>>>>>>>>> flexible and maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate
>>>>>>>>>>>>>>>>>>>>>>>>>>> if we discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can
>>>>>>>>>>>>>>>>>>>>>>>>>>> we drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo
>>>>>>>>>>>>>>>>>>>>>>>>>>> per language, and having all the build systems we want to support for them.
>>>>>>>>>>>>>>>>>>>>>>>>>>> As long as we document which files are for which build system. That way
>>>>>>>>>>>>>>>>>>>>>>>>>>> there are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM
>>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely
>>>>>>>>>>>>>>>>>>>>>>>>>>> more flexible than the archetypes but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), and they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the
>>>>>>>>>>>>>>>>>>>>>>>>>>> main repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get
>>>>>>>>>>>>>>>>>>>>>>>>>>> them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM
>>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now
>>>>>>>>>>>>>>>>>>>>>>>>>>> we only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>> on creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09
>>>>>>>>>>>>>>>>>>>>>>>>>>> AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was
>>>>>>>>>>>>>>>>>>>>>>>>>>> there any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54
>>>>>>>>>>>>>>>>>>>>>>>>>>> PM Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>> repo, would we put this somewhere like apache/beam-java-template? I think
>>>>>>>>>>>>>>>>>>>>>>>>>>> apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30
>>>>>>>>>>>>>>>>>>>>>>>>>>> AM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>>> 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to
>>>>>>>>>>>>>>>>>>>>>>>>>>> create a new Beam Java project, I've been working on a GitHub template
>>>>>>>>>>>>>>>>>>>>>>>>>>> containing a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle,
>>>>>>>>>>>>>>>>>>>>>>>>>>> sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions
>>>>>>>>>>>>>>>>>>>>>>>>>>> on how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make
>>>>>>>>>>>>>>>>>>>>>>>>>>> sure everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Ahmet Altay <al...@google.com>.
Hello all,

just checking:

@David Cavazos <dc...@google.com> - were you able to enable GH actions
on the new repos?
@Damon Douglas <da...@google.com> - Did you get an answer to your
question?

Thank you!
Ahmet


On Thu, May 12, 2022 at 11:53 AM Damon Douglas <da...@google.com>
wrote:

> Good day, @David Cavazos <dc...@google.com> I was recently able to
> solve using terraform to create a Cloud Build trigger for provisioning
> Dataflow custom templates.  I wanted to check in first before initiating a
> pull request on https://github.com/davidcavazos/beam-java.
>
> I was considering the PR to add a directory called infrastructure/google
> with all the terraform someone would need to provision a service account,
> custom network, IAM permissions, etc as well as the Cloud Build
> integration.  Would this be helpful?  The reason for infrastructure/google
> instead of just infrastructure is that I wanted to leave room for others to
> potentially add their own cloud variants i.e. infrastructure/aws.
>
> Best,
>
> Damon
>
> On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:
>
>> @David Cavazos <dc...@google.com> - Were you able to resolve this?
>> And what exactly does a person need to do to enable GH actions?
>>
>> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni <re...@google.com>, @Robert
>>> Bradshaw <ro...@google.com> can any of you help us enable GitHub
>>> actions on all the starter repositories? Thanks!
>>>
>>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Good news! The Java starter repo has been merged! 🎉
>>>>
>>>> However, Ahmet noticed that the tests are not running automatically. I
>>>> tested them in my personal repo and they work, but I think GitHub actions
>>>> have to be enabled in the new starter repos. I don't have permission to do
>>>> so, can someone help us enable GitHub actions on all starter repos?
>>>>
>>>> Thanks!
>>>>
>>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> Thanks for taking a look. We actually considered supporting more
>>>>> runners in them, but the complexity and maintenance burden on setting up
>>>>> and supporting multiple runners in the testing infrastructure was quite
>>>>> high. We didn't want to *only* support the Dataflow runner either, so
>>>>> we simply linked to the runners documentation from the README. It could be
>>>>> nice to support that at some point, but I think a better solution is to
>>>>> improve the documentation on the runners page.
>>>>>
>>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>>> dannymccormick@google.com> wrote:
>>>>>
>>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>>> Go). Thanks for putting it together David!
>>>>>>
>>>>>> My only substantial feedback is that it was tricky to move from the
>>>>>> Direct runner to a different runner (in my case I was targeting Dataflow) -
>>>>>> it might be helpful to have instructions on doing that linked from the
>>>>>> Readme since I imagine starting on Direct then moving to a different runner
>>>>>> is a pretty common path; I don't think that should block getting this
>>>>>> initial version in though, just a future improvement suggestion :)
>>>>>>
>>>>>> Thanks,
>>>>>> Danny
>>>>>>
>>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>>
>>>>>>> Please review the PR since the Python and Go starter projects are
>>>>>>> blocked until this one merges (so we get all the legal files right).
>>>>>>>
>>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> OK. Bringing an important update on licensing to this thread for
>>>>>>>>> consideration. Discussion on
>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded
>>>>>>>>> with key takeaways. These are things that were already true and people who
>>>>>>>>> are good at this stuff already may know, but I'm just going to say them
>>>>>>>>> again as I understand them:
>>>>>>>>>
>>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>>  - BUT the copyright holders are the contributors to the project.
>>>>>>>>> They must agree that their contributions can be licensed like this. The ASF
>>>>>>>>> ICLA only agrees to ASL2 so we need to let them know. I suggest a
>>>>>>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>>>>>>> a checkbox*.
>>>>>>>>>  - If we want, we can include a README that explains this and
>>>>>>>>> tells users they can delete the bits related to ASL2/ASF and
>>>>>>>>> CONTRIBUTING.md if they want to change it however they want.
>>>>>>>>>
>>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>>> sure.
>>>>>>>>>
>>>>>>>>
>>>>>>>> My (likely unsurprising) take is that this is worth it (though I
>>>>>>>> also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>>
>>>>>>>>
>>>>>>>>> *Exactly how formal we need to get here is a matter of some debate
>>>>>>>>> and risk tolerance. For these repos I think there is very little risk. One
>>>>>>>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>>>>>>>> but the bar in the US for i.p. is comically low so that's not a good
>>>>>>>>> argument to depend on.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Friendly ping on this :)
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Can we create an empty file in each directory so I can fork the
>>>>>>>>>>> repo? It doesn't look like there is a workaround to cloning empty repos in
>>>>>>>>>>> GitHub. Then I can send a pull request.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>>
>>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <
>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>>
>>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>> For the starter projects I like them being "clone and go",
>>>>>>>>>>>>>>>> but I'd like to keep them as minimal as possible. We could have another
>>>>>>>>>>>>>>>> repo like `beam-working-examples` for more complete examples where each
>>>>>>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>>> course the problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of
>>>>>>>>>>>>>>>>> these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal set
>>>>>>>>>>>>>>>>>> of starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <
>>>>>>>>>>>>>>>>>> rebo@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I
>>>>>>>>>>>>>>>>>>>> think it will be simplest to license it under ASL2 and include a NOTICE
>>>>>>>>>>>>>>>>>>>> file. The user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so
>>>>>>>>>>>>>>>>>>>> it is "least surprise"
>>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to
>>>>>>>>>>>>>>>>>>>> carry prominent notices stating that You changed the files" which won't
>>>>>>>>>>>>>>>>>>>> apply to the user's code, and I would guess they simply won't bother with
>>>>>>>>>>>>>>>>>>>> it for files in the template. Or maybe there is a clever way to phrase the
>>>>>>>>>>>>>>>>>>>> header so it is already good to go.
>>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you
>>>>>>>>>>>>>>>>>>>> have to include the attributions from it. The NOTICE file is required by
>>>>>>>>>>>>>>>>>>>> ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2
>>>>>>>>>>>>>>>>>>>> and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>>> seems right. For each language configuration we're targeting, we basically
>>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions,
>>>>>>>>>>>>>>>>>>>>>>>> and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation
>>>>>>>>>>>>>>>>>>>>>>>>> was to keep it simple. But of course we should keep it simple for users,
>>>>>>>>>>>>>>>>>>>>>>>>> not us :-)
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license
>>>>>>>>>>>>>>>>>>>>>>>>> and requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't
>>>>>>>>>>>>>>>>>>>>>>>>>> have any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want
>>>>>>>>>>>>>>>>>>>>>>>>>> these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible.
>>>>>>>>>>>>>>>>>>>>>>>>>> That's not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's
>>>>>>>>>>>>>>>>>>>>>>>>>> best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>>>>>>>> Apache legal?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is
>>>>>>>>>>>>>>>>>>>>>>>>>> the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tl;dr: these are minimal
>>>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help
>>>>>>>>>>>>>>>>>>>>>>>>>> us create them?
>>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth
>>>>>>>>>>>>>>>>>>>>>>>>>> Knowles <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and
>>>>>>>>>>>>>>>>>>>>>>>>>> go" is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know
>>>>>>>>>>>>>>>>>>>>>>>>>> what one would put in a Python repo then, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on apache_beam" is answered by David's initial email and
>>>>>>>>>>>>>>>>>>>>>>>>>> his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but
>>>>>>>>>>>>>>>>>>>>>>>>>> to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to
>>>>>>>>>>>>>>>>>>>>>>>>>> users to be told to check out this git repo for the language of your choice
>>>>>>>>>>>>>>>>>>>>>>>>>> and run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up
>>>>>>>>>>>>>>>>>>>>>>>>>> a project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one
>>>>>>>>>>>>>>>>>>>>>>>>>> would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> then, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations
>>>>>>>>>>>>>>>>>>>>>>>>>> on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of
>>>>>>>>>>>>>>>>>>>>>>>>>> generic advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam
>>>>>>>>>>>>>>>>>>>>>>>>>> and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already
>>>>>>>>>>>>>>>>>>>>>>>>>> within the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>> Less maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM
>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate
>>>>>>>>>>>>>>>>>>>>>>>>>> if we discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we
>>>>>>>>>>>>>>>>>>>>>>>>>> drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM
>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM
>>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely
>>>>>>>>>>>>>>>>>>>>>>>>>> more flexible than the archetypes but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main
>>>>>>>>>>>>>>>>>>>>>>>>>> repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get
>>>>>>>>>>>>>>>>>>>>>>>>>> them to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM
>>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven
>>>>>>>>>>>>>>>>>>>>>>>>>> archetype, but that wouldn't work very well for Gradle and SBT users. I
>>>>>>>>>>>>>>>>>>>>>>>>>> think a GitHub template might be the more flexible option, and we could
>>>>>>>>>>>>>>>>>>>>>>>>>> have something similar for other languages as well. Having said that, we
>>>>>>>>>>>>>>>>>>>>>>>>>> could still create a Maven archetype. If someone is familiar with that
>>>>>>>>>>>>>>>>>>>>>>>>>> process, please let me know since I'm not too familiar with Maven and its
>>>>>>>>>>>>>>>>>>>>>>>>>> ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now
>>>>>>>>>>>>>>>>>>>>>>>>>> we only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM
>>>>>>>>>>>>>>>>>>>>>>>>>> Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there
>>>>>>>>>>>>>>>>>>>>>>>>>> any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache
>>>>>>>>>>>>>>>>>>>>>>>>>> repo, would we put this somewhere like apache/beam-java-template? I think
>>>>>>>>>>>>>>>>>>>>>>>>>> apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30
>>>>>>>>>>>>>>>>>>>>>>>>>> AM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at
>>>>>>>>>>>>>>>>>>>>>>>>>> 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create
>>>>>>>>>>>>>>>>>>>>>>>>>> a new Beam Java project, I've been working on a GitHub template containing
>>>>>>>>>>>>>>>>>>>>>>>>>> a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>>> template:
>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle,
>>>>>>>>>>>>>>>>>>>>>>>>>> sbt, and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on
>>>>>>>>>>>>>>>>>>>>>>>>>> how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Damon Douglas <da...@google.com>.
Good day, @David Cavazos <dc...@google.com>. I recently worked out how to
use Terraform to create a Cloud Build trigger for provisioning Dataflow
custom templates.  I wanted to check in first before opening a pull
request on https://github.com/davidcavazos/beam-java.

I was considering having the PR add a directory called infrastructure/google
with all the Terraform someone would need to provision a service account,
custom network, IAM permissions, etc., as well as the Cloud Build
integration.  Would this be helpful?  The reason for infrastructure/google
instead of just infrastructure is that I wanted to leave room for others to
potentially add their own cloud variants, e.g. infrastructure/aws.

Best,

Damon

On Mon, May 9, 2022 at 2:04 PM Ahmet Altay <al...@google.com> wrote:

> @David Cavazos <dc...@google.com> - Were you able to resolve this? And
> what exactly does a person need to do to enable GH actions?
>
> On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com>
> wrote:
>
>> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni <re...@google.com>, @Robert
>> Bradshaw <ro...@google.com> can any of you help us enable GitHub
>> actions on all the starter repositories? Thanks!
>>
>> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> Good news! The Java starter repo has been merged! 🎉
>>>
>>> However, Ahmet noticed that the tests are not running automatically. I
>>> tested them in my personal repo and they work, but I think GitHub actions
>>> have to be enabled in the new starter repos. I don't have permission to do
>>> so, can someone help us enable GitHub actions on all starter repos?
>>>
>>> Thanks!
>>>
>>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Thanks for taking a look. We actually considered supporting more
>>>> runners in them, but the complexity and maintenance burden of setting up
>>>> and supporting multiple runners in the testing infrastructure was quite
>>>> high. We didn't want to *only* support the Dataflow runner either, so
>>>> we simply linked to the runners documentation from the README. It could be
>>>> nice to support that at some point, but I think a better solution is to
>>>> improve the documentation on the runners page.
>>>>
>>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> I'm not a Java expert so I can't do a thorough review (and I
>>>>> definitely can't help on the legal end), but I tried using the template for
>>>>> a personal toy project 2 weeks ago and found it really helpful (this was my
>>>>> first time writing a Java pipeline, previously I'd written everything in
>>>>> Go). Thanks for putting it together David!
>>>>>
>>>>> My only substantial feedback is that it was tricky to move from the
>>>>> Direct runner to a different runner (in my case I was targeting Dataflow) -
>>>>> it might be helpful to have instructions on doing that linked from the
>>>>> Readme since I imagine starting on Direct then moving to a different runner
>>>>> is a pretty common path; I don't think that should block getting this
>>>>> initial version in though, just a future improvement suggestion :)
>>>>>
>>>>> Thanks,
>>>>> Danny
>>>>>
>>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>>
>>>>>> Please review the PR since the Python and Go starter projects are
>>>>>> blocked until this one merges (so we get all the legal files right).
>>>>>>
>>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>>
>>>>>>
>>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> OK. Bringing an important update on licensing to this thread for
>>>>>>>> consideration. Discussion on
>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with
>>>>>>>> key takeaways. These are things that were already true and people who are
>>>>>>>> good at this stuff already may know, but I'm just going to say them again
>>>>>>>> as I understand them:
>>>>>>>>
>>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>>  - BUT the copyright holders are the contributors to the project.
>>>>>>>> They must agree that their contributions can be licensed like this. The ASF
>>>>>>>> ICLA only agrees to ASL2 so we need to let them know. I suggest a
>>>>>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>>>>>> a checkbox*.
>>>>>>>>  - If we want, we can include a README that explains this and tells
>>>>>>>> users they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if
>>>>>>>> they want to change it however they want.
>>>>>>>>
>>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>>> sure.
>>>>>>>>
>>>>>>>
>>>>>>> My (likely unsurprising) take is that this is worth it (though I
>>>>>>> also agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>>
>>>>>>>
>>>>>>>> *Exactly how formal we need to get here is a matter of some debate
>>>>>>>> and risk tolerance. For these repos I think there is very little risk. One
>>>>>>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>>>>>>> but the bar in the US for i.p. is comically low so that's not a good
>>>>>>>> argument to depend on.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Friendly ping on this :)
>>>>>>>>>
>>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Can we create an empty file on each directory so I can fork the
>>>>>>>>>> repo? It doesn't look like there is a workaround to cloning empty repos in
>>>>>>>>>> GitHub. Then I can send a pull request.
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>>
>>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>>> there's some step by step at
>>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>>
>>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>>
>>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>>
>>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>>> file?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>> For the starter projects I like them being "clone and go",
>>>>>>>>>>>>>>> but I'd like to keep them as minimal as possible. We could have another
>>>>>>>>>>>>>>> repo like `beam-working-examples` for more complete examples where each
>>>>>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I like the goal: for things where the build has extra
>>>>>>>>>>>>>>>> setup, have an example that is fully functional on its own. There is of
>>>>>>>>>>>>>>>> course the problem of "where does it end?" since the possibilities are endless.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of
>>>>>>>>>>>>>>>> these bits may be past the "clone and go" stage of their project. They
>>>>>>>>>>>>>>>> probably already have a project and now they need a working example to read
>>>>>>>>>>>>>>>> and learn from. So it could be just one additional repo
>>>>>>>>>>>>>>>> `beam-working-examples` where each subdirectory is an independent working
>>>>>>>>>>>>>>>> setup. I do like having it a separate repo to avoid the temptation to
>>>>>>>>>>>>>>>> leverage anything from the Beam build. And each subdirectory should be
>>>>>>>>>>>>>>>> entirely independent and we also have to avoid the temptation to share
>>>>>>>>>>>>>>>> configuration across them, or it would defeat the purpose.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What do folks think about also having a less minimal set
>>>>>>>>>>>>>>>>> of starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>>>>>> 'hello' world samples to get folks going.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think
>>>>>>>>>>>>>>>>>>> it will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it
>>>>>>>>>>>>>>>>>>> is "least surprise"
>>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not
>>>>>>>>>>>>>>>>>>> worthwhile due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>>>>>>>>> the user's code and I would guess they simply won't bother with for files
>>>>>>>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you
>>>>>>>>>>>>>>>>>>> have to include the attributions from it. The NOTICE file is required by
>>>>>>>>>>>>>>>>>>> ASF policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2
>>>>>>>>>>>>>>>>>>> and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions,
>>>>>>>>>>>>>>>>>>>>>>> and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was
>>>>>>>>>>>>>>>>>>>>>>>> to keep it simple. But of course we should keep it simple for users, not us
>>>>>>>>>>>>>>>>>>>>>>>> :-)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license
>>>>>>>>>>>>>>>>>>>>>>>> and requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't
>>>>>>>>>>>>>>>>>>>>>>>>> have any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want
>>>>>>>>>>>>>>>>>>>>>>>>> these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's
>>>>>>>>>>>>>>>>>>>>>>>>> not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's
>>>>>>>>>>>>>>>>>>>>>>>>> best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>>>>>>> apache legal?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is
>>>>>>>>>>>>>>>>>>>>>>>>> the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal
>>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help
>>>>>>>>>>>>>>>>>>>>>>>>> us create them?
>>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw
>>>>>>>>>>>>>>>>>>>>>>>>> <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>> <ke...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go"
>>>>>>>>>>>>>>>>>>>>>>>>> is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know
>>>>>>>>>>>>>>>>>>>>>>>>> what one would put in a Python repo, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on apache_beam" is answered by David's initial email and
>>>>>>>>>>>>>>>>>>>>>>>>> his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but
>>>>>>>>>>>>>>>>>>>>>>>>> to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>> users to be told to check out the git repo for the language of their choice
>>>>>>>>>>>>>>>>>>>>>>>>> and run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one
>>>>>>>>>>>>>>>>>>>>>>>>> would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists
>>>>>>>>>>>>>>>>>>>>>>>>> a dependency on
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations
>>>>>>>>>>>>>>>>>>>>>>>>> on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam
>>>>>>>>>>>>>>>>>>>>>>>>> and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within
>>>>>>>>>>>>>>>>>>>>>>>>> the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each
>>>>>>>>>>>>>>>>>>>>>>>>> language.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>> Less maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that
>>>>>>>>>>>>>>>>>>>>>>>>> people can pull out a specific version of the examples that coincides with
>>>>>>>>>>>>>>>>>>>>>>>>> a specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if
>>>>>>>>>>>>>>>>>>>>>>>>> we discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we
>>>>>>>>>>>>>>>>>>>>>>>>> drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM
>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM
>>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely
>>>>>>>>>>>>>>>>>>>>>>>>> more flexible than the archetypes but the archetypes have a few
>>>>>>>>>>>>>>>>>>>>>>>>> conveniences since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main
>>>>>>>>>>>>>>>>>>>>>>>>> repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the
>>>>>>>>>>>>>>>>>>>>>>>>> starter happen?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them
>>>>>>>>>>>>>>>>>>>>>>>>> to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM
>>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype,
>>>>>>>>>>>>>>>>>>>>>>>>> but that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now
>>>>>>>>>>>>>>>>>>>>>>>>> we only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM
>>>>>>>>>>>>>>>>>>>>>>>>> Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there
>>>>>>>>>>>>>>>>>>>>>>>>> any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache
>>>>>>>>>>>>>>>>>>>>>>>>> repo, would we put this somewhere like apache/beam-java-template? I think
>>>>>>>>>>>>>>>>>>>>>>>>> apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30
>>>>>>>>>>>>>>>>>>>>>>>>> AM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31
>>>>>>>>>>>>>>>>>>>>>>>>> AM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create
>>>>>>>>>>>>>>>>>>>>>>>>> a new Beam Java project, I've been working on a GitHub template containing
>>>>>>>>>>>>>>>>>>>>>>>>> a minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt,
>>>>>>>>>>>>>>>>>>>>>>>>> and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on
>>>>>>>>>>>>>>>>>>>>>>>>> how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Ahmet Altay <al...@google.com>.
@David Cavazos <dc...@google.com> - Were you able to resolve this? And
what exactly does a person need to do to enable GH actions?

On Fri, Apr 29, 2022 at 12:04 PM David Cavazos <dc...@google.com> wrote:

> @Kenneth Knowles <ke...@apache.org>, @Reza Rokni <re...@google.com>, @Robert
> Bradshaw <ro...@google.com> can any of you help us enable GitHub
> actions on all the starter repositories? Thanks!
>
> On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com>
> wrote:
>
>> Good news! The Java starter repo has been merged! 🎉
>>
>> However, Ahmet noticed that the tests are not running automatically. I
>> tested them in my personal repo and they work, but I think GitHub actions
>> have to be enabled in the new starter repos. I don't have permission to do
>> so, can someone help us enable GitHub actions on all starter repos?
>>
>> Thanks!
>>
>> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> Thanks for taking a look. We actually considered supporting more runners
>>> in them, but the complexity and maintenance burden of setting up and
>>> supporting multiple runners in the testing infrastructure was quite high.
>>> We didn't want to *only* support the Dataflow runner either, so we
>>> simply linked to the runners documentation from the README. It could be
>>> nice to support that at some point, but I think a better solution is to
>>> improve the documentation on the runners page.
>>>
>>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> I'm not a Java expert so I can't do a thorough review (and I definitely
>>>> can't help on the legal end), but I tried using the template for a personal
>>>> toy project 2 weeks ago and found it really helpful (this was my first time
>>>> writing a Java pipeline, previously I'd written everything in Go). Thanks
>>>> for putting it together David!
>>>>
>>>> My only substantial feedback is that it was tricky to move from the
>>>> Direct runner to a different runner (in my case I was targeting Dataflow) -
>>>> it might be helpful to have instructions on doing that linked from the
>>>> Readme since I imagine starting on Direct then moving to a different runner
>>>> is a pretty common path; I don't think that should block getting this
>>>> initial version in though, just a future improvement suggestion :)
>>>>
>>>> Thanks,
>>>> Danny
>>>>
>>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>>
>>>>> Please review the PR since the Python and Go starter projects are
>>>>> blocked until this one merges (so we get all the legal files right).
>>>>>
>>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>>
>>>>>
>>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> OK. Bringing an important update on licensing to this thread for
>>>>>>> consideration. Discussion on
>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with
>>>>>>> key takeaways. These are things that were already true and people who are
>>>>>>> good at this stuff already may know, but I'm just going to say them again
>>>>>>> as I understand them:
>>>>>>>
>>>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give
>>>>>>> "users" the permissions of both licenses - they can take their pick so they
>>>>>>> can treat it as MIT-0 licensed.
>>>>>>>  - BUT the copyright holders are the contributors to the project.
>>>>>>> They must agree that their contributions can be licensed like this. The ASF
>>>>>>> ICLA only agrees to ASL2 so we need to let them know. I suggest a
>>>>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>>>>> a checkbox*.
>>>>>>>  - If we want, we can include a README that explains this and tells
>>>>>>> users they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if
>>>>>>> they want to change it however they want.
>>>>>>>
>>>>>>> So I guess now the decision is whether all of the above is
>>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>>> sure.
>>>>>>>
>>>>>>
>>>>>> My (likely unsurprising) take is that this is worth it (though I also
>>>>>> agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>>
>>>>>>
>>>>>>> *Exactly how formal we need to get here is a matter of some debate
>>>>>>> and risk tolerance. For these repos I think there is very little risk. One
>>>>>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>>>>>> but the bar in the US for i.p. is comically low so that's not a good
>>>>>>> argument to depend on.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Friendly ping on this :)
>>>>>>>>
>>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Can we create an empty file on each directory so I can fork the
>>>>>>>>> repo? It doesn't look like there is a workaround to cloning empty repos in
>>>>>>>>> GitHub. Then I can send a pull request.
>>>>>>>>>
>>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>>
>>>>>>>>>> I was trying to create a PR to merge the starter project
>>>>>>>>>> contents, but I can't fork the repo because it's empty. Can I either get
>>>>>>>>>> permissions to directly push or bother you with creating an empty README or
>>>>>>>>>> some other file so I can fork it and open a PR? Thanks!
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and
>>>>>>>>>>> there's some step by step at
>>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>>
>>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>>
>>>>>>>>>>>     Apache Beam
>>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>>
>>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>>> file?
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>> For the starter projects I like them being "clone and go",
>>>>>>>>>>>>>> but I'd like to keep them as minimal as possible. We could have another
>>>>>>>>>>>>>> repo like `beam-working-examples` for more complete examples where each
>>>>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I like the goal: for things where the build has extra setup,
>>>>>>>>>>>>>>> have an example that is fully functional on its own. There is of course the
>>>>>>>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The other piece is that a user wanting to know some of these
>>>>>>>>>>>>>>> bits may be past the "clone and go" stage of their project. They probably
>>>>>>>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>>>>>>>> would defeat the purpose.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think
>>>>>>>>>>>>>>>>>> it will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it
>>>>>>>>>>>>>>>>>> is "least surprise"
>>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile
>>>>>>>>>>>>>>>>>> due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>>>>>>>> the user's code and I would guess they simply won't bother with it for files
>>>>>>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have
>>>>>>>>>>>>>>>>>> to include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2
>>>>>>>>>>>>>>>>>> and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions,
>>>>>>>>>>>>>>>>>>>>>> and may be able to help out.
>>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was
>>>>>>>>>>>>>>>>>>>>>>> to keep it simple. But of course we should keep it simple for users, not us
>>>>>>>>>>>>>>>>>>>>>>> :-)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license
>>>>>>>>>>>>>>>>>>>>>>> and requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't
>>>>>>>>>>>>>>>>>>>>>>>> have any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want
>>>>>>>>>>>>>>>>>>>>>>>> these to be pretty
>>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's
>>>>>>>>>>>>>>>>>>>>>>>> not what the Apache
>>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's
>>>>>>>>>>>>>>>>>>>>>>>> best to be explicit
>>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>>>>>> apache legal?)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is
>>>>>>>>>>>>>>>>>>>>>>>> the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal
>>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help
>>>>>>>>>>>>>>>>>>>>>>>> us create them?
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go"
>>>>>>>>>>>>>>>>>>>>>>>> is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know
>>>>>>>>>>>>>>>>>>>>>>>> what one would put in a Python repo other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on apache_beam" is answered by David's initial email and
>>>>>>>>>>>>>>>>>>>>>>>> his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but
>>>>>>>>>>>>>>>>>>>>>>>> to be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to
>>>>>>>>>>>>>>>>>>>>>>>> users to be told to checkout this git repo for the language of your choice
>>>>>>>>>>>>>>>>>>>>>>>> and run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one
>>>>>>>>>>>>>>>>>>>>>>>> would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists
>>>>>>>>>>>>>>>>>>>>>>>> a dependency on
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations
>>>>>>>>>>>>>>>>>>>>>>>> on file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam
>>>>>>>>>>>>>>>>>>>>>>>> and your package.json file
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within
>>>>>>>>>>>>>>>>>>>>>>>> the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>> Less maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith
>>>>>>>>>>>>>>>>>>>>>>>> Malvetti would prefer having repos for all languages. It makes sense for
>>>>>>>>>>>>>>>>>>>>>>>> consistency as well.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people
>>>>>>>>>>>>>>>>>>>>>>>> can pull out a specific version of the examples that coincides with a
>>>>>>>>>>>>>>>>>>>>>>>> specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if
>>>>>>>>>>>>>>>>>>>>>>>> we discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we
>>>>>>>>>>>>>>>>>>>>>>>> drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke
>>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM
>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of
>>>>>>>>>>>>>>>>>>>>>>>> updating the Beam version and other dependencies automatically. Testing is
>>>>>>>>>>>>>>>>>>>>>>>> already set up via GitHub actions for every pull request, so it would
>>>>>>>>>>>>>>>>>>>>>>>> automatically be tested as soon as there is a new dependency version
>>>>>>>>>>>>>>>>>>>>>>>> available.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I
>>>>>>>>>>>>>>>>>>>>>>>> don't expect them to break commonly, but I think it would be good to make
>>>>>>>>>>>>>>>>>>>>>>>> sure tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM
>>>>>>>>>>>>>>>>>>>>>>>> Luke Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main
>>>>>>>>>>>>>>>>>>>>>>>> repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them
>>>>>>>>>>>>>>>>>>>>>>>> to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM
>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype,
>>>>>>>>>>>>>>>>>>>>>>>> but that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we
>>>>>>>>>>>>>>>>>>>>>>>> only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM
>>>>>>>>>>>>>>>>>>>>>>>> Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there
>>>>>>>>>>>>>>>>>>>>>>>> any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache
>>>>>>>>>>>>>>>>>>>>>>>> repo, would we put this somewhere like apache/beam-java-template? I think
>>>>>>>>>>>>>>>>>>>>>>>> apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM
>>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31
>>>>>>>>>>>>>>>>>>>>>>>> AM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a
>>>>>>>>>>>>>>>>>>>>>>>> new Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt,
>>>>>>>>>>>>>>>>>>>>>>>> and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on
>>>>>>>>>>>>>>>>>>>>>>>> how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
@Kenneth Knowles <ke...@apache.org>, @Reza Rokni <re...@google.com>, @Robert
Bradshaw <ro...@google.com> can any of you help us enable GitHub actions
on all the starter repositories? Thanks!

On Fri, Apr 22, 2022 at 10:53 AM David Cavazos <dc...@google.com> wrote:

> Good news! The Java starter repo has been merged! 🎉
>
> However, Ahmet noticed that the tests are not running automatically. I
> tested them in my personal repo and they work, but I think GitHub actions
> have to be enabled in the new starter repos. I don't have permission to do
> so, can someone help us enable GitHub actions on all starter repos?
>
> Thanks!
>
> On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com>
> wrote:
>
>> Thanks for taking a look. We actually considered supporting more runners
>> in them, but the complexity and maintenance burden of setting up and
>> supporting multiple runners in the testing infrastructure were quite high.
>> We didn't want to *only* support the Dataflow runner either, so we
>> simply linked to the runners documentation from the README. It could be
>> nice to support more runners at some point, but I think a better solution is to
>> improve the documentation on the runners page.
>>
>> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <da...@google.com>
>> wrote:
>>
>>> I'm not a Java expert so I can't do a thorough review (and I definitely
>>> can't help on the legal end), but I tried using the template for a personal
>>> toy project 2 weeks ago and found it really helpful (this was my first time
>>> writing a Java pipeline, previously I'd written everything in Go). Thanks
>>> for putting it together David!
>>>
>>> My only substantial feedback is that it was tricky to move from the
>>> Direct runner to a different runner (in my case I was targeting Dataflow).
>>> It might be helpful to have instructions on doing that linked from the
>>> README, since I imagine starting on Direct then moving to a different runner
>>> is a pretty common path. I don't think that should block getting this
>>> initial version in though, just a future improvement suggestion :)
>>>
>>> Thanks,
>>> Danny
>>>
>>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> I've added the dual license along with the CONTRIBUTING.md and
>>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>>
>>>> Please review the PR since the Python and Go starter projects are
>>>> blocked until this one merges (so we get all the legal files right).
>>>>
>>>> https://github.com/apache/beam-starter-java/pull/1
>>>>
>>>>
>>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> OK. Bringing an important update on licensing to this thread for
>>>>>> consideration. Discussion on
>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with
>>>>>> key takeaways. These are things that were already true and people who are
>>>>>> good at this stuff already may know, but I'm just going to say them again
>>>>>> as I understand them:
>>>>>>
>>>>>>  - We can dual license under MIT-0 and ASL2, which means "we" give "users"
>>>>>> the permissions of both licenses - they can take their pick, so they can
>>>>>> treat it as MIT-0 licensed.
>>>>>>  - BUT the copyright holders are the contributors to the project.
>>>>>> They must agree that their contributions can be licensed like this. The ASF
>>>>>> ICLA only agrees to ASL2 so we need to let them know. I suggest a
>>>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>>>> a checkbox*.
>>>>>>  - If we want, we can include a README that explains this and tells
>>>>>> users they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if
>>>>>> they want to change it however they want.
>>>>>>
>>>>>> So I guess now the decision is whether all of the above is
>>>>>> complicated enough for users that it outweighs the benefit. I'm not really
>>>>>> sure.
>>>>>>
>>>>>
>>>>> My (likely unsurprising) take is that this is worth it (though I also
>>>>> agree with your asterisked footnote). A CONTRIBUTING.md and
>>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>>
>>>>>
>>>>>> *Exactly how formal we need to get here is a matter of some debate
>>>>>> and risk tolerance. For these repos I think there is very little risk. One
>>>>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>>>>> but the bar in the US for i.p. is comically low so that's not a good
>>>>>> argument to depend on.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Friendly ping on this :)
>>>>>>>
>>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Can we create an empty file on each directory so I can fork the
>>>>>>>> repo? It doesn't look like there is a workaround to cloning empty repos in
>>>>>>>> GitHub. Then I can send a pull request.
>>>>>>>>
>>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>>
>>>>>>>>> I was trying to create a PR to merge the starter project contents,
>>>>>>>>> but I can't fork the repo because it's empty. Can I either get permissions
>>>>>>>>> to directly push or bother you with creating an empty README or some other
>>>>>>>>> file so I can fork it and open a PR? Thanks!
>>>>>>>>>
>>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and there's
>>>>>>>>>> some step by step at
>>>>>>>>>> https://infra.apache.org/licensing-howto.html
>>>>>>>>>>
>>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>>
>>>>>>>>>>     Apache Beam
>>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>>
>>>>>>>>>>     This product includes software developed at
>>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I found this example NOTICE
>>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>>> file?
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Can someone point me to an example of what the NOTICE file
>>>>>>>>>>>> should look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1
>>>>>>>>>>>>> For the starter projects I like them being "clone and go", but
>>>>>>>>>>>>> I'd like to keep them as minimal as possible. We could have another repo
>>>>>>>>>>>>> like `beam-working-examples` for more complete examples where each
>>>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>>> everything.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I like the goal: for things where the build has extra setup,
>>>>>>>>>>>>>> have an example that is fully functional on its own. There is of course the
>>>>>>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The other piece is that a user wanting to know some of these
>>>>>>>>>>>>>> bits may be past the "clone and go" stage of their project. They probably
>>>>>>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>>>>>>> would defeat the purpose.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think
>>>>>>>>>>>>>>>>> it will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it
>>>>>>>>>>>>>>>>> is "least surprise"
>>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile
>>>>>>>>>>>>>>>>> due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>>>>>> prominent notices stating that You changed the files", which won't apply to
>>>>>>>>>>>>>>>>> the user's own code, and I would guess they simply won't bother for files
>>>>>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have
>>>>>>>>>>>>>>>>> to include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2
>>>>>>>>>>>>>>>>> and a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>>> seems right. For each language configuration we're targeting, we basically
>>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions,
>>>>>>>>>>>>>>>>>>>>> and may be able to help out.
>>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was
>>>>>>>>>>>>>>>>>>>>>> to keep it simple. But of course we should keep it simple for users, not us
>>>>>>>>>>>>>>>>>>>>>> :-)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license
>>>>>>>>>>>>>>>>>>>>>> and requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have
>>>>>>>>>>>>>>>>>>>>>>> any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to
>>>>>>>>>>>>>>>>>>>>>>> encumber any users of
>>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want
>>>>>>>>>>>>>>>>>>>>>>> these to be pretty
>>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's
>>>>>>>>>>>>>>>>>>>>>>> not what the Apache
>>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good
>>>>>>>>>>>>>>>>>>>>>>> argument could likely be
>>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best
>>>>>>>>>>>>>>>>>>>>>>> to be explicit
>>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>>>>> Apache legal?)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is
>>>>>>>>>>>>>>>>>>>>>>> the most pressing one and the one that is ready, but the rest should be
>>>>>>>>>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, TL;DR: these are minimal
>>>>>>>>>>>>>>>>>>>>>>> starter projects for every language. Once we have Java, Python and Go, it
>>>>>>>>>>>>>>>>>>>>>>> might be a good idea to change the quickstarts to use these instead of the
>>>>>>>>>>>>>>>>>>>>>>> word count. There is already a dedicated word count walkthrough so I think
>>>>>>>>>>>>>>>>>>>>>>> that is already covered.
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go"
>>>>>>>>>>>>>>>>>>>>>>> is a big part of it.
>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>> >> > But also the question "I simply don't know
>>>>>>>>>>>>>>>>>>>>>>> what one would put in a Python repo, other than a bare setup.py that
>>>>>>>>>>>>>>>>>>>>>>> lists a dependency on apache_beam" is answered by David's initial email and
>>>>>>>>>>>>>>>>>>>>>>> his repo, namely:
>>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to
>>>>>>>>>>>>>>>>>>>>>>> be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky
>>>>>>>>>>>>>>>>>>>>>>> because one doesn't want to
>>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense for
>>>>>>>>>>>>>>>>>>>>>>> users to be told to check out the git repo for the language of their choice
>>>>>>>>>>>>>>>>>>>>>>> and run it. Some repos will need more setup than others.
>>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert
>>>>>>>>>>>>>>>>>>>>>>> Bradshaw <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one
>>>>>>>>>>>>>>>>>>>>>>> would put in a Python repo,
>>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on
>>>>>>>>>>>>>>>>>>>>>>> file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is
>>>>>>>>>>>>>>>>>>>>>>> similar, and JavaScript
>>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and
>>>>>>>>>>>>>>>>>>>>>>> your package.json file
>>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within
>>>>>>>>>>>>>>>>>>>>>>> the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes.
>>>>>>>>>>>>>>>>>>>>>>> Less maintenance is preferable, and the GitHub repos are more flexible and
>>>>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti
>>>>>>>>>>>>>>>>>>>>>>> would prefer having repos for all languages. It makes sense for consistency
>>>>>>>>>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke
>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people
>>>>>>>>>>>>>>>>>>>>>>> can pull out a specific version of the examples that coincides with a
>>>>>>>>>>>>>>>>>>>>>>> specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if
>>>>>>>>>>>>>>>>>>>>>>> we discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we
>>>>>>>>>>>>>>>>>>>>>>> drop the archetype?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke
>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating
>>>>>>>>>>>>>>>>>>>>>>> the Beam version and other dependencies automatically. Testing is already
>>>>>>>>>>>>>>>>>>>>>>> set up via GitHub actions for every pull request, so it would automatically
>>>>>>>>>>>>>>>>>>>>>>> be tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them, as
>>>>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke
>>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The GitHub repo is definitely more
>>>>>>>>>>>>>>>>>>>>>>> flexible than the archetypes, but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>>>>>>> since they are integrated with the apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), and they are released when
>>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main
>>>>>>>>>>>>>>>>>>>>>>> repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them
>>>>>>>>>>>>>>>>>>>>>>> to happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM
>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype,
>>>>>>>>>>>>>>>>>>>>>>> but that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we
>>>>>>>>>>>>>>>>>>>>>>> only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM
>>>>>>>>>>>>>>>>>>>>>>> Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there
>>>>>>>>>>>>>>>>>>>>>>> any progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache
>>>>>>>>>>>>>>>>>>>>>>> repo, would we put this somewhere like apache/beam-java-template? I think
>>>>>>>>>>>>>>>>>>>>>>> apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM
>>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31
>>>>>>>>>>>>>>>>>>>>>>> AM David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a
>>>>>>>>>>>>>>>>>>>>>>> new Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt,
>>>>>>>>>>>>>>>>>>>>>>> and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on
>>>>>>>>>>>>>>>>>>>>>>> how to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Good news! The Java starter repo has been merged! 🎉

However, Ahmet noticed that the tests are not running automatically. I
tested them in my personal repo and they work, but I think GitHub Actions
has to be enabled in the new starter repos. I don't have permission to do
that. Can someone help us enable GitHub Actions on all starter repos?
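
For reference, a test workflow of the shape described earlier in this
thread (checkout, setup-<language>, inline script to run tests) could look
roughly like the sketch below. The action versions, the JDK distribution
and version, and the `./gradlew test` command are assumptions for
illustration, not the contents of the starter repo's actual test.yaml.

```yaml
# Sketch only: action versions, JDK settings, and the test command are assumptions.
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3      # checkout
      - uses: actions/setup-java@v3    # setup-<language>
        with:
          distribution: temurin
          java-version: '11'
      - run: ./gradlew test            # inline test script; assumes a Gradle wrapper is committed
```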

Thanks!

On Mon, Apr 11, 2022 at 12:29 PM David Cavazos <dc...@google.com> wrote:

> Thanks for taking a look. We actually considered supporting more runners
> in them, but the complexity and maintenance burden of setting up and
> supporting multiple runners in the testing infrastructure was quite high.
> We didn't want to *only* support the Dataflow runner either, so we simply
> linked to the runners documentation from the README. It could be nice to
> support that at some point, but I think a better solution is to improve the
> documentation on the runners page.
>
> On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <da...@google.com>
> wrote:
>
>> I'm not a Java expert so I can't do a thorough review (and I definitely
>> can't help on the legal end), but I tried using the template for a personal
>> toy project 2 weeks ago and found it really helpful (this was my first time
>> writing a Java pipeline, previously I'd written everything in Go). Thanks
>> for putting it together David!
>>
>> My only substantial feedback is that it was tricky to move from the
>> Direct runner to a different runner (in my case I was targeting Dataflow) -
>> it might be helpful to have instructions on doing that linked from the
>> Readme since I imagine starting on Direct then moving to a different runner
>> is a pretty common path; I don't think that should block getting this
>> initial version in though, just a future improvement suggestion :)
>>
>> Thanks,
>> Danny
>>
>> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com> wrote:
>>
>>> I've added the dual license along with the CONTRIBUTING.md and
>>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>>
>>> Please review the PR since the Python and Go starter projects are
>>> blocked until this one merges (so we get all the legal files right).
>>>
>>> https://github.com/apache/beam-starter-java/pull/1
>>>
>>>
>>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>>
>>>>> OK. Bringing an important update on licensing to this thread for
>>>>> consideration. Discussion on
>>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with
>>>>> key takeaways. These are things that were already true and people who are
>>>>> good at this stuff already may know, but I'm just going to say them again
>>>>> as I understand them:
>>>>>
>>>>>  - We can dual license MIT-0 and ASL2, which means "we" give "users"
>>>>> the permissions of both licenses - they can take their pick so they can
>>>>> treat it as MIT-0 licensed.
>>>>>  - BUT the copyright holders are the contributors to the project. They
>>>>> must agree that their contributions can be licensed like this. The ASF ICLA
>>>>> only agrees to ASL2 so we need to let them know. I suggest a
>>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>>> a checkbox*.
>>>>>  - If we want, we can include a README that explains this and tells
>>>>> users they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if
>>>>> they want to change it however they want.
>>>>>
>>>>> So I guess now the decision is whether all of the above is complicated
>>>>> enough for users that it outweighs the benefit. I'm not really sure.
>>>>>
>>>>
>>>> My (likely unsurprising) take is that this is worth it (though I also
>>>> agree with your asterisked footnote). A CONTRIBUTING.md and
>>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>>
>>>>
>>>>> *Exactly how formal we need to get here is a matter of some debate and
>>>>> risk tolerance. For these repos I think there is very little risk. One
>>>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>>>> but the bar in the US for i.p. is comically low so that's not a good
>>>>> argument to depend on.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Friendly ping on this :)
>>>>>>
>>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Can we create an empty file on each directory so I can fork the
>>>>>>> repo? It doesn't look like there is a workaround to cloning empty repos in
>>>>>>> GitHub. Then I can send a pull request.
>>>>>>>
>>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>>
>>>>>>>> I was trying to create a PR to merge the starter project contents,
>>>>>>>> but I can't fork the repo because it's empty. Can I either get permissions
>>>>>>>> to directly push or bother you with creating an empty README or some other
>>>>>>>> file so I can fork it and open a PR? Thanks!
>>>>>>>>
>>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and there's
>>>>>>>>> some step by step at https://infra.apache.org/licensing-howto.html
>>>>>>>>>
>>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>>
>>>>>>>>>     Apache Beam
>>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>>
>>>>>>>>>     This product includes software developed at
>>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I found this example NOTICE
>>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>>> file?
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Can someone point me to an example of what the NOTICE file should
>>>>>>>>>>> look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>> For the starter projects I like them being "clone and go", but
>>>>>>>>>>>> I'd like to keep them as minimal as possible. We could have another repo
>>>>>>>>>>>> like `beam-working-examples` for more complete examples where each
>>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>>> everything.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <
>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I like the goal: for things where the build has extra setup,
>>>>>>>>>>>>> have an example that is fully functional on its own. There is of course the
>>>>>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The other piece is that a user wanting to know some of these
>>>>>>>>>>>>> bits may be past the "clone and go" stage of their project. They probably
>>>>>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>>>>>> would defeat the purpose.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it
>>>>>>>>>>>>>>>> will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>>>>>>>> "least surprise"
>>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile
>>>>>>>>>>>>>>>> due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>>>>>> the user's code and I would guess they simply won't bother with it for files
>>>>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have
>>>>>>>>>>>>>>>> to include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and
>>>>>>>>>>>>>>>> a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and
>>>>>>>>>>>>>>>>>>>> may be able to help out.
>>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to
>>>>>>>>>>>>>>>>>>>>> keep it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have
>>>>>>>>>>>>>>>>>>>>>> any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber
>>>>>>>>>>>>>>>>>>>>>> any users of
>>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want
>>>>>>>>>>>>>>>>>>>>>> these to be pretty
>>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's
>>>>>>>>>>>>>>>>>>>>>> not what the Apache
>>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument
>>>>>>>>>>>>>>>>>>>>>> could likely be
>>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best
>>>>>>>>>>>>>>>>>>>>>> to be explicit
>>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>>>> Apache Legal?)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the
>>>>>>>>>>>>>>>>>>>>>> most pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is
>>>>>>>>>>>>>>>>>>>>>> a big part of it.
>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what
>>>>>>>>>>>>>>>>>>>>>> one would put in a Python repo other than a bare setup.py that lists
>>>>>>>>>>>>>>>>>>>>>> a dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to
>>>>>>>>>>>>>>>>>>>>>> be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because
>>>>>>>>>>>>>>>>>>>>>> one doesn't want to
>>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users
>>>>>>>>>>>>>>>>>>>>>> to be told to checkout this git repo for the language of your choice and
>>>>>>>>>>>>>>>>>>>>>> run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw
>>>>>>>>>>>>>>>>>>>>>> <ro...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one
>>>>>>>>>>>>>>>>>>>>>> would put in a Python repo
>>>>>>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on
>>>>>>>>>>>>>>>>>>>>>> file layout, etc. more
>>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and
>>>>>>>>>>>>>>>>>>>>>> your package.json file
>>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within
>>>>>>>>>>>>>>>>>>>>>> the Beam repo found in:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes.
>>>>>>>>>>>>>>>>>>>>>> Less maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti
>>>>>>>>>>>>>>>>>>>>>> would prefer having repos for all languages. It makes sense for consistency
>>>>>>>>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik
>>>>>>>>>>>>>>>>>>>>>> <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people
>>>>>>>>>>>>>>>>>>>>>> can pull out a specific version of the examples that coincides with a
>>>>>>>>>>>>>>>>>>>>>> specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we
>>>>>>>>>>>>>>>>>>>>>> drop the archetype?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke
>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating
>>>>>>>>>>>>>>>>>>>>>> the Beam version and other dependencies automatically. Testing is already
>>>>>>>>>>>>>>>>>>>>>> set up via GitHub actions for every pull request, so it would automatically
>>>>>>>>>>>>>>>>>>>>>> be tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke
>>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main
>>>>>>>>>>>>>>>>>>>>>> repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM
>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype,
>>>>>>>>>>>>>>>>>>>>>> but that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we
>>>>>>>>>>>>>>>>>>>>>> only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM
>>>>>>>>>>>>>>>>>>>>>> Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache
>>>>>>>>>>>>>>>>>>>>>> repo, would we put this somewhere like apache/beam-java-template? I think
>>>>>>>>>>>>>>>>>>>>>> apache repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM
>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM
>>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a
>>>>>>>>>>>>>>>>>>>>>> new Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the
>>>>>>>>>>>>>>>>>>>>>> template contains:
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt,
>>>>>>>>>>>>>>>>>>>>>> and Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how
>>>>>>>>>>>>>>>>>>>>>> to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new
>>>>>>>>>>>>>>>>>>>>>> GitHub repo from a template.
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Thanks for taking a look. We actually considered supporting more runners in
the starter projects, but the complexity and maintenance burden of setting up
and supporting multiple runners in the testing infrastructure was quite high.
We didn't want to *only* support the Dataflow runner either, so we simply
linked to the runners documentation from the README. It could be nice to
support that at some point, but I think a better solution is to improve the
documentation on the runners page.
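
For readers following the thread, here is a rough sketch of what adding a
second runner to a Gradle build usually involves. It is an illustration under
stated assumptions, not the contents of the template: the artifact coordinates
are the published Beam ones, while the `beamVersion` property and the
`runtimeOnly` scope are choices made only for this example.

```groovy
// build.gradle (fragment). `beamVersion` is a hypothetical property,
// e.g. defined in gradle.properties; it is not taken from the template.
dependencies {
    implementation "org.apache.beam:beam-sdks-java-core:${beamVersion}"

    // Direct runner: executes the pipeline locally.
    runtimeOnly "org.apache.beam:beam-runners-direct-java:${beamVersion}"

    // Dataflow runner: only needed when targeting Dataflow.
    runtimeOnly "org.apache.beam:beam-runners-google-cloud-dataflow-java:${beamVersion}"
}
```

With the runner on the classpath, a pipeline that builds its options from the
command-line arguments (for example via PipelineOptionsFactory.fromArgs) can
typically be pointed at it with --runner=DataflowRunner plus the
Dataflow-specific options such as --project, --region, and --tempLocation. The
runners documentation linked from the README remains the authoritative
reference.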

On Thu, Apr 7, 2022 at 5:21 AM Danny McCormick <da...@google.com>
wrote:

> I'm not a Java expert so I can't do a thorough review (and I definitely
> can't help on the legal end), but I tried using the template for a personal
> toy project 2 weeks ago and found it really helpful (this was my first time
> writing a Java pipeline, previously I'd written everything in Go). Thanks
> for putting it together David!
>
> My only substantial feedback is that it was tricky to move from the Direct
> runner to a different runner (in my case I was targeting Dataflow) - it
> might be helpful to have instructions on doing that linked from the Readme
> since I imagine starting on Direct then moving to a different runner is a
> pretty common path; I don't think that should block getting this initial
> version in though, just a future improvement suggestion :)
>
> Thanks,
> Danny
>
> On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com> wrote:
>
>> I've added the dual license along with the CONTRIBUTING.md and
>> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>>
>> Please review the PR since the Python and Go starter projects are blocked
>> until this one merges (so we get all the legal files right).
>>
>> https://github.com/apache/beam-starter-java/pull/1
>>
>>
>> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> OK. Bringing an important update on licensing to this thread for
>>>> consideration. Discussion on
>>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with key
>>>> takeaways. These are things that were already true and people who are good
>>>> at this stuff already may know, but I'm just going to say them again as I
>>>> understand them:
>>>>
>>>>  - We can dual license MIT-0 and ASL2, which means "we" give "users"
>>>> the permissions of both licenses - they can take their pick so they can
>>>> treat it as MIT-0 licensed.
>>>>  - BUT the copyright holders are the contributors to the project. They
>>>> must agree that their contributions can be licensed like this. The ASF ICLA
>>>> only agrees to ASL2 so we need to let them know. I suggest a
>>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>>> a checkbox*.
>>>>  - If we want, we can include a README that explains this and tells
>>>> users they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if
>>>> they want to change it however they want.
>>>>
>>>> So I guess now the decision is whether all of the above is complicated
>>>> enough for users that it outweighs the benefit. I'm not really sure.
>>>>
>>>
>>> My (likely unsurprising) take is that this is worth it (though I also
>>> agree with your asterisked footnote). A CONTRIBUTING.md and
>>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>>
>>>
>>>> *Exactly how formal we need to get here is a matter of some debate and
>>>> risk tolerance. For these repos I think there is very little risk. One
>>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>>> but the bar in the US for i.p. is comically low so that's not a good
>>>> argument to depend on.
>>>>
>>>> Kenn
>>>>
>>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> Friendly ping on this :)
>>>>>
>>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Can we create an empty file on each directory so I can fork the repo?
>>>>>> It doesn't look like there is a workaround to cloning empty repos in
>>>>>> GitHub. Then I can send a pull request.
>>>>>>
>>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>>
>>>>>>> I was trying to create a PR to merge the starter project contents,
>>>>>>> but I can't fork the repo because it's empty. Can I either get permissions
>>>>>>> to directly push or bother you with creating an empty README or some other
>>>>>>> file so I can fork it and open a PR? Thanks!
>>>>>>>
>>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I always get mixed up myself. The policies are at
>>>>>>>> https://www.apache.org/legal/src-headers.html#notice and there's
>>>>>>>> some step by step at https://infra.apache.org/licensing-howto.html
>>>>>>>>
>>>>>>>> TL;DR the contents should be like so:
>>>>>>>>
>>>>>>>>     Apache Beam
>>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>>
>>>>>>>>     This product includes software developed at
>>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I found this example NOTICE
>>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>>> file?
>>>>>>>>>
>>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Can someone point me to an example of what the NOTICE file should
>>>>>>>>>> look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>> For the starter projects I like them being "clone and go", but
>>>>>>>>>>> I'd like to keep them as minimal as possible. We could have another repo
>>>>>>>>>>> like `beam-working-examples` for more complete examples where each
>>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>>> everything.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I like the goal: for things where the build has extra setup,
>>>>>>>>>>>> have an example that is fully functional on its own. There is of course the
>>>>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>>
>>>>>>>>>>>> The other piece is that a user wanting to know some of these
>>>>>>>>>>>> bits may be past the "clone and go" stage of their project. They probably
>>>>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>>>>> would defeat the purpose.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is great!
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>>> 'hello' world samples to get folks going.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Reza
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it
>>>>>>>>>>>>>>> will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>>>>>>> "least surprise"
>>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile
>>>>>>>>>>>>>>> due to its impact on contributor license agreements)
>>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>>>>> the user's code and I would guess they simply won't bother with for files
>>>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and
>>>>>>>>>>>>>>> a simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and
>>>>>>>>>>>>>>>>>>> may be able to help out.
>>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to
>>>>>>>>>>>>>>>>>>>> keep it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have
>>>>>>>>>>>>>>>>>>>>> any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber
>>>>>>>>>>>>>>>>>>>>> any users of
>>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want
>>>>>>>>>>>>>>>>>>>>> these to be pretty
>>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's not
>>>>>>>>>>>>>>>>>>>>> what the Apache
>>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument
>>>>>>>>>>>>>>>>>>>>> could likely be
>>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best
>>>>>>>>>>>>>>>>>>>>> to be explicit
>>>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>>> apache legal?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the
>>>>>>>>>>>>>>>>>>>>> most pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is
>>>>>>>>>>>>>>>>>>>>> a big part of it.
>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what
>>>>>>>>>>>>>>>>>>>>> one would put in a Python repo than, other than a bare setup.py that lists
>>>>>>>>>>>>>>>>>>>>> a dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to
>>>>>>>>>>>>>>>>>>>>> be part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because
>>>>>>>>>>>>>>>>>>>>> one doesn't want to
>>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the
>>>>>>>>>>>>>>>>>>>>> template itself should
>>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users
>>>>>>>>>>>>>>>>>>>>> to be told to checkout this git repo for the language of your choice and
>>>>>>>>>>>>>>>>>>>>> run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would
>>>>>>>>>>>>>>>>>>>>> put in a Python repo
>>>>>>>>>>>>>>>>>>>>> >> >>> than, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on
>>>>>>>>>>>>>>>>>>>>> file layout, etc. more
>>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and
>>>>>>>>>>>>>>>>>>>>> your package.json file
>>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within the
>>>>>>>>>>>>>>>>>>>>> Beam repo found in:
>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than
>>>>>>>>>>>>>>>>>>>>> Wordcount just for novelty/freshness but agreed with the suggestion that
>>>>>>>>>>>>>>>>>>>>> having an example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the
>>>>>>>>>>>>>>>>>>>>> Wordcount example in each repo? I know that makes the repos less minimal,
>>>>>>>>>>>>>>>>>>>>> but we could rewrite the quickstarts around these repos instead of the
>>>>>>>>>>>>>>>>>>>>> current Wordcount examples. Or maybe we don't need to use the Wordcount
>>>>>>>>>>>>>>>>>>>>> example in the quickstarts...
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti
>>>>>>>>>>>>>>>>>>>>> would prefer having repos for all languages. It makes sense for consistency
>>>>>>>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people
>>>>>>>>>>>>>>>>>>>>> can pull out a specific version of the examples that coincides with a
>>>>>>>>>>>>>>>>>>>>> specific SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop
>>>>>>>>>>>>>>>>>>>>> the archetype?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik
>>>>>>>>>>>>>>>>>>>>> <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>>> separate repo since we can see what a truly minimal project looks like.
>>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating
>>>>>>>>>>>>>>>>>>>>> the Beam version and other dependencies automatically. Testing is already
>>>>>>>>>>>>>>>>>>>>> set up via GitHub actions for every pull request, so it would automatically
>>>>>>>>>>>>>>>>>>>>> be tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke
>>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main
>>>>>>>>>>>>>>>>>>>>> repo, or a single starter repo containing all the starters or one per
>>>>>>>>>>>>>>>>>>>>> language or one per build system?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but
>>>>>>>>>>>>>>>>>>>>> that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we
>>>>>>>>>>>>>>>>>>>>> only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM
>>>>>>>>>>>>>>>>>>>>> Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM
>>>>>>>>>>>>>>>>>>>>> Brian Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo,
>>>>>>>>>>>>>>>>>>>>> would we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM
>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include
>>>>>>>>>>>>>>>>>>>>> anyone else!
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM
>>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a
>>>>>>>>>>>>>>>>>>>>> new Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and
>>>>>>>>>>>>>>>>>>>>> Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via
>>>>>>>>>>>>>>>>>>>>> GitHub actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how
>>>>>>>>>>>>>>>>>>>>> to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub
>>>>>>>>>>>>>>>>>>>>> repo from a template.
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my
>>>>>>>>>>>>>>>>>>>>> personal GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Danny McCormick <da...@google.com>.
I'm not a Java expert so I can't do a thorough review (and I definitely
can't help on the legal end), but I tried using the template for a personal
toy project 2 weeks ago and found it really helpful (this was my first time
writing a Java pipeline; previously I'd written everything in Go). Thanks
for putting it together, David!

My only substantial feedback is that it was tricky to move from the Direct
runner to a different runner (in my case I was targeting Dataflow). It
might be helpful to have instructions on doing that linked from the README,
since I imagine starting on Direct and then moving to a different runner is
a pretty common path. I don't think that should block getting this initial
version in though, just a future improvement suggestion :)
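
For reference, here is a rough sketch of the build-side change that moving
from the Direct runner to Dataflow usually involves, written as a Gradle
Kotlin DSL fragment. It is an illustration under assumptions, not the
template's actual build file: the file name build.gradle.kts, the
beamVersion value, and the runtimeOnly scope are all hypothetical choices.
The runner artifact names are the ones Beam publishes to Maven Central.

```kotlin
// build.gradle.kts (hypothetical fragment; the template's real build file may differ)
plugins {
    java
}

repositories {
    mavenCentral()
}

// Assumption: substitute whichever Beam release your project targets.
val beamVersion = "2.38.0"

dependencies {
    implementation("org.apache.beam:beam-sdks-java-core:$beamVersion")

    // Direct runner: local runs and unit tests.
    runtimeOnly("org.apache.beam:beam-runners-direct-java:$beamVersion")

    // Dataflow runner: must be on the classpath before the pipeline
    // can be launched with --runner=DataflowRunner.
    runtimeOnly("org.apache.beam:beam-runners-google-cloud-dataflow-java:$beamVersion")
}
```

With the Dataflow runner on the classpath, and assuming the pipeline builds
its options from the command-line arguments, the runner is selected through
pipeline options, typically --runner=DataflowRunner together with --project,
--region, and --tempLocation. The values for those depend on your own Google
Cloud setup.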

Thanks,
Danny

On Wed, Apr 6, 2022 at 6:19 PM David Cavazos <dc...@google.com> wrote:

> I've added the dual license along with the CONTRIBUTING.md and
> PULL_REQUEST_TEMPLATE.md. The sample is ready for review.
>
> Please review the PR since the Python and Go starter projects are blocked
> until this one merges (so we get all the legal files right).
>
> https://github.com/apache/beam-starter-java/pull/1
>
>
> On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> OK. Bringing an important update on licensing to this thread for
>>> consideration. Discussion on
>>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with key
>>> takeaways. These are things that were already true and people who are good
>>> at this stuff already may know, but I'm just going to say them again as I
>>> understand them:
>>>
>>>  - We can dual license MIT-0 and ASL2, which means "we" gives "users"
>>> the permissions of both licenses - they can take their pick so they can
>>> treat it as MIT-0 licensed.
>>>  - BUT the copyright holders are the contributors to the project. They
>>> must agree that their contributions can be licensed like this. The ASF ICLA
>>> only agrees to ASL2 so we need to let them know. I suggest a
>>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>>> a checkbox*.
>>>  - If we want, we can include a README that explains this and tells
>>> users they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if
>>> they want to change it however they want.
>>>
>>> So I guess now the decision is whether all of the above is complicated
>>> enough for users that it outweighs the benefit. I'm not really sure.
>>>
>>
>> My (likely unsurprising) take is that this is worth it (though I also
>> agree with your asterisked footnote). A CONTRIBUTING.md and
>> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>>
>>
>>> *Exactly how formal we need to get here is a matter of some debate and
>>> risk tolerance. For these repos I think there is very little risk. One
>>> could even argue the contents are so unoriginal as to be uncopyrightable,
>>> but the bar in the US for i.p. is comically low so that's not a good
>>> argument to depend on.
>>>
>>> Kenn
>>>
>>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Friendly ping on this :)
>>>>
>>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> Can we create an empty file on each directory so I can fork the repo?
>>>>> It doesn't look like there is a workaround to cloning empty repos in
>>>>> GitHub. Then I can send a pull request.
>>>>>
>>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>>
>>>>>> I was trying to create a PR to merge the starter project contents,
>>>>>> but I can't fork the repo because it's empty. Can I either get permissions
>>>>>> to directly push or bother you with creating an empty README or some other
>>>>>> file so I can fork it and open a PR? Thanks!
>>>>>>
>>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I always get mixed up myself. The policies are at
>>>>>>> https://www.apache.org/legal/src-headers.html#notice and there's
>>>>>>> some step by step at https://infra.apache.org/licensing-howto.html
>>>>>>>
>>>>>>> TL;DR the contents should be like so:
>>>>>>>
>>>>>>>     Apache Beam
>>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>>
>>>>>>>     This product includes software developed at
>>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I found this example NOTICE
>>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>>> file?
>>>>>>>>
>>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Can someone point me to an example of what the NOTICE file should
>>>>>>>>> look like? I'm not familiar with it and would like to get it right.
>>>>>>>>>
>>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>> For the starter projects I like them being "clone and go", but
>>>>>>>>>> I'd like to keep them as minimal as possible. We could have another repo
>>>>>>>>>> like `beam-working-examples` for more complete examples where each
>>>>>>>>>> subdirectory is a self-contained example with all its build files and
>>>>>>>>>> everything.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I like the goal: for things where the build has extra setup,
>>>>>>>>>>> have an example that is fully functional on its own. There is of course the
>>>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>>>
>>>>>>>>>>> The other piece is that a user wanting to know some of these
>>>>>>>>>>> bits may be past the "clone and go" stage of their project. They probably
>>>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>>>> would defeat the purpose.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> This is great!
>>>>>>>>>>>>
>>>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>>> 'hello' world samples to get folks going.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Reza
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it
>>>>>>>>>>>>>> will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>>>>>> "least surprise"
>>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due
>>>>>>>>>>>>>> to its impact on contributor license agreements)
>>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>>>> the user's code and I would guess they simply won't bother with for files
>>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup
>>>>>>>>>>>>>>>>> (and/or with the Go template). I will say though, the Actions config should
>>>>>>>>>>>>>>>>> be pretty darn simple for these examples -
>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and
>>>>>>>>>>>>>>>>>> may be able to help out.
>>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to
>>>>>>>>>>>>>>>>>>> keep it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have
>>>>>>>>>>>>>>>>>>>> any problems changing it to Apache license. In any case, how about we
>>>>>>>>>>>>>>>>>>>> create the following repos?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber
>>>>>>>>>>>>>>>>>>>> any users of
>>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want these
>>>>>>>>>>>>>>>>>>>> to be pretty
>>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's not
>>>>>>>>>>>>>>>>>>>> what the Apache
>>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument
>>>>>>>>>>>>>>>>>>>> could likely be
>>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to
>>>>>>>>>>>>>>>>>>>> be explicit
>>>>>>>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for
>>>>>>>>>>>>>>>>>>>> apache legal?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the
>>>>>>>>>>>>>>>>>>>> most pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a
>>>>>>>>>>>>>>>>>>>> big part of it.
>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what
>>>>>>>>>>>>>>>>>>>> one would put in a Python repo than, other than a bare setup.py that lists
>>>>>>>>>>>>>>>>>>>> a dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be
>>>>>>>>>>>>>>>>>>>> part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because
>>>>>>>>>>>>>>>>>>>> one doesn't want to
>>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users
>>>>>>>>>>>>>>>>>>>> to be told to checkout this git repo for the language of your choice and
>>>>>>>>>>>>>>>>>>>> run. Some repos will have more/less than others when it comes to setup
>>>>>>>>>>>>>>>>>>>> necessary.
>>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would
>>>>>>>>>>>>>>>>>>>> put in a Python repo
>>>>>>>>>>>>>>>>>>>> >> >>> than, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on
>>>>>>>>>>>>>>>>>>>> file layout, etc. more
>>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and
>>>>>>>>>>>>>>>>>>>> your package.json file
>>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within the
>>>>>>>>>>>>>>>>>>>> Beam repo found in:
>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin
>>>>>>>>>>>>>>>>>>>> Agarwal <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount
>>>>>>>>>>>>>>>>>>>> just for novelty/freshness but agreed with the suggestion that having an
>>>>>>>>>>>>>>>>>>>> example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David
>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti
>>>>>>>>>>>>>>>>>>>> would prefer having repos for all languages. It makes sense for consistency
>>>>>>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can
>>>>>>>>>>>>>>>>>>>> pull out a specific version of the examples that coincides with a specific
>>>>>>>>>>>>>>>>>>>> SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop
>>>>>>>>>>>>>>>>>>>> the archetype?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a
>>>>>>>>>>>>>>>>>>>> separate repo since we can see how a true minimal project looks like.
>>>>>>>>>>>>>>>>>>>> Having it in the main repo would inherit build file configurations and
>>>>>>>>>>>>>>>>>>>> other settings that would be different from a clean project, so it could be
>>>>>>>>>>>>>>>>>>>> non-trivial to adapt. Also as its own repo, it's easier to clone and
>>>>>>>>>>>>>>>>>>>> modify, or create an instance of the template.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating
>>>>>>>>>>>>>>>>>>>> the Beam version and other dependencies automatically. Testing is already
>>>>>>>>>>>>>>>>>>>> set up via GitHub actions for every pull request, so it would automatically
>>>>>>>>>>>>>>>>>>>> be tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>>> are less repos to maintain.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke
>>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo,
>>>>>>>>>>>>>>>>>>>> or a single starter repo containing all the starters or one per language or
>>>>>>>>>>>>>>>>>>>> one per build system?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but
>>>>>>>>>>>>>>>>>>>> that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we
>>>>>>>>>>>>>>>>>>>> only need to pin down the name of the repo, create it, and move the code
>>>>>>>>>>>>>>>>>>>> there. I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet
>>>>>>>>>>>>>>>>>>>> Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in
>>>>>>>>>>>>>>>>>>>> apache/beam already, built with Maven Archetype [1]. It's what powers the
>>>>>>>>>>>>>>>>>>>> Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub
>>>>>>>>>>>>>>>>>>>> template in the quickstart, or co-locate the archetype with the GitHub
>>>>>>>>>>>>>>>>>>>> template)?
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo,
>>>>>>>>>>>>>>>>>>>> would we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM
>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone
>>>>>>>>>>>>>>>>>>>> else!
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM
>>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new
>>>>>>>>>>>>>>>>>>>> Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam
>>>>>>>>>>>>>>>>>>>> pipeline
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and
>>>>>>>>>>>>>>>>>>>> Maven (Direct runner)
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how
>>>>>>>>>>>>>>>>>>>> to build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub
>>>>>>>>>>>>>>>>>>>> repo from a template.
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
I've added the dual license along with the CONTRIBUTING.md and
PULL_REQUEST_TEMPLATE.md. The sample is ready for review.

Please review the PR since the Python and Go starter projects are blocked
until this one merges (so we get all the legal files right).

https://github.com/apache/beam-starter-java/pull/1


On Mon, Mar 7, 2022 at 10:54 AM Robert Bradshaw <ro...@google.com> wrote:

> On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> OK. Bringing an important update on licensing to this thread for
>> consideration. Discussion on
>> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with key
>> takeaways. These are things that were already true and people who are good
>> at this stuff already may know, but I'm just going to say them again as I
>> understand them:
>>
>>  - We can dual license MIT-0 and ASL2, which means "we" gives "users" the
>> permissions of both licenses - they can take their pick so they can treat
>> it as MIT-0 licensed.
>>  - BUT the copyright holders are the contributors to the project. They
>> must agree that their contributions can be licensed like this. The ASF ICLA
>> only agrees to ASL2 so we need to let them know. I suggest a
>> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
>> a checkbox*.
>>  - If we want, we can include a README that explains this and tells users
>> they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if they
>> want to change it however they want.
>>
>> So I guess now the decision is whether all of the above is complicated
>> enough for users that it outweighs the benefit. I'm not really sure.
>>
>
> My (likely unsurprising) take is that this is worth it (though I also
> agree with your asterisked footnote). A CONTRIBUTING.md and
> PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.
>
>
>> *Exactly how formal we need to get here is a matter of some debate and
>> risk tolerance. For these repos I think there is very little risk. One
>> could even argue the contents are so unoriginal as to be uncopyrightable,
>> but the bar in the US for i.p. is comically low so that's not a good
>> argument to depend on.
>>
>> Kenn
>>
>> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> Friendly ping on this :)
>>>
>>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Can we create an empty file on each directory so I can fork the repo?
>>>> It doesn't look like there is a workaround to cloning empty repos in
>>>> GitHub. Then I can send a pull request.
>>>>
>>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>>
>>>>> I was trying to create a PR to merge the starter project contents, but
>>>>> I can't fork the repo because it's empty. Can I either get permissions to
>>>>> directly push or bother you with creating an empty README or some other
>>>>> file so I can fork it and open a PR? Thanks!
>>>>>
>>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I always get mixed up myself. The policies are at
>>>>>> https://www.apache.org/legal/src-headers.html#notice and there's
>>>>>> some step by step at https://infra.apache.org/licensing-howto.html
>>>>>>
>>>>>> TL;DR the contents should be like so:
>>>>>>
>>>>>>     Apache Beam
>>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>>
>>>>>>     This product includes software developed at
>>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I found this example NOTICE
>>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice>
>>>>>>> file, but it doesn't look like it does what we want. It looks like it has
>>>>>>> to be written in a formal legal language and I don't feel comfortable
>>>>>>> writing it. Can I ask for help on writing out the contents of the NOTICE
>>>>>>> file?
>>>>>>>
>>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Can someone point me to an example of what the NOTICE file should
>>>>>>>> look like? I'm not familiar with it and would like to get it right.
>>>>>>>>
>>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>> For the starter projects I like them being "clone and go", but I'd
>>>>>>>>> like to keep them as minimal as possible. We could have another repo like
>>>>>>>>> `beam-working-examples` for more complete examples where each subdirectory
>>>>>>>>> is a self-contained example with all its build files and everything.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I like the goal: for things where the build has extra setup, have
>>>>>>>>>> an example that is fully functional on its own. There is of course the
>>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>>
>>>>>>>>>> The other piece is that a user wanting to know some of these bits
>>>>>>>>>> may be past the "clone and go" stage of their project. They probably
>>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>>> would defeat the purpose.
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> This is great!
>>>>>>>>>>>
>>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Reza
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> SGTM.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it
>>>>>>>>>>>>> will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>>
>>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>>>>> "least surprise"
>>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due
>>>>>>>>>>>>> to its impact on contributor license agreements)
>>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>>> the user's code and I would guess they simply won't bother with for files
>>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or
>>>>>>>>>>>>>>>> with the Go template). I will say though, the Actions config should be
>>>>>>>>>>>>>>>> pretty darn simple for these examples -
>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Always happy to help with or consult on any actions
>>>>>>>>>>>>>>>> issues 🙂
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and
>>>>>>>>>>>>>>>>> may be able to help out.
>>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to
>>>>>>>>>>>>>>>>>> keep it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber
>>>>>>>>>>>>>>>>>>> any users of
>>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want these
>>>>>>>>>>>>>>>>>>> to be pretty
>>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's not
>>>>>>>>>>>>>>>>>>> what the Apache
>>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument
>>>>>>>>>>>>>>>>>>> could likely be
>>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to
>>>>>>>>>>>>>>>>>>> be explicit
>>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for apache
>>>>>>>>>>>>>>>>>>> legal?)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the
>>>>>>>>>>>>>>>>>>> most pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a
>>>>>>>>>>>>>>>>>>> big part of it.
>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what
>>>>>>>>>>>>>>>>>>> one would put in a Python repo then, other than a bare setup.py that lists
>>>>>>>>>>>>>>>>>>> a dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be
>>>>>>>>>>>>>>>>>>> part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because
>>>>>>>>>>>>>>>>>>> one doesn't want to
>>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users to
>>>>>>>>>>>>>>>>>>> be told to checkout this git repo for the language of your choice and run.
>>>>>>>>>>>>>>>>>>> Some repos will have more/less than others when it comes to setup necessary.
>>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would
>>>>>>>>>>>>>>>>>>> put in a Python repo
>>>>>>>>>>>>>>>>>>> >> >>> then, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on
>>>>>>>>>>>>>>>>>>> file layout, etc. more
>>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic
>>>>>>>>>>>>>>>>>>> advice to be found out
>>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is
>>>>>>>>>>>>>>>>>>> similar, and javascript
>>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and
>>>>>>>>>>>>>>>>>>> your package.json file
>>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within the
>>>>>>>>>>>>>>>>>>> Beam repo found in:
>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal
>>>>>>>>>>>>>>>>>>> <sa...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount
>>>>>>>>>>>>>>>>>>> just for novelty/freshness but agreed with the suggestion that having an
>>>>>>>>>>>>>>>>>>> example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos
>>>>>>>>>>>>>>>>>>> <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti
>>>>>>>>>>>>>>>>>>> would prefer having repos for all languages. It makes sense for consistency
>>>>>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can
>>>>>>>>>>>>>>>>>>> pull out a specific version of the examples that coincides with a specific
>>>>>>>>>>>>>>>>>>> SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop
>>>>>>>>>>>>>>>>>>> the archetype?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate
>>>>>>>>>>>>>>>>>>> repo since we can see what a truly minimal project looks like. Having it in
>>>>>>>>>>>>>>>>>>> the main repo would inherit build file configurations and other settings
>>>>>>>>>>>>>>>>>>> that would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the
>>>>>>>>>>>>>>>>>>> Beam version and other dependencies automatically. Testing is already set
>>>>>>>>>>>>>>>>>>> up via GitHub actions for every pull request, so it would automatically be
>>>>>>>>>>>>>>>>>>> tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke
>>>>>>>>>>>>>>>>>>> Cwik <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo,
>>>>>>>>>>>>>>>>>>> or a single starter repo containing all the starters or one per language or
>>>>>>>>>>>>>>>>>>> one per build system?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but
>>>>>>>>>>>>>>>>>>> that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only
>>>>>>>>>>>>>>>>>>> need to pin down the name of the repo, create it, and move the code there.
>>>>>>>>>>>>>>>>>>> I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet
>>>>>>>>>>>>>>>>>>> Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo,
>>>>>>>>>>>>>>>>>>> would we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM
>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone
>>>>>>>>>>>>>>>>>>> else!
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM
>>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new
>>>>>>>>>>>>>>>>>>> Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and
>>>>>>>>>>>>>>>>>>> Maven (Direct runner)
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to
>>>>>>>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub
>>>>>>>>>>>>>>>>>>> repo from a template.
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Robert Bradshaw <ro...@google.com>.
On Mon, Mar 7, 2022 at 8:13 AM Kenneth Knowles <ke...@apache.org> wrote:

> OK. Bringing an important update on licensing to this thread for
> consideration. Discussion on
> https://issues.apache.org/jira/browse/LEGAL-601 has concluded with key
> takeaways. These are things that were already true and people who are good
> at this stuff already may know, but I'm just going to say them again as I
> understand them:
>
>  - We can dual license MIT-0 and ASL2, which means "we" give "users" the
> permissions of both licenses - they can take their pick so they can treat
> it as MIT-0 licensed.
>  - BUT the copyright holders are the contributors to the project. They
> must agree that their contributions can be licensed like this. The ASF ICLA
> only agrees to ASL2 so we need to let them know. I suggest a
> CONTRIBUTING.md that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with
> a checkbox*.
>  - If we want, we can include a README that explains this and tells users
> they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if they
> want to change it however they want.
>
> So I guess now the decision is whether all of the above is complicated
> enough for users that it outweighs the benefit. I'm not really sure.
>

My (likely unsurprising) take is that this is worth it (though I also agree
with your asterisked footnote). A CONTRIBUTING.md and
PULL_REQUEST_TEMPLATE.md as suggested seem reasonable.


> *Exactly how formal we need to get here is a matter of some debate and
> risk tolerance. For these repos I think there is very little risk. One
> could even argue the contents are so unoriginal as to be uncopyrightable,
> but the bar in the US for i.p. is comically low so that's not a good
> argument to depend on.
>
> Kenn
>
> On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com> wrote:
>
>> Friendly ping on this :)
>>
>> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> Can we create an empty file on each directory so I can fork the repo? It
>>> doesn't look like there is a workaround to cloning empty repos in GitHub.
>>> Then I can send a pull request.
>>>
>>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>>
>>>> I was trying to create a PR to merge the starter project contents, but
>>>> I can't fork the repo because it's empty. Can I either get permissions to
>>>> directly push or bother you with creating an empty README or some other
>>>> file so I can fork it and open a PR? Thanks!
>>>>
>>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>>
>>>>> I always get mixed up myself. The policies are at
>>>>> https://www.apache.org/legal/src-headers.html#notice and there's some
>>>>> step by step at https://infra.apache.org/licensing-howto.html
>>>>>
>>>>> TL;DR the contents should be like so:
>>>>>
>>>>>     Apache Beam
>>>>>     Copyright [2022-] The Apache Software Foundation
>>>>>
>>>>>     This product includes software developed at
>>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> I found this example NOTICE
>>>>>> <https://infra.apache.org/licensing-howto.html#example-notice> file,
>>>>>> but it doesn't look like it does what we want. It looks like it has to be
>>>>>> written in a formal legal language and I don't feel comfortable writing it.
>>>>>> Can I ask for help on writing out the contents of the NOTICE file?
>>>>>>
>>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Can someone point me to an example of what the NOTICE file should
>>>>>>> look like? I'm not familiar with it and would like to get it right.
>>>>>>>
>>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1
>>>>>>>> For the starter projects I like them being "clone and go", but I'd
>>>>>>>> like to keep them as minimal as possible. We could have another repo like
>>>>>>>> `beam-working-examples` for more complete examples where each subdirectory
>>>>>>>> is a self-contained example with all its build files and everything.
>>>>>>>>
>>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I like the goal: for things where the build has extra setup, have
>>>>>>>>> an example that is fully functional on its own. There is of course the
>>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>>
>>>>>>>>> The other piece is that a user wanting to know some of these bits
>>>>>>>>> may be past the "clone and go" stage of their project. They probably
>>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>>> would defeat the purpose.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This is great!
>>>>>>>>>>
>>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>>> 'hello' world samples to get folks going.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Reza
>>>>>>>>>>
>>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> SGTM.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Based on discussion on
>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it
>>>>>>>>>>>> will be simplest to license it under ASL2 and include a NOTICE file. The
>>>>>>>>>>>> user will be free to "clone and go".
>>>>>>>>>>>>
>>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>>
>>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>>>> "least surprise"
>>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due
>>>>>>>>>>>> to its impact on contributor license agreements)
>>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>>> the user's code and I would guess they simply won't bother with it for files
>>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>>> is already good to go.
>>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>>
>>>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <
>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or
>>>>>>>>>>>>>>> with the Go template). I will say though, the Actions config should be
>>>>>>>>>>>>>>> pretty darn simple for these examples -
>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and may
>>>>>>>>>>>>>>>> be able to help out.
>>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to
>>>>>>>>>>>>>>>>> keep it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber any
>>>>>>>>>>>>>>>>>> users of
>>>>>>>>>>>>>>>>>> these templates with any particular licensing
>>>>>>>>>>>>>>>>>> requirements (right?)
>>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want these
>>>>>>>>>>>>>>>>>> to be pretty
>>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's not
>>>>>>>>>>>>>>>>>> what the Apache
>>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument
>>>>>>>>>>>>>>>>>> could likely be
>>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to
>>>>>>>>>>>>>>>>>> be explicit
>>>>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for Apache
>>>>>>>>>>>>>>>>>> legal?)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the
>>>>>>>>>>>>>>>>>> most pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a
>>>>>>>>>>>>>>>>>> big part of it.
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what one
>>>>>>>>>>>>>>>>>> would put in a Python repo than, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be
>>>>>>>>>>>>>>>>>> part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a
>>>>>>>>>>>>>>>>>> derivative work of a
>>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users to
>>>>>>>>>>>>>>>>>> be told to checkout this git repo for the language of your choice and run.
>>>>>>>>>>>>>>>>>> Some repos will have more/less than others when it comes to setup necessary.
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a
>>>>>>>>>>>>>>>>>> project there is quite
>>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would
>>>>>>>>>>>>>>>>>> put in a Python repo
>>>>>>>>>>>>>>>>>> >> >>> than, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice
>>>>>>>>>>>>>>>>>> to be found out
>>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is similar,
>>>>>>>>>>>>>>>>>> and javascript
>>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>>>>>>>> package.json file
>>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within the
>>>>>>>>>>>>>>>>>> Beam repo found in:
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount
>>>>>>>>>>>>>>>>>> just for novelty/freshness but agreed with the suggestion that having an
>>>>>>>>>>>>>>>>>> example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti
>>>>>>>>>>>>>>>>>> would prefer having repos for all languages. It makes sense for consistency
>>>>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can
>>>>>>>>>>>>>>>>>> pull out a specific version of the examples that coincides with a specific
>>>>>>>>>>>>>>>>>> SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian
>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop
>>>>>>>>>>>>>>>>>> the archetype?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate
>>>>>>>>>>>>>>>>>> repo since we can see what a truly minimal project looks like. Having it in
>>>>>>>>>>>>>>>>>> the main repo would inherit build file configurations and other settings
>>>>>>>>>>>>>>>>>> that would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the
>>>>>>>>>>>>>>>>>> Beam version and other dependencies automatically. Testing is already set
>>>>>>>>>>>>>>>>>> up via GitHub actions for every pull request, so it would automatically be
>>>>>>>>>>>>>>>>>> tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them. As
>>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik
>>>>>>>>>>>>>>>>>> <lc...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo,
>>>>>>>>>>>>>>>>>> or a single starter repo containing all the starters or one per language or
>>>>>>>>>>>>>>>>>> one per build system?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter
>>>>>>>>>>>>>>>>>> happen?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but
>>>>>>>>>>>>>>>>>> that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only
>>>>>>>>>>>>>>>>>> need to pin down the name of the repo, create it, and move the code there.
>>>>>>>>>>>>>>>>>> I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on
>>>>>>>>>>>>>>>>>> creating the repo?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet
>>>>>>>>>>>>>>>>>> Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo,
>>>>>>>>>>>>>>>>>> would we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone
>>>>>>>>>>>>>>>>>> else!
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM
>>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new
>>>>>>>>>>>>>>>>>> Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and
>>>>>>>>>>>>>>>>>> Maven (Direct runner)
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to
>>>>>>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub
>>>>>>>>>>>>>>>>>> repo from a template.
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure
>>>>>>>>>>>>>>>>>> everyone is happy with it 🙂
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with
>>>>>>>>>>>>>>>>>> instructions on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
OK. Bringing an important update on licensing to this thread for
consideration. Discussion on https://issues.apache.org/jira/browse/LEGAL-601
has concluded with some key takeaways. None of this is new, and people who
know this area well may already be aware of it, but I'll restate the points
as I understand them:

 - We can dual license under MIT-0 and ASL2, which means "we" give "users" the
permissions of both licenses. They can take their pick, so they can treat
it as MIT-0 licensed.
 - BUT the copyright holders are the contributors to the project. They must
agree that their contributions can be licensed like this. The ASF ICLA only
covers ASL2, so we need to let them know. I suggest a CONTRIBUTING.md
that mentions it and maybe a PULL_REQUEST_TEMPLATE.md with a checkbox*.
 - If we want, we can include a README that explains this and tells users
they can delete the bits related to ASL2/ASF and CONTRIBUTING.md if they
want to change it however they want.

So I guess now the decision is whether all of the above is complicated
enough for users that it outweighs the benefit. I'm not really sure.

*Exactly how formal we need to get here is a matter of some debate and risk
tolerance. For these repos I think there is very little risk. One could
even argue the contents are so unoriginal as to be uncopyrightable, but
the bar in the US for IP is comically low so that's not a good argument
to depend on.

Kenn

On Tue, Mar 1, 2022 at 10:28 AM David Cavazos <dc...@google.com> wrote:

> Friendly ping on this :)
>
> On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com>
> wrote:
>
>> Can we create an empty file on each directory so I can fork the repo? It
>> doesn't look like there is a workaround to cloning empty repos in GitHub.
>> Then I can send a pull request.
>>
>> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>>
>>> I was trying to create a PR to merge the starter project contents, but I
>>> can't fork the repo because it's empty. Can I either get permissions to
>>> directly push or bother you with creating an empty README or some other
>>> file so I can fork it and open a PR? Thanks!
>>>
>>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> I always get mixed up myself. The policies are at
>>>> https://www.apache.org/legal/src-headers.html#notice and there's some
>>>> step by step at https://infra.apache.org/licensing-howto.html
>>>>
>>>> TL;DR the contents should be like so:
>>>>
>>>>     Apache Beam
>>>>     Copyright [2022-] The Apache Software Foundation
>>>>
>>>>     This product includes software developed at
>>>>     The Apache Software Foundation (http://www.apache.org/).
>>>>
>>>> Kenn
>>>>
>>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> I found this example NOTICE
>>>>> <https://infra.apache.org/licensing-howto.html#example-notice> file,
>>>>> but it doesn't look like it does what we want. It looks like it has to be
>>>>> written in a formal legal language and I don't feel comfortable writing it.
>>>>> Can I ask for help on writing out the contents of the NOTICE file?
>>>>>
>>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Can someone point me to an example of what the NOTICE file should look
>>>>>> like? I'm not familiar with it and would like to get it right.
>>>>>>
>>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1
>>>>>>> For the starter projects I like them being "clone and go", but I'd
>>>>>>> like to keep them as minimal as possible. We could have another repo like
>>>>>>> `beam-working-examples` for more complete examples where each subdirectory
>>>>>>> is a self-contained example with all its build files and everything.
>>>>>>>
>>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I like the goal: for things where the build has extra setup, have
>>>>>>>> an example that is fully functional on its own. There is of course the
>>>>>>>> problem of "where does it end?" since this is infinity things.
>>>>>>>>
>>>>>>>> The other piece is that a user wanting to know some of these bits
>>>>>>>> may be past the "clone and go" stage of their project. They probably
>>>>>>>> already have a project and now they need a working example to read and
>>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>>> would defeat the purpose.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> This is great!
>>>>>>>>>
>>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>>>> 'hello' world samples to get folks going.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Reza
>>>>>>>>>
>>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> SGTM.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Based on discussion on
>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it will
>>>>>>>>>>> be simplest to license it under ASL2 and include a NOTICE file. The user
>>>>>>>>>>> will be free to "clone and go".
>>>>>>>>>>>
>>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>>
>>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>>> "least surprise"
>>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due to
>>>>>>>>>>> its impact on contributor license agreements)
>>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>>> prominent notices stating that You changed the files" which won't apply to
>>>>>>>>>>> the user's code and I would guess they simply won't bother with it for files
>>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>>> is already good to go.
>>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>>
>>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <
>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or
>>>>>>>>>>>>>> with the Go template). I will say though, the Actions config should be
>>>>>>>>>>>>>> pretty darn simple for these examples -
>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and may
>>>>>>>>>>>>>>> be able to help out.
>>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to keep
>>>>>>>>>>>>>>>> it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber any
>>>>>>>>>>>>>>>>> users of
>>>>>>>>>>>>>>>>> these templates with any particular licensing requirements
>>>>>>>>>>>>>>>>> (right?)
>>>>>>>>>>>>>>>>> and we don't even care about attribution. We want these to
>>>>>>>>>>>>>>>>> be pretty
>>>>>>>>>>>>>>>>> much as close to public domain as possible. That's not
>>>>>>>>>>>>>>>>> what the Apache
>>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument
>>>>>>>>>>>>>>>>> could likely be
>>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be
>>>>>>>>>>>>>>>>> explicit
>>>>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for Apache
>>>>>>>>>>>>>>>>> legal?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us
>>>>>>>>>>>>>>>>> create them?
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a
>>>>>>>>>>>>>>>>> big part of it.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > But also, "I simply don't know what one
>>>>>>>>>>>>>>>>> would put in a Python repo, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be
>>>>>>>>>>>>>>>>> part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>>>>>>> >> bind the users of such a template as being a derivative
>>>>>>>>>>>>>>>>> work of a
>>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense for users to
>>>>>>>>>>>>>>>>> be told to check out the git repo for the language of their choice and run it.
>>>>>>>>>>>>>>>>> Some repos will need more setup than others.
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project
>>>>>>>>>>>>>>>>> there is quite
>>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put
>>>>>>>>>>>>>>>>> in a Python repo
>>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice
>>>>>>>>>>>>>>>>> to be found out
>>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is similar,
>>>>>>>>>>>>>>>>> and JavaScript
>>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>>>>>>> package.json file
>>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>> >> >>> > There are several examples already within the
>>>>>>>>>>>>>>>>> Beam repo found in:
>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount
>>>>>>>>>>>>>>>>> just for novelty/freshness but agreed with the suggestion that having an
>>>>>>>>>>>>>>>>> example in each quickstart would be ideal.
>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can
>>>>>>>>>>>>>>>>> pull out a specific version of the examples that coincides with a specific
>>>>>>>>>>>>>>>>> SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette
>>>>>>>>>>>>>>>>> <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't
>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>>>>>>>> archetype?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate
>>>>>>>>>>>>>>>>> repo since we can see what a truly minimal project looks like. Having it in
>>>>>>>>>>>>>>>>> the main repo would inherit build file configurations and other settings
>>>>>>>>>>>>>>>>> that would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the
>>>>>>>>>>>>>>>>> Beam version and other dependencies automatically. Testing is already set
>>>>>>>>>>>>>>>>> up via GitHub actions for every pull request, so it would automatically be
>>>>>>>>>>>>>>>>> tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't
>>>>>>>>>>>>>>>>> expect them to break commonly, but I think it would be good to make sure
>>>>>>>>>>>>>>>>> tests aren't failing when a release is published.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per
>>>>>>>>>>>>>>>>> language, and having all the build systems we want to support for them, as
>>>>>>>>>>>>>>>>> long as we document which files are for which build system. That way there
>>>>>>>>>>>>>>>>> are fewer repos to maintain.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>>> flexible than the archetypes, but the archetypes have a few conveniences
>>>>>>>>>>>>>>>>> since they are integrated with the apache/beam repo. For example,
>>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>>> main repo is done (like library version updates), and they are released when
>>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or
>>>>>>>>>>>>>>>>> a single starter repo containing all the starters or one per language or
>>>>>>>>>>>>>>>>> one per build system?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but
>>>>>>>>>>>>>>>>> that wouldn't work very well for Gradle and SBT users. I think a GitHub
>>>>>>>>>>>>>>>>> template might be the more flexible option, and we could have something
>>>>>>>>>>>>>>>>> similar for other languages as well. Having said that, we could still
>>>>>>>>>>>>>>>>> create a Maven archetype. If someone is familiar with that process, please
>>>>>>>>>>>>>>>>> let me know since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only
>>>>>>>>>>>>>>>>> need to pin down the name of the repo, create it, and move the code there.
>>>>>>>>>>>>>>>>> I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating
>>>>>>>>>>>>>>>>> the repo?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet
>>>>>>>>>>>>>>>>> Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo,
>>>>>>>>>>>>>>>>> would we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone
>>>>>>>>>>>>>>>>> else!
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM
>>>>>>>>>>>>>>>>> David Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new
>>>>>>>>>>>>>>>>> Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and
>>>>>>>>>>>>>>>>> Maven (Direct runner)
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to
>>>>>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub
>>>>>>>>>>>>>>>>> repo from a template.
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone
>>>>>>>>>>>>>>>>> is happy with it 🙂
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions
>>>>>>>>>>>>>>>>> on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Friendly ping on this :)

On Fri, Feb 25, 2022 at 12:52 PM David Cavazos <dc...@google.com> wrote:

> Can we create an empty file in each repo so I can fork it? It doesn't
> look like there is a workaround for forking empty repos in GitHub.
> Then I can send a pull request.
>
> On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com>
> wrote:
>
>> Got it, thank you! I'll go ahead and add the NOTICE file.
>>
>> I was trying to create a PR to merge the starter project contents, but I
>> can't fork the repo because it's empty. Can I either get permissions to
>> directly push or bother you with creating an empty README or some other
>> file so I can fork it and open a PR? Thanks!
>>
>> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> I always get mixed up myself. The policies are at
>>> https://www.apache.org/legal/src-headers.html#notice and there's some
>>> step by step at https://infra.apache.org/licensing-howto.html
>>>
>>> TL;DR the contents should be like so:
>>>
>>>     Apache Beam
>>>     Copyright [2022-] The Apache Software Foundation
>>>
>>>     This product includes software developed at
>>>     The Apache Software Foundation (http://www.apache.org/).
>>>
>>> Kenn
>>>
>>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> I found this example NOTICE
>>>> <https://infra.apache.org/licensing-howto.html#example-notice> file,
>>>> but it doesn't look like it does what we want. It looks like it has to be
>>>> written in formal legal language, and I don't feel comfortable writing it.
>>>> Can I ask for help writing out the contents of the NOTICE file?
>>>>
>>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> Can someone point me to an example of what the NOTICE file should look
>>>>> like? I'm not familiar with it and would like to get it right.
>>>>>
>>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> +1
>>>>>> For the starter projects I like them being "clone and go", but I'd
>>>>>> like to keep them as minimal as possible. We could have another repo like
>>>>>> `beam-working-examples` for more complete examples where each subdirectory
>>>>>> is a self-contained example with all its build files and everything.
>>>>>>
>>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I like the goal: for things where the build has extra setup, have an
>>>>>>> example that is fully functional on its own. There is of course the problem
>>>>>>> of "where does it end?" since this is infinity things.
>>>>>>>
>>>>>>> The other piece is that a user wanting to know some of these bits
>>>>>>> may be past the "clone and go" stage of their project. They probably
>>>>>>> already have a project and now they need a working example to read and
>>>>>>> learn from. So it could be just one additional repo `beam-working-examples`
>>>>>>> where each subdirectory is an independent working setup. I do like having
>>>>>>> it a separate repo to avoid the temptation to leverage anything from the
>>>>>>> Beam build. And each subdirectory should be entirely independent and we
>>>>>>> also have to avoid the temptation to share configuration across them, or it
>>>>>>> would defeat the purpose.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>>> rarokni@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This is great!
>>>>>>>>
>>>>>>>> What do folks think about also having a less minimal set of
>>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>>> maybe an opinionated setup with tox, etc. Again, this would just contain
>>>>>>>> 'hello world' samples to get folks going.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Reza
>>>>>>>>
>>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>>>>>>
>>>>>>>>> SGTM.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Based on discussion on
>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it will
>>>>>>>>>> be simplest to license it under ASL2 and include a NOTICE file. The user
>>>>>>>>>> will be free to "clone and go".
>>>>>>>>>>
>>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>>
>>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is
>>>>>>>>>> "least surprise"
>>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due to
>>>>>>>>>> its impact on contributor license agreements)
>>>>>>>>>>  - ASL2 says "You must cause any modified files to carry
>>>>>>>>>> prominent notices stating that You changed the files", which won't apply to
>>>>>>>>>> the user's code, and I would guess they simply won't bother with it for files
>>>>>>>>>> in the template. Or maybe there is a clever way to phrase the header so it
>>>>>>>>>> is already good to go.
>>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>>
>>>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Legal question asked at
>>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or
>>>>>>>>>>>>> with the Go template). I will say though, the Actions config should be
>>>>>>>>>>>>> pretty darn simple for these examples -
>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>>    - inlined script to run tests
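>>>>>>>>>>>>>
>>>>>>>>>>>>> As a rough sketch of that shape, assuming a Java project built with
>>>>>>>>>>>>> the Gradle wrapper (the workflow name, job name, action versions, Java
>>>>>>>>>>>>> version, and test command below are illustrative assumptions, not copied
>>>>>>>>>>>>> from the linked workflow):
>>>>>>>>>>>>>
>>>>>>>>>>>>> ```yaml
>>>>>>>>>>>>> # Hypothetical .github/workflows/test.yaml for one language configuration.
>>>>>>>>>>>>> name: test
>>>>>>>>>>>>> on: [push, pull_request]
>>>>>>>>>>>>> jobs:
>>>>>>>>>>>>>   java-gradle:            # assumed job name
>>>>>>>>>>>>>     runs-on: ubuntu-latest
>>>>>>>>>>>>>     steps:
>>>>>>>>>>>>>       # 1. checkout
>>>>>>>>>>>>>       - uses: actions/checkout@v4
>>>>>>>>>>>>>       # 2. setup-<language> (Java here; distribution and version are assumptions)
>>>>>>>>>>>>>       - uses: actions/setup-java@v4
>>>>>>>>>>>>>         with:
>>>>>>>>>>>>>           distribution: temurin
>>>>>>>>>>>>>           java-version: '17'
>>>>>>>>>>>>>       # 3. inlined script to run tests (assumes a Gradle wrapper in the repo)
>>>>>>>>>>>>>       - run: ./gradlew test
>>>>>>>>>>>>> ```
>>>>>>>>>>>>>
>>>>>>>>>>>>> The Python and Go variants would presumably differ only in the setup
>>>>>>>>>>>>> action and the test command.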
>>>>>>>>>>>>>
>>>>>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and may
>>>>>>>>>>>>>> be able to help out.
>>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to keep
>>>>>>>>>>>>>>> it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber any
>>>>>>>>>>>>>>>> users of
>>>>>>>>>>>>>>>> these templates with any particular licensing requirements
>>>>>>>>>>>>>>>> (right?)
>>>>>>>>>>>>>>>> and we don't even care about attribution. We want these to
>>>>>>>>>>>>>>>> be pretty
>>>>>>>>>>>>>>>> much as close to public domain as possible. That's not what
>>>>>>>>>>>>>>>> the Apache
>>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument could
>>>>>>>>>>>>>>>> likely be
>>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be
>>>>>>>>>>>>>>>> explicit
>>>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for Apache
>>>>>>>>>>>>>>>> legal?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us create
>>>>>>>>>>>>>>>> them?
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big
>>>>>>>>>>>>>>>> part of it.
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> > But also, "I simply don't know what one
>>>>>>>>>>>>>>>> would put in a Python repo, other than a bare setup.py that lists a
>>>>>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be
>>>>>>>>>>>>>>>> part of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>>>>>> >> bind the users of such a template as being a derivative
>>>>>>>>>>>>>>>> work of a
>>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense for users to be
>>>>>>>>>>>>>>>> told to check out the git repo for the language of their choice and run it.
>>>>>>>>>>>>>>>> Some repos will need more setup than others.
>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project
>>>>>>>>>>>>>>>> there is quite
>>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put
>>>>>>>>>>>>>>>> in a Python repo
>>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice
>>>>>>>>>>>>>>>> to be found out
>>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is similar,
>>>>>>>>>>>>>>>> and JavaScript
>>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>>>>>> package.json file
>>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>> >> >>> > There are several examples already within the Beam
>>>>>>>>>>>>>>>> repo found in:
>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount
>>>>>>>>>>>>>>>> just for novelty/freshness but agreed with the suggestion that having an
>>>>>>>>>>>>>>>> example in each quickstart would be ideal.
>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David
>>>>>>>>>>>>>>>> Huntsperger <dh...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can
>>>>>>>>>>>>>>>> pull out a specific version of the examples that coincides with a specific
>>>>>>>>>>>>>>>> SDK version then we could drop the archetypes.
>>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect
>>>>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>>>>>>> archetype?
>>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David
>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate
>>>>>>>>>>>>>>>> repo since we can see what a true minimal project looks like. Having it in
>>>>>>>>>>>>>>>> the main repo would inherit build file configurations and other settings
>>>>>>>>>>>>>>>> that would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the
>>>>>>>>>>>>>>>> Beam version and other dependencies automatically. Testing is already set
>>>>>>>>>>>>>>>> up via GitHub actions for every pull request, so it would automatically be
>>>>>>>>>>>>>>>> tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect
>>>>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language,
>>>>>>>>>>>>>>>> and having all the build systems we want to support for them. As long as we
>>>>>>>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>>>>>>>> repos to maintain.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more
>>>>>>>>>>>>>>>> flexible than the archetypes but the archetypes have a few conveniences
>>>>>>>>>>>>>>>> since they are integrated with apache/beam repo. For example,
>>>>>>>>>>>>>>>> updates/testing are done at the same time a corresponding change to the
>>>>>>>>>>>>>>>> main repo is done (like library version updates), they are released when
>>>>>>>>>>>>>>>> the SDK is released.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or
>>>>>>>>>>>>>>>> a single starter repo containing all the starters or one per language or
>>>>>>>>>>>>>>>> one per build system?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to
>>>>>>>>>>>>>>>> happen (e.g. release manager owns it)?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only
>>>>>>>>>>>>>>>> need to pin down the name of the repo, create it, and move the code there.
>>>>>>>>>>>>>>>> I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating
>>>>>>>>>>>>>>>> the repo?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet
>>>>>>>>>>>>>>>> Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo,
>>>>>>>>>>>>>>>> would we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone
>>>>>>>>>>>>>>>> else!
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new
>>>>>>>>>>>>>>>> Beam Java project, I've been working on a GitHub template containing a
>>>>>>>>>>>>>>>> minimal Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and
>>>>>>>>>>>>>>>> Maven (Direct runner)
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to
>>>>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo
>>>>>>>>>>>>>>>> from a template.
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone
>>>>>>>>>>>>>>>> is happy with it 🙂
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions
>>>>>>>>>>>>>>>> on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Can we create an empty file in each repo so I can fork it? It
doesn't look like there is a workaround for forking empty repos on GitHub.
Then I can send a pull request.

On Fri, Feb 18, 2022 at 10:40 AM David Cavazos <dc...@google.com> wrote:

> Got it, thank you! I'll go ahead and add the NOTICE file.
>
> I was trying to create a PR to merge the starter project contents, but I
> can't fork the repo because it's empty. Can I either get permissions to
> directly push or bother you with creating an empty README or some other
> file so I can fork it and open a PR? Thanks!
>
> On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> I always get mixed up myself. The policies are at
>> https://www.apache.org/legal/src-headers.html#notice and there's some
>> step by step at https://infra.apache.org/licensing-howto.html
>>
>> TL;DR the contents should be like so:
>>
>>     Apache Beam
>>     Copyright [2022-] The Apache Software Foundation
>>
>>     This product includes software developed at
>>     The Apache Software Foundation (http://www.apache.org/).
>>
>> Kenn
>>
>> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> I found this example NOTICE
>>> <https://infra.apache.org/licensing-howto.html#example-notice> file,
>>> but it doesn't look like it does what we want. It looks like it has to be
>>> written in formal legal language and I don't feel comfortable writing it.
>>> Can I ask for help on writing out the contents of the NOTICE file?
>>>
>>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> Can someone point me to an example of what the NOTICE file should look
>>>> like? I'm not familiar with it and would like to get it right.
>>>>
>>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> +1
>>>>> For the starter projects I like them being "clone and go", but I'd
>>>>> like to keep them as minimal as possible. We could have another repo like
>>>>> `beam-working-examples` for more complete examples where each subdirectory
>>>>> is a self-contained example with all its build files and everything.
>>>>>
>>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I like the goal: for things where the build has extra setup, have an
>>>>>> example that is fully functional on its own. There is of course the problem
>>>>>> of "where does it end?" since there are infinitely many such things.
>>>>>>
>>>>>> The other piece is that a user wanting to know some of these bits may
>>>>>> be past the "clone and go" stage of their project. They probably already
>>>>>> have a project and now they need a working example to read and learn from.
>>>>>> So it could be just one additional repo `beam-working-examples` where each
>>>>>> subdirectory is an independent working setup. I do like having it as a
>>>>>> separate repo to avoid the temptation to leverage anything from the Beam
>>>>>> build. And each subdirectory should be entirely independent and we also
>>>>>> have to avoid the temptation to share configuration across them, or it
>>>>>> would defeat the purpose.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <
>>>>>> rarokni@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> This is great!
>>>>>>>
>>>>>>> What do folks think about also having a less minimal set of
>>>>>>> starters? For Java I am thinking about protobuf / autovalue. For Python
>>>>>>> maybe an opinionated setup with tox etc... Again this would just contain
>>>>>>> 'hello world' samples to get folks going.
>>>>>>>
>>>>>>> Regards
>>>>>>> Reza
>>>>>>>
>>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>>>>>
>>>>>>>> SGTM.
>>>>>>>>
>>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Based on discussion on
>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it will
>>>>>>>>> be simplest to license it under ASL2 and include a NOTICE file. The user
>>>>>>>>> will be free to "clone and go".
>>>>>>>>>
>>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>>
>>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>>>>>>>> surprise"
>>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due to
>>>>>>>>> its impact on contributor license agreements)
>>>>>>>>>  - ASL2 says "You must cause any modified files to carry prominent
>>>>>>>>> notices stating that You changed the files" which won't apply to the user's
>>>>>>>>> code and I would guess they simply won't bother with it for files in the
>>>>>>>>> template. Or maybe there is a clever way to phrase the header so it is
>>>>>>>>> already good to go.
>>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>>
>>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> And I've created the repos just now.
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Legal question asked at
>>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or
>>>>>>>>>>>> with the Go template). I will say though, the Actions config should be
>>>>>>>>>>>> pretty darn simple for these examples -
>>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>>> just want a job with:
>>>>>>>>>>>>
>>>>>>>>>>>>    - checkout
>>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>>
>>>>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Danny
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and may be
>>>>>>>>>>>>> able to help out.
>>>>>>>>>>>>> Kerry
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to keep
>>>>>>>>>>>>>> it simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For these starter projects, we don't want to encumber any
>>>>>>>>>>>>>>> users of
>>>>>>>>>>>>>>> these templates with any particular licensing requirements
>>>>>>>>>>>>>>> (right?)
>>>>>>>>>>>>>>> and we don't even care about attribution. We want these to
>>>>>>>>>>>>>>> be pretty
>>>>>>>>>>>>>>> much as close to public domain as possible. That's not what
>>>>>>>>>>>>>>> the Apache
>>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument could
>>>>>>>>>>>>>>> likely be
>>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be
>>>>>>>>>>>>>>> explicit
>>>>>>>>>>>>>>> about this. Perhaps this'd be a good question for apache
>>>>>>>>>>>>>>> legal?)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us create
>>>>>>>>>>>>>>> them?
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big
>>>>>>>>>>>>>>> part of it.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what one
>>>>>>>>>>>>>>> would put in a Python repo other than a bare setup.py that lists a
>>>>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part
>>>>>>>>>>>>>>> of Apache software it needs to be ASL2)
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>>>>> >> bind the users of such a template as being a derivative
>>>>>>>>>>>>>>> work of a
>>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users to be
>>>>>>>>>>>>>>> told to checkout this git repo for the language of your choice and run.
>>>>>>>>>>>>>>> Some repos will have more/less than others when it comes to setup necessary.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project
>>>>>>>>>>>>>>> there is quite
>>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put
>>>>>>>>>>>>>>> in a Python repo
>>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice to
>>>>>>>>>>>>>>> be found out
>>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch go is similar,
>>>>>>>>>>>>>>> and javascript
>>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>>>>> package.json file
>>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>> >> >>> > There are several examples already within the Beam
>>>>>>>>>>>>>>> repo found in:
>>>>>>>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just
>>>>>>>>>>>>>>> for novelty/freshness but agreed with the suggestion that having an example
>>>>>>>>>>>>>>> in each quickstart would be ideal.
>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger
>>>>>>>>>>>>>>> <dh...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull
>>>>>>>>>>>>>>> out a specific version of the examples that coincides with a specific SDK
>>>>>>>>>>>>>>> version then we could drop the archetypes.
>>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect
>>>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>>>>>> archetype?
>>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos
>>>>>>>>>>>>>>> <dc...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate
>>>>>>>>>>>>>>> repo since we can see what a true minimal project looks like. Having it in
>>>>>>>>>>>>>>> the main repo would inherit build file configurations and other settings
>>>>>>>>>>>>>>> that would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the
>>>>>>>>>>>>>>> Beam version and other dependencies automatically. Testing is already set
>>>>>>>>>>>>>>> up via GitHub actions for every pull request, so it would automatically be
>>>>>>>>>>>>>>> tested as soon as there is a new dependency version available.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect
>>>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language,
>>>>>>>>>>>>>>> and having all the build systems we want to support for them. As long as we
>>>>>>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>>>>>>> repos to maintain.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible
>>>>>>>>>>>>>>> than the archetypes but the archetypes have a few conveniences since they
>>>>>>>>>>>>>>> are integrated with apache/beam repo. For example, updates/testing are done
>>>>>>>>>>>>>>> at the same time a corresponding change to the main repo is done (like
>>>>>>>>>>>>>>> library version updates), they are released when the SDK is released.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>>>>>>>> per build system?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen
>>>>>>>>>>>>>>> (e.g. release manager owns it)?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only
>>>>>>>>>>>>>>> need to pin down the name of the repo, create it, and move the code there.
>>>>>>>>>>>>>>> I was thinking either `apache/beam-java-template` or
>>>>>>>>>>>>>>> `apache/beam-java-starter`. What do you think?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating
>>>>>>>>>>>>>>> the repo?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet
>>>>>>>>>>>>>>> Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would
>>>>>>>>>>>>>>> we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam
>>>>>>>>>>>>>>> Java project, I've been working on a GitHub template containing a minimal
>>>>>>>>>>>>>>> Beam Java pipeline for people to start with.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>>>>>>>>> (Direct runner)
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to
>>>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo
>>>>>>>>>>>>>>> from a template.
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone
>>>>>>>>>>>>>>> is happy with it 🙂
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions
>>>>>>>>>>>>>>> on how to create a new Beam Java pipeline
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Got it, thank you! I'll go ahead and add the NOTICE file.

I was trying to create a PR to merge the starter project contents, but I
can't fork the repo because it's empty. Can I either get permissions to
directly push or bother you with creating an empty README or some other
file so I can fork it and open a PR? Thanks!

On Fri, Feb 18, 2022 at 8:32 AM Kenneth Knowles <ke...@apache.org> wrote:

> I always get mixed up myself. The policies are at
> https://www.apache.org/legal/src-headers.html#notice and there's a
> step-by-step guide at https://infra.apache.org/licensing-howto.html
>
> TL;DR the contents should be like so:
>
>     Apache Beam
>     Copyright [2022-] The Apache Software Foundation
>
>     This product includes software developed at
>     The Apache Software Foundation (http://www.apache.org/).
>
> Kenn
>
> On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com> wrote:
>
>> I found this example NOTICE
>> <https://infra.apache.org/licensing-howto.html#example-notice> file, but
>> it doesn't look like it does what we want. It looks like it has to be
>> written in formal legal language, and I don't feel comfortable writing it.
>> Can I ask for help on writing out the contents of the NOTICE file?
>>
>> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> Can someone point me to an example of what the NOTICE file should look
>>> like? I'm not familiar with it and would like to get it right.
>>>
>>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> +1
>>>> For the starter projects I like them being "clone and go", but I'd like
>>>> to keep them as minimal as possible. We could have another repo like
>>>> `beam-working-examples` for more complete examples where each subdirectory
>>>> is a self-contained example with all its build files and everything.
>>>>
>>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>>
>>>>> I like the goal: for things where the build has extra setup, have an
>>>>> example that is fully functional on its own. There is of course the problem
>>>>> of "where does it end?" since this is infinity things.
>>>>>
>>>>> The other piece is that a user wanting to know some of these bits may
>>>>> be past the "clone and go" stage of their project. They probably already
>>>>> have a project and now they need a working example to read and learn from.
>>>>> So it could be just one additional repo `beam-working-examples` where each
>>>>> subdirectory is an independent working setup. I do like having it as a
>>>>> separate repo to avoid the temptation to leverage anything from the Beam
>>>>> build. And each subdirectory should be entirely independent and we also
>>>>> have to avoid the temptation to share configuration across them, or it
>>>>> would defeat the purpose.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <ra...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This is great!
>>>>>>
>>>>>> What do folks think about also having a less minimal set of starters?
>>>>>> For Java I am thinking about protobuf / autovalue. For Python maybe an
>>>>>> opinionated setup with tox etc... Again this would just contain 'hello
>>>>>> world' samples to get folks going.
>>>>>>
>>>>>> Regards
>>>>>> Reza
>>>>>>
>>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>>>>
>>>>>>> SGTM.
>>>>>>>
>>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Based on discussion on
>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it will be
>>>>>>>> simplest to license it under ASL2 and include a NOTICE file. The user will
>>>>>>>> be free to "clone and go".
>>>>>>>>
>>>>>>>> I would bring these points back to the dev list:
>>>>>>>>
>>>>>>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>>>>>>> surprise"
>>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due to
>>>>>>>> its impact on contributor license agreements)
>>>>>>>>  - ASL2 says "You must cause any modified files to carry prominent
>>>>>>>> notices stating that You changed the files", which won't apply to the user's
>>>>>>>> own code, and I would guess they simply won't bother with it for files in the
>>>>>>>> template. Or maybe there is a clever way to phrase the header so it is
>>>>>>>> already good to go.
>>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>>
>>>>>>>> So my overall take is that we should go ahead with ASL2 and a
>>>>>>>> simple NOTICE file. Check the Jira for details.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> And I've created the repos just now.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Legal question asked at
>>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or with
>>>>>>>>>>> the Go template). I will say though, the Actions config should be pretty
>>>>>>>>>>> darn simple for these examples -
>>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>>> just want a job with:
>>>>>>>>>>>
>>>>>>>>>>>    - checkout
>>>>>>>>>>>    - setup-<language>
>>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>>
>>>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Danny
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Danny has extensive experience with GitHub actions, and may be
>>>>>>>>>>>> able to help out.
>>>>>>>>>>>> Kerry
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For these starter projects, we don't want to encumber any
>>>>>>>>>>>>>> users of
>>>>>>>>>>>>>> these templates with any particular licensing requirements
>>>>>>>>>>>>>> (right?)
>>>>>>>>>>>>>> and we don't even care about attribution. We want these to be
>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>> much as close to public domain as possible. That's not what
>>>>>>>>>>>>>> the Apache
>>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument could
>>>>>>>>>>>>>> likely be
>>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be
>>>>>>>>>>>>>> explicit
>>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for Apache
>>>>>>>>>>>>>> legal?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter
>>>>>>>>>>>>>> projects for every language. Once we have Java, Python and Go, it might be
>>>>>>>>>>>>>> a good idea to change the quickstarts to use these instead of the word
>>>>>>>>>>>>>> count. There is already a dedicated word count walkthrough so I think that
>>>>>>>>>>>>>> is already covered.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > If we all agree on the repo names, who can help us create
>>>>>>>>>>>>>> them?
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big
>>>>>>>>>>>>>> part of it.
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > But also the concern "I simply don't know what one
>>>>>>>>>>>>>> would put in a Python repo, other than a bare setup.py that lists a
>>>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part
>>>>>>>>>>>>>> of Apache software it needs to be ASL2)
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>>>> >> bind the users of such a template as being a derivative
>>>>>>>>>>>>>> work of a
>>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template
>>>>>>>>>>>>>> itself should
>>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users to be
>>>>>>>>>>>>>> told to check out this git repo for the language of your choice and run.
>>>>>>>>>>>>>> Some repos will have more/less than others when it comes to the setup necessary.
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project
>>>>>>>>>>>>>> there is quite
>>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put in
>>>>>>>>>>>>>> a Python repo
>>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice to
>>>>>>>>>>>>>> be found out
>>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is similar, and
>>>>>>>>>>>>>> JavaScript
>>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>>>> package.json file
>>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>> >> >>> > There are several examples already within the Beam
>>>>>>>>>>>>>> repo found in:
>>>>>>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just
>>>>>>>>>>>>>> for novelty/freshness but agreed with the suggestion that having an example
>>>>>>>>>>>>>> in each quickstart would be ideal.
>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull
>>>>>>>>>>>>>> out a specific version of the examples that coincides with a specific SDK
>>>>>>>>>>>>>> version then we could drop the archetypes.
>>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect
>>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>>>>> archetype?
>>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo
>>>>>>>>>>>>>> since we can see what a truly minimal project looks like. Having it in the
>>>>>>>>>>>>>> main repo would inherit build file configurations and other settings that
>>>>>>>>>>>>>> would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect
>>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language,
>>>>>>>>>>>>>> and having all the build systems we want to support for them. As long as we
>>>>>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>>>>>> repos to maintain.
>>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible
>>>>>>>>>>>>>> than the archetypes but the archetypes have a few conveniences since they
>>>>>>>>>>>>>> are integrated with the apache/beam repo. For example, updates/testing are done
>>>>>>>>>>>>>> at the same time a corresponding change to the main repo is done (like
>>>>>>>>>>>>>> library version updates), and they are released when the SDK is released.
>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>>>>>>> per build system?
>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen
>>>>>>>>>>>>>> (e.g. release manager owns it)?
>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David
>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need
>>>>>>>>>>>>>> to pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating
>>>>>>>>>>>>>> the repo?
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay
>>>>>>>>>>>>>> <al...@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any
>>>>>>>>>>>>>> progress on this? Do you need help?
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would
>>>>>>>>>>>>>> we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam
>>>>>>>>>>>>>> Java project, I've been working on a GitHub template containing a minimal
>>>>>>>>>>>>>> Beam Java pipeline for people to start with.
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>>> contains:
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>>>>>>>> (Direct runner)
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub
>>>>>>>>>>>>>> actions (around 1-2 minutes to run)
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to
>>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo
>>>>>>>>>>>>>> from a template.
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is
>>>>>>>>>>>>>> happy with it 🙂
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal
>>>>>>>>>>>>>> GitHub account, so we need to create an Apache repo to host it
>>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on
>>>>>>>>>>>>>> how to create a new Beam Java pipeline
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
I always get mixed up myself. The policies are at
https://www.apache.org/legal/src-headers.html#notice and there's a
step-by-step guide at https://infra.apache.org/licensing-howto.html

TL;DR the contents should be like so:

    Apache Beam
    Copyright [2022-] The Apache Software Foundation

    This product includes software developed at
    The Apache Software Foundation (http://www.apache.org/).

Kenn

On Thu, Feb 17, 2022 at 2:28 PM David Cavazos <dc...@google.com> wrote:

> I found this example NOTICE
> <https://infra.apache.org/licensing-howto.html#example-notice> file, but
> it doesn't look like it does what we want. It looks like it has to be
> written in formal legal language, and I don't feel comfortable writing it.
> Can I ask for help on writing out the contents of the NOTICE file?
>
> On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com>
> wrote:
>
>> Can someone point me to an example of what the NOTICE file should look
>> like? I'm not familiar with it and would like to get it right.
>>
>> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> +1
>>> For the starter projects I like them being "clone and go", but I'd like
>>> to keep them as minimal as possible. We could have another repo like
>>> `beam-working-examples` for more complete examples where each subdirectory
>>> is a self-contained example with all its build files and everything.
>>>
>>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> I like the goal: for things where the build has extra setup, have an
>>>> example that is fully functional on its own. There is of course the problem
>>>> of "where does it end?" since this is infinity things.
>>>>
>>>> The other piece is that a user wanting to know some of these bits may
>>>> be past the "clone and go" stage of their project. They probably already
>>>> have a project and now they need a working example to read and learn from.
>>>> So it could be just one additional repo `beam-working-examples` where each
>>>> subdirectory is an independent working setup. I do like having it as a
>>>> separate repo to avoid the temptation to leverage anything from the Beam
>>>> build. And each subdirectory should be entirely independent and we also
>>>> have to avoid the temptation to share configuration across them, or it
>>>> would defeat the purpose.
>>>>
>>>> Kenn
>>>>
>>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <ra...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This is great!
>>>>>
>>>>> What do folks think about also having a less minimal set of starters?
>>>>> For Java I am thinking about protobuf / autovalue. For Python maybe an
>>>>> opinionated setup with tox etc... Again this would just contain 'hello
>>>>> world' samples to get folks going.
>>>>>
>>>>> Regards
>>>>> Reza
>>>>>
>>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>>>
>>>>>> SGTM.
>>>>>>
>>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Based on discussion on
>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it will be
>>>>>>> simplest to license it under ASL2 and include a NOTICE file. The user will
>>>>>>> be free to "clone and go".
>>>>>>>
>>>>>>> I would bring these points back to the dev list:
>>>>>>>
>>>>>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>>>>>> surprise"
>>>>>>>  - Dual-licensing is possible (but I think not worthwhile due to its
>>>>>>> impact on contributor license agreements)
>>>>>>>  - ASL2 says "You must cause any modified files to carry prominent
>>>>>>> notices stating that You changed the files", which won't apply to the user's
>>>>>>> own code, and I would guess they simply won't bother with it for files in the
>>>>>>> template. Or maybe there is a clever way to phrase the header so it is
>>>>>>> already good to go.
>>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to
>>>>>>> include the attributions from it. The NOTICE file is required by ASF
>>>>>>> policy. We can easily set it up to be a noop for the user.
>>>>>>>
>>>>>>> So my overall take is that we should go ahead with ASL2 and a simple
>>>>>>> NOTICE file. Check the Jira for details.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> And I've created the repos just now.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Legal question asked at
>>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or with
>>>>>>>>>> the Go template). I will say though, the Actions config should be pretty
>>>>>>>>>> darn simple for these examples -
>>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>>> just want a job with:
>>>>>>>>>>
>>>>>>>>>>    - checkout
>>>>>>>>>>    - setup-<language>
>>>>>>>>>>    - inlined script to run tests
>>>>>>>>>>
>>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Danny
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Danny has extensive experience with GitHub actions, and may be
>>>>>>>>>>> able to help out.
>>>>>>>>>>> Kerry
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>>
>>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>>
>>>>>>>>>>>>> For these starter projects, we don't want to encumber any
>>>>>>>>>>>>> users of
>>>>>>>>>>>>> these templates with any particular licensing requirements
>>>>>>>>>>>>> (right?)
>>>>>>>>>>>>> and we don't even care about attribution. We want these to be
>>>>>>>>>>>>> pretty
>>>>>>>>>>>>> much as close to public domain as possible. That's not what
>>>>>>>>>>>>> the Apache
>>>>>>>>>>>>> licence does. (If it's even relevant, a good argument could
>>>>>>>>>>>>> likely be
>>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be
>>>>>>>>>>>>> explicit
>>>>>>>>>>>>> about this.) Perhaps this'd be a good question for Apache legal?
>>>>>>>>>>>>>
>>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > +David Huntsperger, tl;dr: these are minimal starter projects
>>>>>>>>>>>>> for every language. Once we have Java, Python and Go, it might be a good
>>>>>>>>>>>>> idea to change the quickstarts to use these instead of the word count.
>>>>>>>>>>>>> There is already a dedicated word count walkthrough so I think that is
>>>>>>>>>>>>> already covered.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > If we all agree on the repo names, who can help us create
>>>>>>>>>>>>> them?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big
>>>>>>>>>>>>> part of it.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > But also the answer to "I simply don't know what one
>>>>>>>>>>>>> would put in a Python repo other than a bare setup.py that lists a
>>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part
>>>>>>>>>>>>> of Apache software it needs to be ASL2)
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>>> >> bind the users of such a template as being a derivative
>>>>>>>>>>>>> work of a
>>>>>>>>>>>>> >> too-restrictive licence. The licence of the template itself
>>>>>>>>>>>>> should
>>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> I think for consistency it makes sense to users to be
>>>>>>>>>>>>> told to checkout this git repo for the language of your choice and run.
>>>>>>>>>>>>> Some repos will have more/less than others when it comes to setup necessary.
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project
>>>>>>>>>>>>> there is quite
>>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put in
>>>>>>>>>>>>> a Python repo
>>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a
>>>>>>>>>>>>> dependency on
>>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice to
>>>>>>>>>>>>> be found out
>>>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is similar, and
>>>>>>>>>>>>> JavaScript
>>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>>> package.json file
>>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>> >> >>> > There are several examples already within the Beam
>>>>>>>>>>>>> repo found in:
>>>>>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just
>>>>>>>>>>>>> for novelty/freshness but agreed with the suggestion that having an example
>>>>>>>>>>>>> in each quickstart would be ideal.
>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount
>>>>>>>>>>>>> example in each repo? I know that makes the repos less minimal, but we
>>>>>>>>>>>>> could rewrite the quickstarts around these repos instead of the current
>>>>>>>>>>>>> Wordcount examples. Or maybe we don't need to use the Wordcount example in
>>>>>>>>>>>>> the quickstarts...
>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>>> maintainable.
>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>>>> well.
>>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull
>>>>>>>>>>>>> out a specific version of the examples that coincides with a specific SDK
>>>>>>>>>>>>> version then we could drop the archetypes.
>>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect
>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>>>> archetype?
>>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo
>>>>>>>>>>>>> since we can see what a truly minimal project looks like. Having it in the
>>>>>>>>>>>>> main repo would inherit build file configurations and other settings that
>>>>>>>>>>>>> would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect
>>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language,
>>>>>>>>>>>>> and having all the build systems we want to support for them. As long as we
>>>>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>>>>> repos to maintain.
>>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible
>>>>>>>>>>>>> than the archetypes but the archetypes have a few conveniences since they
>>>>>>>>>>>>> are integrated with apache/beam repo. For example, updates/testing are done
>>>>>>>>>>>>> at the same time a corresponding change to the main repo is done (like
>>>>>>>>>>>>> library version updates), they are released when the SDK is released.
>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>>>>>> per build system?
>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen
>>>>>>>>>>>>> (e.g. release manager owns it)?
>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos
>>>>>>>>>>>>> <dc...@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need
>>>>>>>>>>>>> to pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the
>>>>>>>>>>>>> repo?
>>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>>>>>>>> altay@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress
>>>>>>>>>>>>> on this? Do you need help?
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian
>>>>>>>>>>>>> Hulette <bh...@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would
>>>>>>>>>>>>> we put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam
>>>>>>>>>>>>> Java project, I've been working on a GitHub template containing a minimal
>>>>>>>>>>>>> Beam Java pipeline for people to start with.
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template
>>>>>>>>>>>>> contains:
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Minimal test file
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Build files for Gradle, sbt, and Maven
>>>>>>>>>>>>> (Direct runner)
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Continuous integration via GitHub
>>>>>>>>>>>>> Actions (around 1-2 minutes to run)
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - README with instructions on how to
>>>>>>>>>>>>> build, run, test, and add other runners
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo
>>>>>>>>>>>>> from a template.
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Some reviewers to make sure everyone is
>>>>>>>>>>>>> happy with it 🙂
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Right now it lives in my personal GitHub
>>>>>>>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>  - Update/create docs with instructions on
>>>>>>>>>>>>> how to create a new Beam Java pipeline
>>>>>>>>>>>>>
>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
I found this example NOTICE
<https://infra.apache.org/licensing-howto.html#example-notice> file, but it
doesn't look like it does what we want. It looks like it has to be written
in formal legal language, and I don't feel comfortable writing it. Can I
ask for help writing the contents of the NOTICE file?

On Thu, Feb 17, 2022 at 11:00 AM David Cavazos <dc...@google.com> wrote:

> Can someone point me to an example of what the NOTICE file should look
> like? I'm not familiar with it and would like to get it right.
>
> On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com>
> wrote:
>
>> +1
>> For the starter projects I like them being "clone and go", but I'd like
>> to keep them as minimal as possible. We could have another repo like
>> `beam-working-examples` for more complete examples where each subdirectory
>> is a self-contained example with all its build files and everything.
>>
>> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> I like the goal: for things where the build has extra setup, have an
>>> example that is fully functional on its own. There is of course the problem
>>> of "where does it end?" since there are infinitely many such things.
>>>
>>> The other piece is that a user wanting to know some of these bits may be
>>> past the "clone and go" stage of their project. They probably already have
>>> a project and now they need a working example to read and learn from. So it
>>> could be just one additional repo `beam-working-examples` where each
>>> subdirectory is an independent working setup. I do like having it as a
>>> separate repo to avoid the temptation to leverage anything from the Beam
>>> build. And each subdirectory should be entirely independent and we also
>>> have to avoid the temptation to share configuration across them, or it
>>> would defeat the purpose.
>>>
>>> Kenn
>>>
>>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <ra...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> This is great!
>>>>
>>>> What do folks think about also having a less minimal set of starters?
>>>> For Java I am thinking about protobuf / AutoValue. For Python maybe an
>>>> opinionated setup with tox, etc. Again, this would just contain 'hello
>>>> world' samples to get folks going.
>>>>
>>>> Regards
>>>> Reza
>>>>
>>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>>
>>>>> SGTM.
>>>>>
>>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Based on discussion on
>>>>>> https://issues.apache.org/jira/browse/LEGAL-601 I think it will be
>>>>>> simplest to license it under ASL2 and include a NOTICE file. The user will
>>>>>> be free to "clone and go".
>>>>>>
>>>>>> I would bring these points back to the dev list:
>>>>>>
>>>>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>>>>> surprise"
>>>>>>  - Dual-licensing is possible (but I think not worthwhile due to its
>>>>>> impact on contributor license agreements)
>>>>>>  - ASL2 says "You must cause any modified files to carry prominent
>>>>>> notices stating that You changed the files", which won't apply to the user's
>>>>>> code, and I would guess they simply won't bother with it for files in the
>>>>>> template. Or maybe there is a clever way to phrase the header so it is
>>>>>> already good to go.
>>>>>>  - ASL2 says if the work includes a NOTICE file, you have to include
>>>>>> the attributions from it. The NOTICE file is required by ASF policy. We can
>>>>>> easily set it up to be a no-op for the user.
>>>>>>
>>>>>> So my overall take is that we should go ahead with ASL2 and a simple
>>>>>> NOTICE file. Check the Jira for details.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> And I've created the repos just now.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Legal question asked at
>>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>>
>>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or with
>>>>>>>>> the Go template). I will say though, the Actions config should be pretty
>>>>>>>>> darn simple for these examples -
>>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>>> just want a job with:
>>>>>>>>>
>>>>>>>>>    - checkout
>>>>>>>>>    - setup-<language>
>>>>>>>>>    - inlined script to run tests
>>>>>>>>>
>>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danny
>>>>>>>>>
>>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Danny has extensive experience with GitHub actions, and may be
>>>>>>>>>> able to help out.
>>>>>>>>>> Kerry
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>>
>>>>>>>>>>> I can take on the task of asking about MIT license and
>>>>>>>>>>> requesting the repos be created. Not sure if it needs my level of
>>>>>>>>>>> privileges but I'm happy to do it anyhow.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>>> the following repos?
>>>>>>>>>>>>
>>>>>>>>>>>> For these starter projects, we don't want to encumber any users
>>>>>>>>>>>> of
>>>>>>>>>>>> these templates with any particular licensing requirements
>>>>>>>>>>>> (right?)
>>>>>>>>>>>> and we don't even care about attribution. We want these to be
>>>>>>>>>>>> pretty
>>>>>>>>>>>> much as close to public domain as possible. That's not what the
>>>>>>>>>>>> Apache
>>>>>>>>>>>> licence does. (If it's even relevant, a good argument could
>>>>>>>>>>>> likely be
>>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be
>>>>>>>>>>>> explicit
>>>>>>>>>>>> about this.) Perhaps this'd be a good question for Apache legal?
>>>>>>>>>>>>
>>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>>> >
>>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>>> >
>>>>>>>>>>>> > +David Huntsperger, tl;dr: these are minimal starter projects
>>>>>>>>>>>> for every language. Once we have Java, Python and Go, it might be a good
>>>>>>>>>>>> idea to change the quickstarts to use these instead of the word count.
>>>>>>>>>>>> There is already a dedicated word count walkthrough so I think that is
>>>>>>>>>>>> already covered.
>>>>>>>>>>>> >
>>>>>>>>>>>> > If we all agree on the repo names, who can help us create
>>>>>>>>>>>> them?
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big
>>>>>>>>>>>> part of it.
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > But also the answer to "I simply don't know what one would
>>>>>>>>>>>> put in a Python repo other than a bare setup.py that lists a
>>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>>> repo, namely:
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>>>>>>>> Apache software it needs to be ASL2)
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>>> doesn't want to
>>>>>>>>>>>> >> bind the users of such a template as being a derivative work
>>>>>>>>>>>> of a
>>>>>>>>>>>> >> too-restrictive licence. The licence of the template itself
>>>>>>>>>>>> should
>>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <
>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>> >> >>
>>>>>>>>>>>> >> >> I think for consistency it makes sense to users to be
>>>>>>>>>>>> told to checkout this git repo for the language of your choice and run.
>>>>>>>>>>>> Some repos will have more/less than others when it comes to setup necessary.
>>>>>>>>>>>> >> >>
>>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>> >> >>>
>>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project
>>>>>>>>>>>> there is quite
>>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put in a
>>>>>>>>>>>> Python repo
>>>>>>>>>>>> >> >>> other than a bare setup.py that lists a dependency
>>>>>>>>>>>> on
>>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>>> layout, etc. more
>>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice to be
>>>>>>>>>>>> found out
>>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is similar, and
>>>>>>>>>>>> JavaScript
>>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>>> package.json file
>>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>>> >> >>>
>>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>> >> >>> > There are several examples already within the Beam
>>>>>>>>>>>> repo found in:
>>>>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just
>>>>>>>>>>>> for novelty/freshness but agreed with the suggestion that having an example
>>>>>>>>>>>> in each quickstart would be ideal.
>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example
>>>>>>>>>>>> in each repo? I know that makes the repos less minimal, but we could
>>>>>>>>>>>> rewrite the quickstarts around these repos instead of the current Wordcount
>>>>>>>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>>>>>>>> quickstarts...
>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>>> maintainable.
>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>>> well.
>>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull
>>>>>>>>>>>> out a specific version of the examples that coincides with a specific SDK
>>>>>>>>>>>> version then we could drop the archetypes.
>>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect
>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we
>>>>>>>>>>>> discovered a breakage after the release. Agree we should verify RCs
>>>>>>>>>>>> (document as part of the release process), or even better, add automation
>>>>>>>>>>>> to verify the repo against snapshots. The automation could be nice to have
>>>>>>>>>>>> anyway since it provides an example for users to follow if they want to
>>>>>>>>>>>> test against snapshots and report issues to us sooner.
>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>>> archetype?
>>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo
>>>>>>>>>>>> since we can see what a truly minimal project looks like. Having it in the
>>>>>>>>>>>> main repo would inherit build file configurations and other settings that
>>>>>>>>>>>> would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>>> instance of the template.
>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect
>>>>>>>>>>>> them to break commonly, but I think it would be good to make sure tests
>>>>>>>>>>>> aren't failing when a release is published.
>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>>>>>>>> having all the build systems we want to support for them. As long as we
>>>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>>>> repos to maintain.
>>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible
>>>>>>>>>>>> than the archetypes, but the archetypes have a few conveniences since they
>>>>>>>>>>>> are integrated with the apache/beam repo. For example, updates/testing are done
>>>>>>>>>>>> at the same time a corresponding change to the main repo is done (like
>>>>>>>>>>>> library version updates), and they are released when the SDK is released.
>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>>>>> per build system?
>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen
>>>>>>>>>>>> (e.g. release manager owns it)?
>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need
>>>>>>>>>>>> to pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>>>>> What do you think?
>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the
>>>>>>>>>>>> repo?
>>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>>>>>>> altay@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress
>>>>>>>>>>>> on this? Do you need help?
>>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette
>>>>>>>>>>>> <bh...@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we
>>>>>>>>>>>> put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam
>>>>>>>>>>>> Java project, I've been working on a GitHub template containing a minimal
>>>>>>>>>>>> Beam Java pipeline for people to start with.
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>>>>>> (Direct runner)
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>>>>>>>> (around 1-2 minutes to run)
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build,
>>>>>>>>>>>> run, test, and add other runners
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo
>>>>>>>>>>>> from a template.
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is
>>>>>>>>>>>> happy with it 🙂
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on
>>>>>>>>>>>> how to create a new Beam Java pipeline
>>>>>>>>>>>>
>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
Can someone point me to an example of what the NOTICE file should look like?
I'm not familiar with it and would like to get it right.
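
For reference, the ASF's standard NOTICE template is short. A minimal sketch,
with the product name and copyright year left as placeholders, since the
actual values for the starter repos are not settled in this thread:

```text
Apache [PRODUCT_NAME]
Copyright [yyyy] The Apache Software Foundation

This product includes software developed at
The Apache Software Foundation (https://www.apache.org/).
```

Anything beyond these lines (for example, third-party attributions) would
depend on what the repo actually bundles, which is an open question here.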

On Thu, Feb 17, 2022 at 10:53 AM David Cavazos <dc...@google.com> wrote:

> +1
> For the starter projects I like them being "clone and go", but I'd like to
> keep them as minimal as possible. We could have another repo like
> `beam-working-examples` for more complete examples where each subdirectory
> is a self-contained example with all its build files and everything.
>
> On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> I like the goal: for things where the build has extra setup, have an
>> example that is fully functional on its own. There is of course the problem
>> of "where does it end?" since this is infinity things.
>>
>> The other piece is that a user wanting to know some of these bits may be
>> past the "clone and go" stage of their project. They probably already have
>> a project and now they need a working example to read and learn from. So it
>> could be just one additional repo `beam-working-examples` where each
>> subdirectory is an independent working setup. I do like having it as a
>> separate repo to avoid the temptation to leverage anything from the Beam
>> build. And each subdirectory should be entirely independent and we also
>> have to avoid the temptation to share configuration across them, or it
>> would defeat the purpose.
>>
>> Kenn
>>
>> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <ra...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> This is great!
>>>
>>> What do folks think about also having a less minimal set of starters?
>>> For Java I am thinking about protobuf / autovalue. For Python maybe an
>>> opinionated setup with tox etc... Again this would just contain 'hello'
>>> world samples to get folks going.
>>>
>>> Regards
>>> Reza
>>>
>>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>>
>>>> SGTM.
>>>>
>>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org> wrote:
>>>>
>>>>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601
>>>>> I think it will be simplest to license it under ASL2 and include a NOTICE
>>>>> file. The user will be free to "clone and go".
>>>>>
>>>>> I would bring these points back to the dev list:
>>>>>
>>>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>>>> surprise"
>>>>>  - Dual-licensing is possible (but I think not worthwhile due to its
>>>>> impact on contributor license agreements)
>>>>>  - ASL2 says "You must cause any modified files to carry prominent
>>>>> notices stating that You changed the files", which won't apply to the user's
>>>>> code, and I would guess they simply won't bother with it for files in the
>>>>> template. Or maybe there is a clever way to phrase the header so it is
>>>>> already good to go.
>>>>>  - ASL2 says if the work includes a NOTICE file, you have to include
>>>>> the attributions from it. The NOTICE file is required by ASF policy. We can
>>>>> easily set it up to be a noop for the user.
>>>>>
>>>>> So my overall take is that we should go ahead with ASL2 and a simple
>>>>> NOTICE file. Check the Jira for details.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> And I've created the repos just now.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Legal question asked at
>>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>>> dannymccormick@google.com> wrote:
>>>>>>>
>>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or with
>>>>>>>> the Go template). I will say though, the Actions config should be pretty
>>>>>>>> darn simple for these examples -
>>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>>> just want a job with:
>>>>>>>>
>>>>>>>>    - checkout
>>>>>>>>    - setup-<language>
>>>>>>>>    - inlined script to run tests
>>>>>>>>
>>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Danny
>>>>>>>>
>>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>>> kerrydc@google.com> wrote:
>>>>>>>>
>>>>>>>>> Danny has extensive experience with GitHub actions, and may be
>>>>>>>>> able to help out.
>>>>>>>>> Kerry
>>>>>>>>>
>>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>>
>>>>>>>>>> I can take on the task of asking about MIT license and requesting
>>>>>>>>>> the repos be created. Not sure if it needs my level of privileges but I'm
>>>>>>>>>> happy to do it anyhow.
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > MIT is much more permissive, but I also don't have any
>>>>>>>>>>> problems changing it to Apache license. In any case, how about we create
>>>>>>>>>>> the following repos?
>>>>>>>>>>>
>>>>>>>>>>> For these starter projects, we don't want to encumber any users of
>>>>>>>>>>> these templates with any particular licensing requirements (right?)
>>>>>>>>>>> and we don't even care about attribution. We want these to be pretty
>>>>>>>>>>> much as close to public domain as possible. That's not what the Apache
>>>>>>>>>>> licence does. (If it's even relevant, a good argument could likely be
>>>>>>>>>>> made for de minimis or fair use, but I think it's best to be explicit
>>>>>>>>>>> about this.) Perhaps this'd be a good question for Apache legal?
>>>>>>>>>>>
>>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>>> >
>>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>>> >
>>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter projects
>>>>>>>>>>> for every language. Once we have Java, Python and Go, it might be a good
>>>>>>>>>>> idea to change the quickstarts to use these instead of the word count.
>>>>>>>>>>> There is already a dedicated word count walkthrough so I think that is
>>>>>>>>>>> already covered.
>>>>>>>>>>> >
>>>>>>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>>>>>>> >
>>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part
>>>>>>>>>>> of it.
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > But also the answer to "I simply don't know what one would
>>>>>>>>>>> put in a Python repo, other than a bare setup.py that lists a
>>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>>> repo, namely:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>>> >> >  - README.md
>>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>>> >>
>>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>>> >>
>>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>>>>>>> Apache software it needs to be ASL2)
>>>>>>>>>>> >>
>>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one
>>>>>>>>>>> doesn't want to
>>>>>>>>>>> >> bind the users of such a template as being a derivative work
>>>>>>>>>>> of a
>>>>>>>>>>> >> too-restrictive licence. The licence of the template itself
>>>>>>>>>>> should
>>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>>> >>
>>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> I think for consistency it makes sense for users to be told
>>>>>>>>>>> to check out this git repo for the language of their choice and run it. Some
>>>>>>>>>>> repos will have more or less than others when it comes to the setup necessary.
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project there
>>>>>>>>>>> is quite
>>>>>>>>>>> >> >>> complicated. I simply don't know what one would put in a
>>>>>>>>>>> Python repo
>>>>>>>>>>> >> >>> other than a bare setup.py that lists a dependency
>>>>>>>>>>> on
>>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file
>>>>>>>>>>> layout, etc. more
>>>>>>>>>>> >> >>> than that (though there's plenty of generic advice to be
>>>>>>>>>>> found out
>>>>>>>>>>> >> >>> there on the topic). I have a hunch Go is similar, and
>>>>>>>>>>> JavaScript
>>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>>> package.json file
>>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>> >> >>> >
>>>>>>>>>>> >> >>> > There are several examples already within the Beam repo
>>>>>>>>>>> found in:
>>>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>>>> >> >>> >
>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>>> >> >>> >
>>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>>> >> >>> >
>>>>>>>>>>> >> >>> >
>>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>>> >> >>> >>
>>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>>>>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>>>>>>>> each quickstart would be ideal.
>>>>>>>>>>> >> >>> >>
>>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example
>>>>>>>>>>> in each repo? I know that makes the repos less minimal, but we could
>>>>>>>>>>> rewrite the quickstarts around these repos instead of the current Wordcount
>>>>>>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>>>>>>> quickstarts...
>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>>> maintainable.
>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would
>>>>>>>>>>> prefer having repos for all languages. It makes sense for consistency as
>>>>>>>>>>> well.
>>>>>>>>>>> >> >>> >>>>
>>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out
>>>>>>>>>>> a specific version of the examples that coincides with a specific SDK
>>>>>>>>>>> version then we could drop the archetypes.
>>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them
>>>>>>>>>>> to break commonly, but I think it would be good to make sure tests aren't
>>>>>>>>>>> failing when a release is published.
>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered
>>>>>>>>>>> a breakage after the release. Agree we should verify RCs (document as part
>>>>>>>>>>> of the release process), or even better, add automation to verify the repo
>>>>>>>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>>>>>>>> provides an example for users to follow if they want to test against
>>>>>>>>>>> snapshots and report issues to us sooner.
>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>>> archetype?
>>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo
>>>>>>>>>>> since we can see what a truly minimal project looks like. Having it in the
>>>>>>>>>>> main repo would inherit build file configurations and other settings that
>>>>>>>>>>> would be different from a clean project, so it could be non-trivial to
>>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>>> instance of the template.
>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them
>>>>>>>>>>> to break commonly, but I think it would be good to make sure tests aren't
>>>>>>>>>>> failing when a release is published.
>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>>>>>>> having all the build systems we want to support for them. As long as we
>>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>>> repos to maintain.
>>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible
>>>>>>>>>>> than the archetypes, but the archetypes have a few conveniences since they
>>>>>>>>>>> are integrated with the apache/beam repo. For example, updates/testing are done
>>>>>>>>>>> at the same time a corresponding change to the main repo is done (like
>>>>>>>>>>> library version updates), and they are released when the SDK is released.
>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>>>> per build system?
>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen
>>>>>>>>>>> (e.g. release manager owns it)?
>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to
>>>>>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>>>> What do you think?
>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the
>>>>>>>>>>> repo?
>>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>>>>>> altay@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress
>>>>>>>>>>> on this? Do you need help?
>>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam
>>>>>>>>>>> already, built with Maven Archetype [1]. It's what powers the Java
>>>>>>>>>>> quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template
>>>>>>>>>>> in the quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we
>>>>>>>>>>> put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David
>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam
>>>>>>>>>>> Java project, I've been working on a GitHub template containing a minimal
>>>>>>>>>>> Beam Java pipeline for people to start with.
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>>>>> (Direct runner)
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>>>>>>> (around 1-2 minutes to run)
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build,
>>>>>>>>>>> run, test, and add other runners
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from
>>>>>>>>>>> a template.
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is
>>>>>>>>>>> happy with it 🙂
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on
>>>>>>>>>>> how to create a new Beam Java pipeline
>>>>>>>>>>>
>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
+1
For the starter projects I like them being "clone and go", but I'd like to
keep them as minimal as possible. We could have another repo like
`beam-working-examples` for more complete examples where each subdirectory
is a self-contained example with all its build files and everything.
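
To illustrate how little a minimal "clone and go" starter needs on the CI
side, the three-step job described further down this thread (checkout,
setup-<language>, inlined test script) might look roughly like this for Java.
This is only a sketch: the action versions, the JDK version, and the use of a
committed Gradle wrapper are assumptions, not necessarily what the actual
template uses.

```yaml
# Hypothetical .github/workflows/test.yaml for a Gradle-based Java starter.
name: Test
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # 1. checkout
      - uses: actions/checkout@v3
      # 2. setup-<language> (JDK version is an assumption)
      - uses: actions/setup-java@v3
        with:
          distribution: temurin
          java-version: '11'
      # 3. inlined script to run tests (assumes the Gradle wrapper is committed)
      - run: ./gradlew test
```

A Maven or sbt variant would only need a different final `run` step.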

On Wed, Feb 16, 2022 at 5:59 AM Kenneth Knowles <ke...@apache.org> wrote:

> I like the goal: for things where the build has extra setup, have an
> example that is fully functional on its own. There is of course the problem
> of "where does it end?" since this is infinity things.
>
> The other piece is that a user wanting to know some of these bits may be
> past the "clone and go" stage of their project. They probably already have
> a project and now they need a working example to read and learn from. So it
> could be just one additional repo `beam-working-examples` where each
> subdirectory is an independent working setup. I do like having it as a
> separate repo to avoid the temptation to leverage anything from the Beam
> build. And each subdirectory should be entirely independent and we also
> have to avoid the temptation to share configuration across them, or it
> would defeat the purpose.
>
> Kenn
>
> On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <ra...@gmail.com>
> wrote:
>
>> Hi,
>>
>> This is great!
>>
>> What do folks think about also having a less minimal set of starters? For
>> Java I am thinking about protobuf / autovalue. For Python maybe an
>> opinionated setup with tox etc... Again this would just contain 'hello'
>> world samples to get folks going.
>>
>> Regards
>> Reza
>>
>> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>>
>>> SGTM.
>>>
>>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601
>>>> I think it will be simplest to license it under ASL2 and include a NOTICE
>>>> file. The user will be free to "clone and go".
>>>>
>>>> I would bring these points back to the dev list:
>>>>
>>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>>> surprise"
>>>>  - Dual-licensing is possible (but I think not worthwhile due to its
>>>> impact on contributor license agreements)
>>>>  - ASL2 says "You must cause any modified files to carry prominent
>>>> notices stating that You changed the files", which won't apply to the user's
>>>> code, and I would guess they simply won't bother with it for files in the
>>>> template. Or maybe there is a clever way to phrase the header so it is
>>>> already good to go.
>>>>  - ASL2 says if the work includes a NOTICE file, you have to include
>>>> the attributions from it. The NOTICE file is required by ASF policy. We can
>>>> easily set it up to be a noop for the user.
>>>>
>>>> So my overall take is that we should go ahead with ASL2 and a simple
>>>> NOTICE file. Check the Jira for details.
>>>>
>>>> Kenn
>>>>
>>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>>
>>>>> And I've created the repos just now.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Legal question asked at
>>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>>> dannymccormick@google.com> wrote:
>>>>>>
>>>>>>> Sure - I'm happy to help out with the Actions setup (and/or with the
>>>>>>> Go template). I will say though, the Actions config should be pretty darn
>>>>>>> simple for these examples -
>>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>>> just want a job with:
>>>>>>>
>>>>>>>    - checkout
>>>>>>>    - setup-<language>
>>>>>>>    - inlined script to run tests
>>>>>>>
>>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Danny
>>>>>>>
>>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <
>>>>>>> kerrydc@google.com> wrote:
>>>>>>>
>>>>>>>> Danny has extensive experience with GitHub actions, and may be able
>>>>>>>> to help out.
>>>>>>>> Kerry
>>>>>>>>
>>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>>
>>>>>>>>> I can take on the task of asking about MIT license and requesting
>>>>>>>>> the repos be created. Not sure if it needs my level of privileges but I'm
>>>>>>>>> happy to do it anyhow.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>> >
>>>>>>>>>> > MIT is much more permissive, but I also don't have any problems
>>>>>>>>>> changing it to Apache license. In any case, how about we create the
>>>>>>>>>> following repos?
>>>>>>>>>>
>>>>>>>>>> For these starter projects, we don't want to encumber any users of
>>>>>>>>>> these templates with any particular licensing requirements (right?)
>>>>>>>>>> and we don't even care about attribution. We want these to be
>>>>>>>>>> pretty much as close to public domain as possible. That's not what
>>>>>>>>>> the Apache licence does. (If it's even relevant, a good argument
>>>>>>>>>> could likely be made for de minimis or fair use, but I think it's
>>>>>>>>>> best to be explicit about this. Perhaps this'd be a good question
>>>>>>>>>> for Apache legal?)
>>>>>>>>>>
>>>>>>>>>> > apache/beam-starter-java
>>>>>>>>>> > apache/beam-starter-python
>>>>>>>>>> > apache/beam-starter-go
>>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>>> >
>>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>>> >
>>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter projects
>>>>>>>>>> for every language. Once we have Java, Python and Go, it might be a good
>>>>>>>>>> idea to change the quickstarts to use these instead of the word count.
>>>>>>>>>> There is already a dedicated word count walkthrough so I think that is
>>>>>>>>>> already covered.
>>>>>>>>>> >
>>>>>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>>>>>> >
>>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>>> >> >
>>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part
>>>>>>>>>> of it.
>>>>>>>>>> >> >
>>>>>>>>>> >> > But also the answer to "I simply don't know what one would
>>>>>>>>>> put in a Python repo than, other than a bare setup.py that lists a
>>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>>> repo, namely:
>>>>>>>>>> >> >
>>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>>> >> >  - README.md
>>>>>>>>>> >> >  - example that already runs
>>>>>>>>>> >>
>>>>>>>>>> >> OK, fair enough.
>>>>>>>>>> >>
>>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>>>>>> Apache software it needs to be ASL2)
>>>>>>>>>> >>
>>>>>>>>>> >> On the topic of licence, it's a bit tricky because one doesn't
>>>>>>>>>> want to
>>>>>>>>>> >> bind the users of such a template as being a derivative work
>>>>>>>>>> of a
>>>>>>>>>> >> too-restrictive licence. The licence of the template itself
>>>>>>>>>> should
>>>>>>>>>> >> generally be very permissive.
>>>>>>>>>> >>
>>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> I think for consistency it makes sense to users to be told
>>>>>>>>>> to checkout this git repo for the language of your choice and run. Some
>>>>>>>>>> repos will have more/less than others when it comes to setup necessary.
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project there
>>>>>>>>>> is quite
>>>>>>>>>> >> >>> complicated. I simply don't know what one would put in a
>>>>>>>>>> Python repo
>>>>>>>>>> >> >>> than, other than a bare setup.py that lists a dependency on
>>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file layout,
>>>>>>>>>> etc. more
>>>>>>>>>> >> >>> than that (though there's plenty of generic advice to be
>>>>>>>>>> found out
>>>>>>>>>> >> >>> there on the topic). I have a hunch go is similar, and
>>>>>>>>>> javascript
>>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>>> package.json file
>>>>>>>>>> >> >>> gets updated).
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <
>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> > There are several examples already within the Beam repo
>>>>>>>>>> found in:
>>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>>> >> >>> >
>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>>> >> >>> >
>>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>>>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>>>>>>> each quickstart would be ideal.
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example
>>>>>>>>>> in each repo? I know that makes the repos less minimal, but we could
>>>>>>>>>> rewrite the quickstarts around these repos instead of the current Wordcount
>>>>>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>>>>>> quickstarts...
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>> >> >>> >>>>
>>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less
>>>>>>>>>> maintenance is preferable, and the github repos are more flexible and
>>>>>>>>>> maintainable.
>>>>>>>>>> >> >>> >>>>
>>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>>> >> >>> >>>>
>>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>>> >> >>> >>>>
>>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer
>>>>>>>>>> having repos for all languages. It makes sense for consistency as well.
>>>>>>>>>> >> >>> >>>>
>>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out
>>>>>>>>>> a specific version of the examples that coincides with a specific SDK
>>>>>>>>>> version then we could drop the archetypes.
>>>>>>>>>> >> >>> >>>>>
>>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them
>>>>>>>>>> to break commonly, but I think it would be good to make sure tests aren't
>>>>>>>>>> failing when a release is published.
>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered
>>>>>>>>>> a breakage after the release. Agree we should verify RCs (document as part
>>>>>>>>>> of the release process), or even better, add automation to verify the repo
>>>>>>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>>>>>>> provides an example for users to follow if they want to test against
>>>>>>>>>> snapshots and report issues to us sooner.
>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>>> archetype?
>>>>>>>>>> >> >>> >>>>>>
>>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo
>>>>>>>>>> since we can see what a true minimal project looks like. Having it in the
>>>>>>>>>> main repo would inherit build file configurations and other settings that
>>>>>>>>>> would be different from a clean project, so it could be non-trivial to
>>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>>> instance of the template.
>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them
>>>>>>>>>> to break commonly, but I think it would be good to make sure tests aren't
>>>>>>>>>> failing when a release is published.
>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>>>>>> having all the build systems we want to support for them. As long as we
>>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>>> repos to maintain.
>>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible than
>>>>>>>>>> the archetypes but the archetypes have a few conveniences since they are
>>>>>>>>>> integrated with apache/beam repo. For example, updates/testing are done at
>>>>>>>>>> the same time a corresponding change to the main repo is done (like library
>>>>>>>>>> version updates), they are released when the SDK is released.
>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>>> per build system?
>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen
>>>>>>>>>> (e.g. release manager owns it)?
>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to
>>>>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>>> What do you think?
>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the
>>>>>>>>>> repo?
>>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>>>>> altay@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on
>>>>>>>>>> this? Do you need help?
>>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already,
>>>>>>>>>> built with Maven Archetype [1]. It's what powers the Java quickstart [2].
>>>>>>>>>> Could we de-dupe these (e.g. reference the GitHub template in the
>>>>>>>>>> quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we
>>>>>>>>>> put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos
>>>>>>>>>> <dc...@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>>>>>>>>> project, I've been working on a GitHub template containing a minimal Beam
>>>>>>>>>> Java pipeline for people to start with.
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>>>> (Direct runner)
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>>>>>> (around 1-2 minutes to run)
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build,
>>>>>>>>>> run, test, and add other runners
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from
>>>>>>>>>> a template.
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is
>>>>>>>>>> happy with it 🙂
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how
>>>>>>>>>> to create a new Beam Java pipeline
>>>>>>>>>>
>>>>>>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
I like the goal: for things where the build has extra setup, have an
example that is fully functional on its own. There is of course the problem
of "where does it end?" since there are infinitely many such setups.

The other piece is that a user wanting to know some of these bits may be
past the "clone and go" stage of their project. They probably already have
a project and now they need a working example to read and learn from. So it
could be just one additional repo `beam-working-examples` where each
subdirectory is an independent working setup. I do like having it as a
separate repo to avoid the temptation to leverage anything from the Beam
build. And each subdirectory should be entirely independent and we also
have to avoid the temptation to share configuration across them, or it
would defeat the purpose.

Kenn

On Tue, Feb 15, 2022 at 9:28 PM Reza Ardeshir Rokni <ra...@gmail.com>
wrote:

> Hi,
>
> This is great!
>
> What do folks think about also having a less minimal set of starters? For
> Java I am thinking about protobuf / autovalue. For Python maybe an
> opinionated setup with tox etc. Again, this would just contain 'hello
> world' samples to get folks going.
>
> Regards
> Reza
>
> On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:
>
>> SGTM.
>>
>> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601
>>> I think it will be simplest to license it under ASL2 and include a NOTICE
>>> file. The user will be free to "clone and go".
>>>
>>> I would bring these points back to the dev list:
>>>
>>>  - ASL2 is what people expect from an ASF project, so it is "least
>>> surprise"
>>>  - Dual-licensing is possible (but I think not worthwhile due to its
>>> impact on contributor license agreements)
>>>  - ASL2 says "You must cause any modified files to carry prominent
>>> notices stating that You changed the files" which won't apply to the user's
>>> code and I would guess they simply won't bother with it for files in the
>>> template. Or maybe there is a clever way to phrase the header so it is
>>> already good to go.
>>>  - ASL2 says if the work includes a NOTICE file, you have to include
>>> the attributions from it. The NOTICE file is required by ASF policy. We can
>>> easily set it up to be a noop for the user.
>>>
>>> So my overall take is that we should go ahead with ASL2 and a simple
>>> NOTICE file. Check the Jira for details.
>>>
>>> Kenn
>>>
>>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> And I've created the repos just now.
>>>>
>>>> Kenn
>>>>
>>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org>
>>>> wrote:
>>>>
>>>>> Legal question asked at
>>>>> https://issues.apache.org/jira/browse/LEGAL-601
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>>> dannymccormick@google.com> wrote:
>>>>>
>>>>>> Sure - I'm happy to help out with the Actions setup (and/or with the
>>>>>> Go template). I will say though, the Actions config should be pretty darn
>>>>>> simple for these examples -
>>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>>> seems right, for each language configuration we're targeting we basically
>>>>>> just want a job with:
>>>>>>
>>>>>>    - checkout
>>>>>>    - setup-<language>
>>>>>>    - inlined script to run tests
>>>>>>
>>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>>
>>>>>> Thanks,
>>>>>> Danny
>>>>>>
>>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <ke...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Danny has extensive experience with GitHub actions, and may be able
>>>>>>> to help out.
>>>>>>> Kerry
>>>>>>>
>>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>>
>>>>>>>> I can take on the task of asking about MIT license and requesting
>>>>>>>> the repos be created. Not sure if it needs my level of privileges but I'm
>>>>>>>> happy to do it anyhow.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>
>>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>> >
>>>>>>>>> > MIT is much more permissive, but I also don't have any problems
>>>>>>>>> changing it to Apache license. In any case, how about we create the
>>>>>>>>> following repos?
>>>>>>>>>
>>>>>>>>> For these starter projects, we don't want to encumber any users of
>>>>>>>>> these templates with any particular licensing requirements (right?)
>>>>>>>>> and we don't even care about attribution. We want these to be pretty
>>>>>>>>> much as close to public domain as possible. That's not what the
>>>>>>>>> Apache licence does. (If it's even relevant, a good argument could
>>>>>>>>> likely be made for de minimis or fair use, but I think it's best to
>>>>>>>>> be explicit about this. Perhaps this'd be a good question for Apache
>>>>>>>>> legal?)
>>>>>>>>>
>>>>>>>>> > apache/beam-starter-java
>>>>>>>>> > apache/beam-starter-python
>>>>>>>>> > apache/beam-starter-go
>>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>>> > apache/beam-starter-scala
>>>>>>>>> >
>>>>>>>>> > We'll start by populating the Java one which is the most
>>>>>>>>> pressing one and the one that is ready, but the rest should be simpler.
>>>>>>>>> >
>>>>>>>>> > +David Huntsperger, tldr; these are minimal starter projects for
>>>>>>>>> every language. Once we have Java, Python and Go, it might be a good idea
>>>>>>>>> to change the quickstarts to use these instead of the word count. There is
>>>>>>>>> already a dedicated word count walkthrough so I think that is already
>>>>>>>>> covered.
>>>>>>>>> >
>>>>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>>>>> >
>>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>> >>
>>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <
>>>>>>>>> kenn@apache.org> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part
>>>>>>>>> of it.
>>>>>>>>> >> >
>>>>>>>>> >> > But also the answer to "I simply don't know what one would
>>>>>>>>> put in a Python repo than, other than a bare setup.py that lists a
>>>>>>>>> dependency on apache_beam" is answered by David's initial email and his
>>>>>>>>> repo, namely:
>>>>>>>>> >> >
>>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>>> >> >  - README.md
>>>>>>>>> >> >  - example that already runs
>>>>>>>>> >>
>>>>>>>>> >> OK, fair enough.
>>>>>>>>> >>
>>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>>>>> Apache software it needs to be ASL2)
>>>>>>>>> >>
>>>>>>>>> >> On the topic of licence, it's a bit tricky because one doesn't
>>>>>>>>> want to
>>>>>>>>> >> bind the users of such a template as being a derivative work of
>>>>>>>>> a
>>>>>>>>> >> too-restrictive licence. The licence of the template itself
>>>>>>>>> should
>>>>>>>>> >> generally be very permissive.
>>>>>>>>> >>
>>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >> I think for consistency it makes sense to users to be told
>>>>>>>>> to checkout this git repo for the language of your choice and run. Some
>>>>>>>>> repos will have more/less than others when it comes to setup necessary.
>>>>>>>>> >> >>
>>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project there
>>>>>>>>> is quite
>>>>>>>>> >> >>> complicated. I simply don't know what one would put in a
>>>>>>>>> Python repo
>>>>>>>>> >> >>> than, other than a bare setup.py that lists a dependency on
>>>>>>>>> >> >>> apache_beam. We don't have recommendations on file layout,
>>>>>>>>> etc. more
>>>>>>>>> >> >>> than that (though there's plenty of generic advice to be
>>>>>>>>> found out
>>>>>>>>> >> >>> there on the topic). I have a hunch go is similar, and
>>>>>>>>> javascript
>>>>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>>>>> package.json file
>>>>>>>>> >> >>> gets updated).
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>> >> >>> >
>>>>>>>>> >> >>> > There are several examples already within the Beam repo
>>>>>>>>> found in:
>>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>>> >> >>> >
>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>>> >> >>> >
>>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>>> >> >>> >
>>>>>>>>> >> >>> >
>>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>>> sachinag@google.com> wrote:
>>>>>>>>> >> >>> >>
>>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>>>>>> each quickstart would be ideal.
>>>>>>>>> >> >>> >>
>>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example in
>>>>>>>>> each repo? I know that makes the repos less minimal, but we could rewrite
>>>>>>>>> the quickstarts around these repos instead of the current Wordcount
>>>>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>>>>> quickstarts...
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>> >> >>> >>>>
>>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance
>>>>>>>>> is preferable, and the github repos are more flexible and maintainable.
>>>>>>>>> >> >>> >>>>
>>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>>> >> >>> >>>>
>>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>>> >> >>> >>>>
>>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer
>>>>>>>>> having repos for all languages. It makes sense for consistency as well.
>>>>>>>>> >> >>> >>>>
>>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>> >> >>> >>>>>
>>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out a
>>>>>>>>> specific version of the examples that coincides with a specific SDK version
>>>>>>>>> then we could drop the archetypes.
>>>>>>>>> >> >>> >>>>>
>>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>
>>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them
>>>>>>>>> to break commonly, but I think it would be good to make sure tests aren't
>>>>>>>>> failing when a release is published.
>>>>>>>>> >> >>> >>>>>>
>>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>>>>>>>>> breakage after the release. Agree we should verify RCs (document as part of
>>>>>>>>> the release process), or even better, add automation to verify the repo
>>>>>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>>>>>> provides an example for users to follow if they want to test against
>>>>>>>>> snapshots and report issues to us sooner.
>>>>>>>>> >> >>> >>>>>>
>>>>>>>>> >> >>> >>>>>>
>>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>>> archetype?
>>>>>>>>> >> >>> >>>>>>
>>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>>> >> >>> >>>>>>>
>>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo
>>>>>>>>> since we can see what a true minimal project looks like. Having it in the
>>>>>>>>> main repo would inherit build file configurations and other settings that
>>>>>>>>> would be different from a clean project, so it could be non-trivial to
>>>>>>>>> adapt. Also as its own repo, it's easier to clone and modify, or create an
>>>>>>>>> instance of the template.
>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them
>>>>>>>>> to break commonly, but I think it would be good to make sure tests aren't
>>>>>>>>> failing when a release is published.
>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>>>>> having all the build systems we want to support for them. As long as we
>>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>>> repos to maintain.
>>>>>>>>> >> >>> >>>>>>>>
>>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>>> lcwik@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible than
>>>>>>>>> the archetypes but the archetypes have a few conveniences since they are
>>>>>>>>> integrated with apache/beam repo. For example, updates/testing are done at
>>>>>>>>> the same time a corresponding change to the main repo is done (like library
>>>>>>>>> version updates), they are released when the SDK is released.
>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a
>>>>>>>>> single starter repo containing all the starters or one per language or one
>>>>>>>>> per build system?
>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>>>>>>>>> release manager owns it)?
>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to
>>>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>> What do you think?
>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the
>>>>>>>>> repo?
>>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>>>> altay@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on
>>>>>>>>> this? Do you need help?
>>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>>>>>> bhulette@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already,
>>>>>>>>> built with Maven Archetype [1]. It's what powers the Java quickstart [2].
>>>>>>>>> Could we de-dupe these (e.g. reference the GitHub template in the
>>>>>>>>> quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we
>>>>>>>>> put this somewhere like apache/beam-java-template? I think apache
>>>>>>>>> repositories like beam-* are allowed.
>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David
>>>>>>>>> Cavazos <dc...@google.com> wrote:
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>>>>>>>> project, I've been working on a GitHub template containing a minimal Beam
>>>>>>>>> Java pipeline for people to start with.
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>>> (Direct runner)
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>>>>> (around 1-2 minutes to run)
>>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build,
>>>>>>>>> run, test, and add other runners
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>>>>>>>>> template.
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is
>>>>>>>>> happy with it 🙂
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how
>>>>>>>>> to create a new Beam Java pipeline
>>>>>>>>>
>>>>>>>>

Re: Beam Java starter project template

Posted by Reza Ardeshir Rokni <ra...@gmail.com>.
Hi,

This is great!

What do folks think about also having a less minimal set of starters? For
Java I am thinking about protobuf / AutoValue. For Python, maybe an
opinionated setup with tox, etc. Again, these would just contain "hello
world" samples to get folks going.

Regards
Reza

On Wed, 9 Feb 2022 at 13:56, Robert Burke <re...@google.com> wrote:

> SGTM.
>
> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org> wrote:
>
>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601 I
>> think it will be simplest to license it under ASL2 and include a NOTICE
>> file. The user will be free to "clone and go".
>>
>> I would bring these points back to the dev list:
>>
>>  - ASL2 is what people expect from an ASF project, so it is "least
>> surprise"
>>  - Dual-licensing is possible (but I think not worthwhile due to its
>> impact on contributor license agreements)
>>  - ASL2 says "You must cause any modified files to carry prominent
>> notices stating that You changed the files", which won't apply to the
>> user's own code, and I would guess users simply won't bother with it for
>> files in the template. Or maybe there is a clever way to phrase the header
>> so it is already good to go.
>>  - ASL2 says that if the work includes a NOTICE file, you have to include
>> the attributions from it. The NOTICE file is required by ASF policy. We can
>> easily set it up to be a no-op for the user.
>>
>> So my overall take is that we should go ahead with ASL2 and a simple
>> NOTICE file. Check the Jira for details.
>>
>> Kenn
>>
>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> And I've created the repos just now.
>>>
>>> Kenn
>>>
>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601
>>>>
>>>> Kenn
>>>>
>>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>>> dannymccormick@google.com> wrote:
>>>>
>>>>> Sure - I'm happy to help out with the Actions setup (and/or with the
>>>>> Go template). I will say though, the Actions config should be pretty darn
>>>>> simple for these examples -
>>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>>> seems right, for each language configuration we're targeting we basically
>>>>> just want a job with:
>>>>>
>>>>>    - checkout
>>>>>    - setup-<language>
>>>>>    - inlined script to run tests
>>>>>
>>>>> Always happy to help with or consult on any actions issues 🙂
>>>>>
>>>>> Thanks,
>>>>> Danny
>>>>>
>>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <ke...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Danny has extensive experience with GitHub actions, and may be able
>>>>>> to help out.
>>>>>> Kerry
>>>>>>
>>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>>
>>>>>>> I can take on the task of asking about MIT license and requesting
>>>>>>> the repos be created. Not sure if it needs my level of privileges but I'm
>>>>>>> happy to do it anyhow.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> > MIT is much more permissive, but I also don't have any problems
>>>>>>>> changing it to Apache license. In any case, how about we create the
>>>>>>>> following repos?
>>>>>>>>
>>>>>>>> For these starter projects, we don't want to encumber any users of
>>>>>>>> these templates with any particular licensing requirements (right?)
>>>>>>>> and we don't even care about attribution. We want these to be pretty
>>>>>>>> much as close to public domain as possible. That's not what the
>>>>>>>> Apache licence does. (If it's even relevant, a good argument could
>>>>>>>> likely be made for de minimis or fair use, but I think it's best to
>>>>>>>> be explicit about this.) Perhaps this'd be a good question for
>>>>>>>> Apache legal?
>>>>>>>>
>>>>>>>> > apache/beam-starter-java
>>>>>>>> > apache/beam-starter-python
>>>>>>>> > apache/beam-starter-go
>>>>>>>> > apache/beam-starter-kotlin
>>>>>>>> > apache/beam-starter-scala
>>>>>>>> >
>>>>>>>> > We'll start by populating the Java one which is the most pressing
>>>>>>>> one and the one that is ready, but the rest should be simpler.
>>>>>>>> >
>>>>>>>> > +David Huntsperger, tldr; these are minimal starter projects for
>>>>>>>> every language. Once we have Java, Python and Go, it might be a good idea
>>>>>>>> to change the quickstarts to use these instead of the word count. There is
>>>>>>>> already a dedicated word count walkthrough so I think that is already
>>>>>>>> covered.
>>>>>>>> >
>>>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>>>> >
>>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>> >>
>>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>>>>>>>> wrote:
>>>>>>>> >> >
>>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part of
>>>>>>>> it.
>>>>>>>> >> >
>>>>>>>> >> > But also the question "I simply don't know what one would put
>>>>>>>> in a Python repo, other than a bare setup.py that lists a dependency
>>>>>>>> on apache_beam" is answered by David's initial email and his repo, namely:
>>>>>>>> >> >
>>>>>>>> >> >  - GitHub Actions configuration
>>>>>>>> >> >  - README.md
>>>>>>>> >> >  - example that already runs
>>>>>>>> >>
>>>>>>>> >> OK, fair enough.
>>>>>>>> >>
>>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>>>> Apache software it needs to be ASL2)
>>>>>>>> >>
>>>>>>>> >> On the topic of licence, it's a bit tricky because one doesn't want
>>>>>>>> to bind the users of such a template as being a derivative work of a
>>>>>>>> too-restrictive licence. The licence of the template itself should
>>>>>>>> generally be very permissive.
>>>>>>>> >>
>>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>>>>> wrote:
>>>>>>>> >> >>
>>>>>>>> >> >> I think for consistency it makes sense for users to be told to
>>>>>>>> check out the git repo for the language of their choice and run it. Some
>>>>>>>> repos will have more or less than others when it comes to the setup necessary.
>>>>>>>> >> >>
>>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>> >> >>>
>>>>>>>> >> >>> +1 for doing this for Java, as setting up a project there is quite
>>>>>>>> complicated. I simply don't know what one would put in a Python repo,
>>>>>>>> other than a bare setup.py that lists a dependency on apache_beam. We
>>>>>>>> don't have recommendations on file layout, etc. beyond that (though
>>>>>>>> there's plenty of generic advice to be found out there on the topic). I
>>>>>>>> have a hunch Go is similar, and JavaScript would be as well (npm install
>>>>>>>> apache-beam and your package.json file gets updated).
>>>>>>>> >> >>>
>>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>>>>>>>> wrote:
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > There are several examples already within the Beam repo
>>>>>>>> found in:
>>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>>> >> >>> >
>>>>>>>> https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>>> >> >>> >
>>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>>> >> >>> >
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>>> sachinag@google.com> wrote:
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>>>>> each quickstart would be ideal.
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example in
>>>>>>>> each repo? I know that makes the repos less minimal, but we could rewrite
>>>>>>>> the quickstarts around these repos instead of the current Wordcount
>>>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>>>> quickstarts...
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance
>>>>>>>> is preferable, and the github repos are more flexible and maintainable.
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> How about we create:
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer
>>>>>>>> having repos for all languages. It makes sense for consistency as well.
>>>>>>>> >> >>> >>>>
>>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>>> lcwik@google.com> wrote:
>>>>>>>> >> >>> >>>>>
>>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out a
>>>>>>>> specific version of the examples that coincides with a specific SDK version
>>>>>>>> then we could drop the archetypes.
>>>>>>>> >> >>> >>>>>
>>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>>> bhulette@google.com> wrote:
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to
>>>>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>>>>> failing when a release is published.
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>>>>>>>> breakage after the release. Agree we should verify RCs (document as part of
>>>>>>>> the release process), or even better, add automation to verify the repo
>>>>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>>>>> provides an example for users to follow if they want to test against
>>>>>>>> snapshots and report issues to us sooner.
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the
>>>>>>>> archetype?
>>>>>>>> >> >>> >>>>>>
>>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>>> lcwik@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>
>>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>>> >> >>> >>>>>>>
>>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo since
>>>>>>>> we can see what a truly minimal project looks like. Having it in the main
>>>>>>>> repo would inherit build file configurations and other settings that would
>>>>>>>> be different from a clean project, so it could be non-trivial to adapt.
>>>>>>>> Also, as its own repo, it's easier to clone and modify, or to create an
>>>>>>>> instance of the template.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>>> as soon as there is a new dependency version available.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to
>>>>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>>>>> failing when a release is published.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>>>> having all the build systems we want to support for them, as long as we
>>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>>> repos to maintain.
>>>>>>>> >> >>> >>>>>>>>
>>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>>> lcwik@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> The github repo is definitely more flexible than
>>>>>>>> the archetypes but the archetypes have a few conveniences since they are
>>>>>>>> integrated with the apache/beam repo. For example, updates/testing are done
>>>>>>>> at the same time a corresponding change to the main repo is done (like
>>>>>>>> library version updates), and they are released when the SDK is released.
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
>>>>>>>> starter repo containing all the starters or one per language or one per
>>>>>>>> build system?
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>>>>>>>> release manager owns it)?
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that
>>>>>>>> wouldn't work very well for Gradle and SBT users. I think a GitHub template
>>>>>>>> might be the more flexible option, and we could have something similar for
>>>>>>>> other languages as well. Having said that, we could still create a Maven
>>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to
>>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>> What do you think?
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>>>>>>>> >> >>> >>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>>> altay@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on
>>>>>>>> this? Do you need help?
>>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>>>>> bhulette@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already,
>>>>>>>> built with Maven Archetype [1]. It's what powers the Java quickstart [2].
>>>>>>>> Could we de-dupe these (e.g. reference the GitHub template in the
>>>>>>>> quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put
>>>>>>>> this somewhere like apache/beam-java-template? I think apache repositories
>>>>>>>> like beam-* are allowed.
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos
>>>>>>>> <dc...@google.com> wrote:
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>>>>>>> project, I've been working on a GitHub template containing a minimal Beam
>>>>>>>> Java pipeline for people to start with.
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven
>>>>>>>> (Direct runner)
>>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>>>> (around 1-2 minutes to run)
>>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build,
>>>>>>>> run, test, and add other runners
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>>>>>>>> template.
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy
>>>>>>>> with it 🙂
>>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how
>>>>>>>> to create a new Beam Java pipeline
>>>>>>>>
>>>>>>>

Re: Beam Java starter project template

Posted by Robert Burke <re...@google.com>.
SGTM.

On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles <ke...@apache.org> wrote:

> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601 I
> think it will be simplest to license it under ASL2 and include a NOTICE
> file. The user will be free to "clone and go".
>
> I would bring these points back to the dev list:
>
>  - ASL2 is what people expect from an ASF project, so it is "least
> surprise"
>  - Dual-licensing is possible (but I think not worthwhile due to its
> impact on contributor license agreements)
>  - ASL2 says "You must cause any modified files to carry prominent notices
> stating that You changed the files", which won't apply to the user's own
> code, and I would guess users simply won't bother with it for files in the
> template. Or maybe there is a clever way to phrase the header so it is
> already good to go.
>  - ASL2 says that if the work includes a NOTICE file, you have to include
> the attributions from it. The NOTICE file is required by ASF policy. We can
> easily set it up to be a no-op for the user.
>
> So my overall take is that we should go ahead with ASL2 and a simple
> NOTICE file. Check the Jira for details.
>
> Kenn
>
> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> And I've created the repos just now.
>>
>> Kenn
>>
>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org> wrote:
>>
>>> Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601
>>>
>>> Kenn
>>>
>>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
>>> dannymccormick@google.com> wrote:
>>>
>>>> Sure - I'm happy to help out with the Actions setup (and/or with the Go
>>>> template). I will say though, the Actions config should be pretty darn
>>>> simple for these examples -
>>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>>> seems right, for each language configuration we're targeting we basically
>>>> just want a job with:
>>>>
>>>>    - checkout
>>>>    - setup-<language>
>>>>    - inlined script to run tests
>>>>
>>>> Always happy to help with or consult on any actions issues 🙂
>>>>
>>>> Thanks,
>>>> Danny
>>>>
>>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <ke...@google.com>
>>>> wrote:
>>>>
>>>>> Danny has extensive experience with GitHub actions, and may be able to
>>>>> help out.
>>>>> Kerry
>>>>>
>>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org> wrote:
>>>>>
>>>>>> I'm convinced on all points. My main motivation was to keep it
>>>>>> simple. But of course we should keep it simple for users, not us :-)
>>>>>>
>>>>>> I can take on the task of asking about MIT license and requesting the
>>>>>> repos be created. Not sure if it needs my level of privileges but I'm happy
>>>>>> to do it anyhow.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > MIT is much more permissive, but I also don't have any problems
>>>>>>> changing it to Apache license. In any case, how about we create the
>>>>>>> following repos?
>>>>>>>
>>>>>>> For these starter projects, we don't want to encumber any users of
>>>>>>> these templates with any particular licensing requirements (right?)
>>>>>>> and we don't even care about attribution. We want these to be pretty
>>>>>>> much as close to public domain as possible. That's not what the
>>>>>>> Apache licence does. (If it's even relevant, a good argument could
>>>>>>> likely be made for de minimis or fair use, but I think it's best to
>>>>>>> be explicit about this.) Perhaps this'd be a good question for
>>>>>>> Apache legal?
>>>>>>>
>>>>>>> > apache/beam-starter-java
>>>>>>> > apache/beam-starter-python
>>>>>>> > apache/beam-starter-go
>>>>>>> > apache/beam-starter-kotlin
>>>>>>> > apache/beam-starter-scala
>>>>>>> >
>>>>>>> > We'll start by populating the Java one which is the most pressing
>>>>>>> one and the one that is ready, but the rest should be simpler.
>>>>>>> >
>>>>>>> > +David Huntsperger, tldr; these are minimal starter projects for
>>>>>>> every language. Once we have Java, Python and Go, it might be a good idea
>>>>>>> to change the quickstarts to use these instead of the word count. There is
>>>>>>> already a dedicated word count walkthrough so I think that is already
>>>>>>> covered.
>>>>>>> >
>>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>>> >
>>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>>> robertwb@google.com> wrote:
>>>>>>> >>
>>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>>>>>>> wrote:
>>>>>>> >> >
>>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part of
>>>>>>> it.
>>>>>>> >> >
>>>>>>> >> > But also the question "I simply don't know what one would put
>>>>>>> in a Python repo, other than a bare setup.py that lists a dependency
>>>>>>> on apache_beam" is answered by David's initial email and his repo, namely:
>>>>>>> >> >
>>>>>>> >> >  - GitHub Actions configuration
>>>>>>> >> >  - README.md
>>>>>>> >> >  - example that already runs
>>>>>>> >>
>>>>>>> >> OK, fair enough.
>>>>>>> >>
>>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>>> Apache software it needs to be ASL2)
>>>>>>> >>
>>>>>>> >> On the topic of licence, it's a bit tricky because one doesn't want
>>>>>>> to bind the users of such a template as being a derivative work of a
>>>>>>> too-restrictive licence. The licence of the template itself should
>>>>>>> generally be very permissive.
>>>>>>> >>
>>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >> >>
>>>>>>> >> >> I think for consistency it makes sense for users to be told to
>>>>>>> check out the git repo for the language of their choice and run it. Some
>>>>>>> repos will have more or less than others when it comes to the setup necessary.
>>>>>>> >> >>
>>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>>> robertwb@google.com> wrote:
>>>>>>> >> >>>
>>>>>>> >> >>> +1 for doing this for Java, as setting up a project there is quite
>>>>>>> complicated. I simply don't know what one would put in a Python repo,
>>>>>>> other than a bare setup.py that lists a dependency on apache_beam. We
>>>>>>> don't have recommendations on file layout, etc. beyond that (though
>>>>>>> there's plenty of generic advice to be found out there on the topic). I
>>>>>>> have a hunch Go is similar, and JavaScript would be as well (npm install
>>>>>>> apache-beam and your package.json file gets updated).
>>>>>>> >> >>>
>>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >> >>> >
>>>>>>> >> >>> > There are several examples already within the Beam repo
>>>>>>> found in:
>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>>> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>>> >> >>> >
>>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>>> >> >>> >
>>>>>>> >> >>> >
>>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>>> sachinag@google.com> wrote:
>>>>>>> >> >>> >>
>>>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>>>> each quickstart would be ideal.
>>>>>>> >> >>> >>
>>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>>> dhuntsperger@google.com> wrote:
>>>>>>> >> >>> >>>
>>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>>> >> >>> >>>
>>>>>>> >> >>> >>> Would it make sense to include the Wordcount example in
>>>>>>> each repo? I know that makes the repos less minimal, but we could rewrite
>>>>>>> the quickstarts around these repos instead of the current Wordcount
>>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>>> quickstarts...
>>>>>>> >> >>> >>>
>>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>>> dcavazos@google.com> wrote:
>>>>>>> >> >>> >>>>
>>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance
>>>>>>> is preferable, and the github repos are more flexible and maintainable.
>>>>>>> >> >>> >>>>
>>>>>>> >> >>> >>>> How about we create:
>>>>>>> >> >>> >>>>
>>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>>> >> >>> >>>>
>>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer
>>>>>>> having repos for all languages. It makes sense for consistency as well.
>>>>>>> >> >>> >>>>
>>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>>> lcwik@google.com> wrote:
>>>>>>> >> >>> >>>>>
>>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out a
>>>>>>> specific version of the examples that coincides with a specific SDK version
>>>>>>> then we could drop the archetypes.
>>>>>>> >> >>> >>>>>
>>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>>> bhulette@google.com> wrote:
>>>>>>> >> >>> >>>>>>
>>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to
>>>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>>>> failing when a release is published.
>>>>>>> >> >>> >>>>>>
>>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>>>>>>> breakage after the release. Agree we should verify RCs (document as part of
>>>>>>> the release process), or even better, add automation to verify the repo
>>>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>>>> provides an example for users to follow if they want to test against
>>>>>>> snapshots and report issues to us sooner.
>>>>>>> >> >>> >>>>>>
>>>>>>> >> >>> >>>>>>
>>>>>>> >> >>> >>>>>> If we move forward with this can we drop the archetype?
>>>>>>> >> >>> >>>>>>
>>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>>> lcwik@google.com> wrote:
>>>>>>> >> >>> >>>>>>>
>>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>>> >> >>> >>>>>>>
>>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>>> dcavazos@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>
>>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo since
>>>>>>> we can see what a truly minimal project looks like. Having it in the main
>>>>>>> repo would inherit build file configurations and other settings that would
>>>>>>> be different from a clean project, so it could be non-trivial to adapt.
>>>>>>> Also, as its own repo, it's easier to clone and modify, or to create an
>>>>>>> instance of the template.
>>>>>>> >> >>> >>>>>>>>
>>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam
>>>>>>> version and other dependencies automatically. Testing is already set up via
>>>>>>> GitHub actions for every pull request, so it would automatically be tested
>>>>>>> as soon as there is a new dependency version available.
>>>>>>> >> >>> >>>>>>>>
>>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to
>>>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>>>> failing when a release is published.
>>>>>>> >> >>> >>>>>>>>
>>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>>> having all the build systems we want to support for them. As long as we
>>>>>>> document which files are for which build system. That way there are fewer
>>>>>>> repos to maintain.
>>>>>>> >> >>> >>>>>>>>
>>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>>> lcwik@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>>
>>>>>>> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than
>>>>>>> the archetypes but the archetypes have a few conveniences since they are
>>>>>>> integrated with apache/beam repo. For example, updates/testing are done at
>>>>>>> the same time a corresponding change to the main repo is done (like library
>>>>>>> version updates), they are released when the SDK is released.
>>>>>>> >> >>> >>>>>>>>>
>>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
>>>>>>> starter repo containing all the starters or one per language or one per
>>>>>>> build system?
>>>>>>> >> >>> >>>>>>>>>
>>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>>>>>>> release manager owns it)?
>>>>>>> >> >>> >>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>
>>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>>> dcavazos@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't
>>>>>>> work very well for Gradle and SBT users. I think a GitHub template might be
>>>>>>> the more flexible option, and we could have something similar for other
>>>>>>> languages as well. Having said that, we could still create a Maven
>>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>>> >> >>> >>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin
>>>>>>> down the name of the repo, create it, and move the code there. I was
>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>> What do you think?
>>>>>>> >> >>> >>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>>>>>>> >> >>> >>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>>> altay@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on
>>>>>>> this? Do you need help?
>>>>>>> >> >>> >>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>>>> bhulette@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already,
>>>>>>> built with Maven Archetype [1]. It's what powers the Java quickstart [2].
>>>>>>> Could we de-dupe these (e.g. reference the GitHub template in the
>>>>>>> quickstart, or co-locate the archetype with the GitHub template)?
>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put
>>>>>>> this somewhere like apache/beam-java-template? I think apache repositories
>>>>>>> like beam-* are allowed.
>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>> [1]
>>>>>>> https://maven.apache.org/archetype/index.html
>>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>> >> >>> >>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>>>> dcavazos@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>>>> dcavazos@google.com> wrote:
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>>>>>> project, I've been working on a GitHub template containing a minimal Beam
>>>>>>> Java pipeline for people to start with.
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct
>>>>>>> runner)
>>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>>> (around 1-2 minutes to run)
>>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build, run,
>>>>>>> test, and add other runners
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>>>>>>> template.
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy
>>>>>>> with it 🙂
>>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>>> account, so we need to create an Apache repo to host it
>>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to
>>>>>>> create a new Beam Java pipeline
>>>>>>>
>>>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601 I
think it will be simplest to license it under ASL2 and include a NOTICE
file. The user will be free to "clone and go".

I would bring these points back to the dev list:

 - ASL2 is what people expect from an ASF project, so it is "least surprise"
 - Dual-licensing is possible (but I think not worthwhile due to its impact
on contributor license agreements)
 - ASL2 says "You must cause any modified files to carry prominent notices
stating that You changed the files". That won't apply to the user's own code,
and I would guess users simply won't bother with it for files from the
template. Or maybe there is a clever way to phrase the header so it is
already good to go.
 - ASL2 says if the work includes a NOTICE file, you have to include the
attributions from it. The NOTICE file is required by ASF policy. We can
easily set it up to be a no-op for the user.

So my overall take is that we should go ahead with ASL2 and a simple NOTICE
file. Check the Jira for details.

Kenn

On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles <ke...@apache.org> wrote:

> And I've created the repos just now.
>
> Kenn
>
> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org> wrote:
>
>> Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601
>>
>> Kenn
>>
>> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <da...@google.com>
>> wrote:
>>
>>> Sure - I'm happy to help out with the Actions setup (and/or with the Go
>>> template). I will say though, the Actions config should be pretty darn
>>> simple for these examples -
>>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>>> seems right, for each language configuration we're targeting we basically
>>> just want a job with:
>>>
>>>    - checkout
>>>    - setup-<language>
>>>    - inlined script to run tests
>>>
>>> Always happy to help with or consult on any actions issues 🙂
>>>
>>> Thanks,
>>> Danny
>>>
>>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <ke...@google.com>
>>> wrote:
>>>
>>>> Danny has extensive experience with GitHub actions, and may be able to
>>>> help out.
>>>> Kerry
>>>>
>>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org> wrote:
>>>>
>>>>> I'm convinced on all points. My main motivation was to keep it simple.
>>>>> But of course we should keep it simple for users, not us :-)
>>>>>
>>>>> I can take on the task of asking about MIT license and requesting the
>>>>> repos be created. Not sure if it needs my level of privileges but I'm happy
>>>>> to do it anyhow.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > MIT is much more permissive, but I also don't have any problems
>>>>>> changing it to Apache license. In any case, how about we create the
>>>>>> following repos?
>>>>>>
>>>>>> For these starter projects, we don't want to encumber any users of
>>>>>> these templates with any particular licensing requirements (right?)
>>>>>> and we don't even care about attribution. We want these to be pretty
>>>>>> much as close to public domain as possible. That's not what the Apache
>>>>>> licence does. (If it's even relevant, a good argument could likely be
>>>>>> made for de minimis or fair use, but I think it's best to be explicit
>>>>>> about this.) Perhaps this'd be a good question for Apache legal?
>>>>>>
>>>>>> > apache/beam-starter-java
>>>>>> > apache/beam-starter-python
>>>>>> > apache/beam-starter-go
>>>>>> > apache/beam-starter-kotlin
>>>>>> > apache/beam-starter-scala
>>>>>> >
>>>>>> > We'll start by populating the Java one which is the most pressing
>>>>>> one and the one that is ready, but the rest should be simpler.
>>>>>> >
>>>>>> > +David Huntsperger, TL;DR: these are minimal starter projects for
>>>>>> every language. Once we have Java, Python and Go, it might be a good idea
>>>>>> to change the quickstarts to use these instead of the word count. There is
>>>>>> already a dedicated word count walkthrough so I think that is already
>>>>>> covered.
>>>>>> >
>>>>>> > If we all agree on the repo names, who can help us create them?
>>>>>> >
>>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>>> robertwb@google.com> wrote:
>>>>>> >>
>>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>>>>>> wrote:
>>>>>> >> >
>>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part of
>>>>>> it.
>>>>>> >> >
>>>>>> >> > But also the answer to "I simply don't know what one would put
>>>>>> in a Python repo, other than a bare setup.py that lists a dependency
>>>>>> on apache_beam" is answered by David's initial email and his repo, namely:
>>>>>> >> >
>>>>>> >> >  - GitHub Actions configuration
>>>>>> >> >  - README.md
>>>>>> >> >  - example that already runs
>>>>>> >>
>>>>>> >> OK, fair enough.
>>>>>> >>
>>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of
>>>>>> Apache software it needs to be ASL2)
>>>>>> >>
>>>>>> >> On the topic of licence, it's a bit tricky because one doesn't
>>>>>> want to
>>>>>> >> bind the users of such a template as being a derivative work of a
>>>>>> >> too-restrictive licence. The licence of the template itself should
>>>>>> >> generally be very permissive.
>>>>>> >>
>>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>>> wrote:
>>>>>> >> >>
>>>>>> >> >> I think for consistency it makes sense to users to be told to
>>>>>> check out this Git repo for the language of your choice and run. Some repos
>>>>>> will have more/less than others when it comes to setup necessary.
>>>>>> >> >>
>>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>>> robertwb@google.com> wrote:
>>>>>> >> >>>
>>>>>> >> >>> +1 for doing this for Java, as setting up a project there is
>>>>>> quite
>>>>>> >> >>> complicated. I simply don't know what one would put in a
>>>>>> Python repo
>>>>>> >> >>> other than a bare setup.py that lists a dependency on
>>>>>> >> >>> apache_beam. We don't have recommendations on file layout,
>>>>>> etc. more
>>>>>> >> >>> than that (though there's plenty of generic advice to be found
>>>>>> out
>>>>>> >> >>> there on the topic). I have a hunch go is similar, and
>>>>>> javascript
>>>>>> >> >>> would be as well (npm install apache-beam and your
>>>>>> package.json file
>>>>>> >> >>> gets updated).
>>>>>> >> >>>
>>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>>>>>> wrote:
>>>>>> >> >>> >
>>>>>> >> >>> > There are several examples already within the Beam repo
>>>>>> found in:
>>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>>> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>>> >> >>> >
>>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>>> >> >>> >
>>>>>> >> >>> >
>>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>>> sachinag@google.com> wrote:
>>>>>> >> >>> >>
>>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>>> each quickstart would be ideal.
>>>>>> >> >>> >>
>>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>>> dhuntsperger@google.com> wrote:
>>>>>> >> >>> >>>
>>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>>> >> >>> >>>
>>>>>> >> >>> >>> Would it make sense to include the Wordcount example in
>>>>>> each repo? I know that makes the repos less minimal, but we could rewrite
>>>>>> the quickstarts around these repos instead of the current Wordcount
>>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>>> quickstarts...
>>>>>> >> >>> >>>
>>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>>> dcavazos@google.com> wrote:
>>>>>> >> >>> >>>>
>>>>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance is
>>>>>> preferable, and the github repos are more flexible and maintainable.
>>>>>> >> >>> >>>>
>>>>>> >> >>> >>>> How about we create:
>>>>>> >> >>> >>>>
>>>>>> >> >>> >>>> apache/beam-starter-java
>>>>>> >> >>> >>>> apache/beam-starter-python
>>>>>> >> >>> >>>> apache/beam-starter-go
>>>>>> >> >>> >>>>
>>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer
>>>>>> having repos for all languages. It makes sense for consistency as well.
>>>>>> >> >>> >>>>
>>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>>> lcwik@google.com> wrote:
>>>>>> >> >>> >>>>>
>>>>>> >> >>> >>>>> As long as we have tags so that people can pull out a
>>>>>> specific version of the examples that coincides with a specific SDK version
>>>>>> then we could drop the archetypes.
>>>>>> >> >>> >>>>>
>>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>>> bhulette@google.com> wrote:
>>>>>> >> >>> >>>>>>
>>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to
>>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>>> failing when a release is published.
>>>>>> >> >>> >>>>>>
>>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>>>>>> breakage after the release. Agree we should verify RCs (document as part of
>>>>>> the release process), or even better, add automation to verify the repo
>>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>>> provides an example for users to follow if they want to test against
>>>>>> snapshots and report issues to us sooner.
>>>>>> >> >>> >>>>>>
>>>>>> >> >>> >>>>>>
>>>>>> >> >>> >>>>>> If we move forward with this can we drop the archetype?
>>>>>> >> >>> >>>>>>
>>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>>> lcwik@google.com> wrote:
>>>>>> >> >>> >>>>>>>
>>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>>> >> >>> >>>>>>>
>>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>>> dcavazos@google.com> wrote:
>>>>>> >> >>> >>>>>>>>
>>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo since
>>>>>> we can see what a truly minimal project looks like. Having it in the main
>>>>>> repo would inherit build file configurations and other settings that would
>>>>>> be different from a clean project, so it could be non-trivial to adapt.
>>>>>> Also as its own repo, it's easier to clone and modify, or create an
>>>>>> instance of the template.
>>>>>> >> >>> >>>>>>>>
>>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam version
>>>>>> and other dependencies automatically. Testing is already set up via GitHub
>>>>>> actions for every pull request, so it would automatically be tested as soon
>>>>>> as there is a new dependency version available.
>>>>>> >> >>> >>>>>>>>
>>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to
>>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>>> failing when a release is published.
>>>>>> >> >>> >>>>>>>>
>>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and
>>>>>> having all the build systems we want to support for them. As long as we
>>>>>> document which files are for which build system. That way there are fewer
>>>>>> repos to maintain.
>>>>>> >> >>> >>>>>>>>
>>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>>> lcwik@google.com> wrote:
>>>>>> >> >>> >>>>>>>>>
>>>>>> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than the
>>>>>> archetypes but the archetypes have a few conveniences since they are
>>>>>> integrated with apache/beam repo. For example, updates/testing are done at
>>>>>> the same time a corresponding change to the main repo is done (like library
>>>>>> version updates), they are released when the SDK is released.
>>>>>> >> >>> >>>>>>>>>
>>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
>>>>>> starter repo containing all the starters or one per language or one per
>>>>>> build system?
>>>>>> >> >>> >>>>>>>>>
>>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>>>>>> release manager owns it)?
>>>>>> >> >>> >>>>>>>>>
>>>>>> >> >>> >>>>>>>>>
>>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>>> dcavazos@google.com> wrote:
>>>>>> >> >>> >>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't
>>>>>> work very well for Gradle and SBT users. I think a GitHub template might be
>>>>>> the more flexible option, and we could have something similar for other
>>>>>> languages as well. Having said that, we could still create a Maven
>>>>>> archetype. If someone is familiar with that process, please let me know
>>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>>> >> >>> >>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin
>>>>>> down the name of the repo, create it, and move the code there. I was
>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>> What do you think?
>>>>>> >> >>> >>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>>>>>> >> >>> >>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>>> altay@google.com> wrote:
>>>>>> >> >>> >>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on
>>>>>> this? Do you need help?
>>>>>> >> >>> >>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>>> bhulette@google.com> wrote:
>>>>>> >> >>> >>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>>> >> >>> >>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already,
>>>>>> built with Maven Archetype [1]. It's what powers the Java quickstart [2].
>>>>>> Could we de-dupe these (e.g. reference the GitHub template in the
>>>>>> quickstart, or co-locate the archetype with the GitHub template)?
>>>>>> >> >>> >>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put
>>>>>> this somewhere like apache/beam-java-template? I think apache repositories
>>>>>> like beam-* are allowed.
>>>>>> >> >>> >>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>>> >> >>> >>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>> >> >>> >>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>>> dcavazos@google.com> wrote:
>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>> >> >>> >>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>>> dcavazos@google.com> wrote:
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>>>>> project, I've been working on a GitHub template containing a minimal Beam
>>>>>> Java pipeline for people to start with.
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>>> https://github.com/davidcavazos/beam-java
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct
>>>>>> runner)
>>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>>> (around 1-2 minutes to run)
>>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build, run,
>>>>>> test, and add other runners
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>>>>>> template.
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>>> >> >>> >>>>>>>>>>>>>>
>>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy
>>>>>> with it 🙂
>>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>>> account, so we need to create an Apache repo to host it
>>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to
>>>>>> create a new Beam Java pipeline
>>>>>>
>>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
And I've created the repos just now.

Kenn

On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles <ke...@apache.org> wrote:

> Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601
>
> Kenn
>
> On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <da...@google.com>
> wrote:
>
>> Sure - I'm happy to help out with the Actions setup (and/or with the Go
>> template). I will say though, the Actions config should be pretty darn
>> simple for these examples -
>> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
>> seems right, for each language configuration we're targeting we basically
>> just want a job with:
>>
>>    - checkout
>>    - setup-<language>
>>    - inlined script to run tests
>>
>> Always happy to help with or consult on any actions issues 🙂
>>
>> Thanks,
>> Danny
>>
>> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <ke...@google.com>
>> wrote:
>>
>>> Danny has extensive experience with GitHub actions, and may be able to
>>> help out.
>>> Kerry
>>>
>>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org> wrote:
>>>
>>>> I'm convinced on all points. My main motivation was to keep it simple.
>>>> But of course we should keep it simple for users, not us :-)
>>>>
>>>> I can take on the task of asking about MIT license and requesting the
>>>> repos be created. Not sure if it needs my level of privileges but I'm happy
>>>> to do it anyhow.
>>>>
>>>> Kenn
>>>>
>>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>> >
>>>>> > MIT is much more permissive, but I also don't have any problems
>>>>> changing it to Apache license. In any case, how about we create the
>>>>> following repos?
>>>>>
>>>>> For these starter projects, we don't want to encumber any users of
>>>>> these templates with any particular licensing requirements (right?)
>>>>> and we don't even care about attribution. We want these to be pretty
>>>>> much as close to public domain as possible. That's not what the Apache
>>>>> licence does. (If it's even relevant, a good argument could likely be
>>>>> made for de minimis or fair use, but I think it's best to be explicit
>>>>> about this.) Perhaps this'd be a good question for Apache legal?
>>>>>
>>>>> > apache/beam-starter-java
>>>>> > apache/beam-starter-python
>>>>> > apache/beam-starter-go
>>>>> > apache/beam-starter-kotlin
>>>>> > apache/beam-starter-scala
>>>>> >
>>>>> > We'll start by populating the Java one which is the most pressing
>>>>> one and the one that is ready, but the rest should be simpler.
>>>>> >
>>>>> > +David Huntsperger, TL;DR: these are minimal starter projects for
>>>>> every language. Once we have Java, Python and Go, it might be a good idea
>>>>> to change the quickstarts to use these instead of the word count. There is
>>>>> already a dedicated word count walkthrough so I think that is already
>>>>> covered.
>>>>> >
>>>>> > If we all agree on the repo names, who can help us create them?
>>>>> >
>>>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
>>>>> robertwb@google.com> wrote:
>>>>> >>
>>>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>>>>> wrote:
>>>>> >> >
>>>>> >> > Agree with Luke here. "Just git clone and go" is a big part of it.
>>>>> >> >
>>>>> >> > But also the answer to "I simply don't know what one would put in
>>>>> a Python repo, other than a bare setup.py that lists a dependency on
>>>>> apache_beam" is answered by David's initial email and his repo, namely:
>>>>> >> >
>>>>> >> >  - GitHub Actions configuration
>>>>> >> >  - README.md
>>>>> >> >  - example that already runs
>>>>> >>
>>>>> >> OK, fair enough.
>>>>> >>
>>>>> >> >  - LICENSE (notably you've got it as MIT but to be part of Apache
>>>>> software it needs to be ASL2)
>>>>> >>
>>>>> >> On the topic of licence, it's a bit tricky because one doesn't want
>>>>> to
>>>>> >> bind the users of such a template as being a derivative work of a
>>>>> >> too-restrictive licence. The licence of the template itself should
>>>>> >> generally be very permissive.
>>>>> >>
>>>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com>
>>>>> wrote:
>>>>> >> >>
>>>>> >> >> I think for consistency it makes sense to users to be told to
>>>>> check out this Git repo for the language of your choice and run. Some repos
>>>>> will have more/less than others when it comes to setup necessary.
>>>>> >> >>
>>>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>>>> robertwb@google.com> wrote:
>>>>> >> >>>
>>>>> >> >>> +1 for doing this for Java, as setting up a project there is
>>>>> quite
>>>>> >> >>> complicated. I simply don't know what one would put in a Python
>>>>> repo
>>>>> >> >>> other than a bare setup.py that lists a dependency on
>>>>> >> >>> apache_beam. We don't have recommendations on file layout, etc.
>>>>> more
>>>>> >> >>> than that (though there's plenty of generic advice to be found
>>>>> out
>>>>> >> >>> there on the topic). I have a hunch go is similar, and
>>>>> javascript
>>>>> >> >>> would be as well (npm install apache-beam and your package.json
>>>>> file
>>>>> >> >>> gets updated).
>>>>> >> >>>
>>>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>>>>> wrote:
>>>>> >> >>> >
>>>>> >> >>> > There are several examples already within the Beam repo found
>>>>> in:
>>>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>>>> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>>>> >> >>> >
>>>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>>>> >> >>> >
>>>>> >> >>> >
>>>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>>>> sachinag@google.com> wrote:
>>>>> >> >>> >>
>>>>> >> >>> >> I'd love to do something other than Wordcount just for
>>>>> novelty/freshness but agreed with the suggestion that having an example in
>>>>> each quickstart would be ideal.
>>>>> >> >>> >>
>>>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>>>> dhuntsperger@google.com> wrote:
>>>>> >> >>> >>>
>>>>> >> >>> >>> + 1 to a separate repo for each language.
>>>>> >> >>> >>>
>>>>> >> >>> >>> Would it make sense to include the Wordcount example in
>>>>> each repo? I know that makes the repos less minimal, but we could rewrite
>>>>> the quickstarts around these repos instead of the current Wordcount
>>>>> examples. Or maybe we don't need to use the Wordcount example in the
>>>>> quickstarts...
>>>>> >> >>> >>>
>>>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >> >>> >>>>
>>>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance is
>>>>> preferable, and the github repos are more flexible and maintainable.
>>>>> >> >>> >>>>
>>>>> >> >>> >>>> How about we create:
>>>>> >> >>> >>>>
>>>>> >> >>> >>>> apache/beam-starter-java
>>>>> >> >>> >>>> apache/beam-starter-python
>>>>> >> >>> >>>> apache/beam-starter-go
>>>>> >> >>> >>>>
>>>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer
>>>>> having repos for all languages. It makes sense for consistency as well.
>>>>> >> >>> >>>>
>>>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <
>>>>> lcwik@google.com> wrote:
>>>>> >> >>> >>>>>
>>>>> >> >>> >>>>> As long as we have tags so that people can pull out a
>>>>> specific version of the examples that coincides with a specific SDK version
>>>>> then we could drop the archetypes.
>>>>> >> >>> >>>>>
>>>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>>>> bhulette@google.com> wrote:
>>>>> >> >>> >>>>>>
>>>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to
>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>> failing when a release is published.
>>>>> >> >>> >>>>>>
>>>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>>>>> breakage after the release. Agree we should verify RCs (document as part of
>>>>> the release process), or even better, add automation to verify the repo
>>>>> against snapshots. The automation could be nice to have anyway since it
>>>>> provides an example for users to follow if they want to test against
>>>>> snapshots and report issues to us sooner.
>>>>> >> >>> >>>>>>
>>>>> >> >>> >>>>>>
>>>>> >> >>> >>>>>> If we move forward with this can we drop the archetype?
>>>>> >> >>> >>>>>>
>>>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <
>>>>> lcwik@google.com> wrote:
>>>>> >> >>> >>>>>>>
>>>>> >> >>> >>>>>>> Sounds reasonable.
>>>>> >> >>> >>>>>>>
>>>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >> >>> >>>>>>>>
>>>>> >> >>> >>>>>>>> I personally like the idea of a separate repo since we
>>>>> can see what a truly minimal project looks like. Having it in the main repo
>>>>> would inherit build file configurations and other settings that would be
>>>>> different from a clean project, so it could be non-trivial to adapt. Also
>>>>> as its own repo, it's easier to clone and modify, or create an instance of
>>>>> the template.
>>>>> >> >>> >>>>>>>>
>>>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam version
>>>>> and other dependencies automatically. Testing is already set up via GitHub
>>>>> actions for every pull request, so it would automatically be tested as soon
>>>>> as there is a new dependency version available.
>>>>> >> >>> >>>>>>>>
>>>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to
>>>>> break commonly, but I think it would be good to make sure tests aren't
>>>>> failing when a release is published.
>>>>> >> >>> >>>>>>>>
>>>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and having
>>>>> all the build systems we want to support for them. As long as we document
>>>>> which files are for which build system. That way there are fewer repos to
>>>>> maintain.
>>>>> >> >>> >>>>>>>>
>>>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>>>> lcwik@google.com> wrote:
>>>>> >> >>> >>>>>>>>>
>>>>> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than the
>>>>> archetypes but the archetypes have a few conveniences since they are
>>>>> integrated with apache/beam repo. For example, updates/testing are done at
>>>>> the same time a corresponding change to the main repo is done (like library
>>>>> version updates), they are released when the SDK is released.
>>>>> >> >>> >>>>>>>>>
>>>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
>>>>> starter repo containing all the starters or one per language or one per
>>>>> build system?
>>>>> >> >>> >>>>>>>>>
>>>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>>>>> release manager owns it)?
>>>>> >> >>> >>>>>>>>>
>>>>> >> >>> >>>>>>>>>
>>>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >> >>> >>>>>>>>>>
>>>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't
>>>>> work very well for Gradle and SBT users. I think a GitHub template might be
>>>>> the more flexible option, and we could have something similar for other
>>>>> languages as well. Having said that, we could still create a Maven
>>>>> archetype. If someone is familiar with that process, please let me know
>>>>> since I'm not too familiar with Maven and its ecosystem.
>>>>> >> >>> >>>>>>>>>>
>>>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin
>>>>> down the name of the repo, create it, and move the code there. I was
>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>> What do you think?
>>>>> >> >>> >>>>>>>>>>
>>>>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>>>>> >> >>> >>>>>>>>>>
>>>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>>>> altay@google.com> wrote:
>>>>> >> >>> >>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on
>>>>> this? Do you need help?
>>>>> >> >>> >>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>>>> bhulette@google.com> wrote:
>>>>> >> >>> >>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>>>> >> >>> >>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already,
>>>>> built with Maven Archetype [1]. It's what powers the Java quickstart [2].
>>>>> Could we de-dupe these (e.g. reference the GitHub template in the
>>>>> quickstart, or co-locate the archetype with the GitHub template)?
>>>>> >> >>> >>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put
>>>>> this somewhere like apache/beam-java-template? I think apache repositories
>>>>> like beam-* are allowed.
>>>>> >> >>> >>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>> Brian
>>>>> >> >>> >>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>> >> >>> >>>>>>>>>>>> [2]
>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>> >> >>> >>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >> >>> >>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>>>> >> >>> >>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>>>> >> >>> >>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>> dcavazos@google.com> wrote:
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>>>> project, I've been working on a GitHub template containing a minimal Beam
>>>>> Java pipeline for people to start with.
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>>>> https://github.com/davidcavazos/beam-java
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct
>>>>> runner)
>>>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions
>>>>> (around 1-2 minutes to run)
>>>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build, run,
>>>>> test, and add other runners
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>>>>> template.
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> Next steps
>>>>> >> >>> >>>>>>>>>>>>>>
>>>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy
>>>>> with it 🙂
>>>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub
>>>>> account, so we need to create an Apache repo to host it
>>>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to
>>>>> create a new Beam Java pipeline
>>>>>
>>>>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601

Kenn

Re: Beam Java starter project template

Posted by Danny McCormick <da...@google.com>.
Sure - I'm happy to help out with the Actions setup (and/or with the Go
template). I will say, though, that the Actions config should be pretty darn
simple for these examples;
https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
seems right. For each language configuration we're targeting, we basically
just want a job with:

   - checkout
   - setup-<language>
   - inlined script to run tests
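
As a rough sketch of those three steps for the Java case (this is not the
contents of the linked test.yaml; the workflow name, the action versions, the
Temurin JDK 11 choice, and the `mvn test` command are all assumptions made for
illustration):

```yaml
# Hypothetical minimal workflow; names and versions are illustrative assumptions.
name: Test

on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # 1. checkout
      - uses: actions/checkout@v4
      # 2. setup-<language>, here Java
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '11'
      # 3. inlined script to run tests (assumes a Maven build file at the repo root)
      - run: mvn --batch-mode test
```

For Python or Go, the second step would presumably swap in
actions/setup-python or actions/setup-go, and the last step would call that
language's own test command.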

Always happy to help with or consult on any Actions issues 🙂

Thanks,
Danny

On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark <ke...@google.com>
wrote:

> Danny has extensive experience with GitHub actions, and may be able to
> help out.
> Kerry
>
> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org> wrote:
>
>> I'm convinced on all points. My main motivation was to keep it simple.
>> But of course we should keep it simple for users, not us :-)
>>
>> I can take on the task of asking about MIT license and requesting the
>> repos be created. Not sure if it needs my level of privileges but I'm happy
>> to do it anyhow.
>>
>> Kenn
>>
>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>>> wrote:
>>> >
>>> > MIT is much more permissive, but I also don't have any problems
>>> changing it to Apache license. In any case, how about we create the
>>> following repos?
>>>
>>> For these starter projects, we don't want to encumber any users of
>>> these templates with any particular licensing requirements (right?)
>>> and we don't even care about attribution. We want these to be pretty
>>> much as close to public domain as possible. That's not what the Apache
>>> licence does. (If it's even relevant, a good argument could likely be
>>> made for de minimis or fair use, but I think it's best to be explicit
>>> about this.) Perhaps this'd be a good question for Apache legal?
>>>
>>> > apache/beam-starter-java
>>> > apache/beam-starter-python
>>> > apache/beam-starter-go
>>> > apache/beam-starter-kotlin
>>> > apache/beam-starter-scala
>>> >
>>> > We'll start by populating the Java one which is the most pressing one
>>> and the one that is ready, but the rest should be simpler.
>>> >
>>> > +David Huntsperger, TL;DR: these are minimal starter projects for every
>>> language. Once we have Java, Python and Go, it might be a good idea to
>>> change the quickstarts to use these instead of the word count. There is
>>> already a dedicated word count walkthrough so I think that is already
>>> covered.
>>> >
>>> > If we all agree on the repo names, who can help us create them?
>>> >
>>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>> >>
>>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>>> wrote:
>>> >> >
>>> >> > Agree with Luke here. "Just git clone and go" is a big part of it.
>>> >> >
>>> >> > But also the answer to "I simply don't know what one would put in a
>>> Python repo then, other than a bare setup.py that lists a dependency on
>>> apache_beam" is answered by David's initial email and his repo, namely:
>>> >> >
>>> >> >  - GitHub Actions configuration
>>> >> >  - README.md
>>> >> >  - example that already runs
>>> >>
>>> >> OK, fair enough.
>>> >>
>>> >> >  - LICENSE (notably you've got it as MIT but to be part of Apache
>>> software it needs to be ASL2)
>>> >>
>>> >> On the topic of licence, it's a bit tricky because one doesn't want to
>>> >> bind the users of such a template as being a derivative work of a
>>> >> too-restrictive licence. The licence of the template itself should
>>> >> generally be very permissive.
>>> >>
>>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
>>> >> >>
>>> >> >> I think for consistency it makes sense to users to be told to
>>> check out this Git repo for the language of your choice and run. Some repos
>>> will have more/less than others when it comes to setup necessary.
>>> >> >>
>>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>>> robertwb@google.com> wrote:
>>> >> >>>
>>> >> >>> +1 for doing this for Java, as setting up a project there is quite
>>> >> >>> complicated. I simply don't know what one would put in a Python
>>> repo
>>> >> >>> then, other than a bare setup.py that lists a dependency on
>>> >> >>> apache_beam. We don't have recommendations on file layout, etc.
>>> more
>>> >> >>> than that (though there's plenty of generic advice to be found out
>>> >> >>> there on the topic). I have a hunch Go is similar, and JavaScript
>>> >> >>> would be as well (npm install apache-beam and your package.json
>>> file
>>> >> >>> gets updated).
>>> >> >>>
>>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>>> wrote:
>>> >> >>> >
>>> >> >>> > There are several examples already within the Beam repo found
>>> in:
>>> >> >>> > https://github.com/apache/beam/tree/master/examples
>>> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>>> >> >>> >
>>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>>> >> >>> >
>>> >> >>> >
>>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>>> sachinag@google.com> wrote:
>>> >> >>> >>
>>> >> >>> >> I'd love to do something other than Wordcount just for
>>> novelty/freshness but agreed with the suggestion that having an example in
>>> each quickstart would be ideal.
>>> >> >>> >>
>>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>>> dhuntsperger@google.com> wrote:
>>> >> >>> >>>
>>> >> >>> >>> + 1 to a separate repo for each language.
>>> >> >>> >>>
>>> >> >>> >>> Would it make sense to include the Wordcount example in each
>>> repo? I know that makes the repos less minimal, but we could rewrite the
>>> quickstarts around these repos instead of the current Wordcount examples.
>>> Or maybe we don't need to use the Wordcount example in the quickstarts...
>>> >> >>> >>>
>>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>>> dcavazos@google.com> wrote:
>>> >> >>> >>>>
>>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance is
>>> preferable, and the github repos are more flexible and maintainable.
>>> >> >>> >>>>
>>> >> >>> >>>> How about we create:
>>> >> >>> >>>>
>>> >> >>> >>>> apache/beam-starter-java
>>> >> >>> >>>> apache/beam-starter-python
>>> >> >>> >>>> apache/beam-starter-go
>>> >> >>> >>>>
>>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer having
>>> repos for all languages. It makes sense for consistency as well.
>>> >> >>> >>>>
>>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com>
>>> wrote:
>>> >> >>> >>>>>
>>> >> >>> >>>>> As long as we have tags so that people can pull out a
>>> specific version of the examples that coincides with a specific SDK version
>>> then we could drop the archetypes.
>>> >> >>> >>>>>
>>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>>> bhulette@google.com> wrote:
>>> >> >>> >>>>>>
>>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to
>>> break commonly, but I think it would be good to make sure tests aren't
>>> failing when a release is published.
>>> >> >>> >>>>>>
>>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>>> breakage after the release. Agree we should verify RCs (document as part of
>>> the release process), or even better, add automation to verify the repo
>>> against snapshots. The automation could be nice to have anyway since it
>>> provides an example for users to follow if they want to test against
>>> snapshots and report issues to us sooner.
>>> >> >>> >>>>>>
>>> >> >>> >>>>>>
>>> >> >>> >>>>>> If we move forward with this can we drop the archetype?
>>> >> >>> >>>>>>
>>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com>
>>> wrote:
>>> >> >>> >>>>>>>
>>> >> >>> >>>>>>> Sounds reasonable.
>>> >> >>> >>>>>>>
>>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>>> dcavazos@google.com> wrote:
>>> >> >>> >>>>>>>>
>>> >> >>> >>>>>>>> I personally like the idea of a separate repo since we
>>> can see what a true minimal project looks like. Having it in the main repo
>>> would inherit build file configurations and other settings that would be
>>> different from a clean project, so it could be non-trivial to adapt. Also
>>> as its own repo, it's easier to clone and modify, or create an instance of
>>> the template.
>>> >> >>> >>>>>>>>
>>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam version
>>> and other dependencies automatically. Testing is already set up via GitHub
>>> actions for every pull request, so it would automatically be tested as soon
>>> as there is a new dependency version available.
>>> >> >>> >>>>>>>>
>>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to
>>> break commonly, but I think it would be good to make sure tests aren't
>>> failing when a release is published.
>>> >> >>> >>>>>>>>
>>> >> >>> >>>>>>>> I'm okay with having one repo per language, and having
>>> all the build systems we want to support for them. As long as we document
>>> which files are for which build system. That way there are fewer repos to
>>> maintain.
>>> >> >>> >>>>>>>>
>>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>>> lcwik@google.com> wrote:
>>> >> >>> >>>>>>>>>
>>> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than the
>>> archetypes but the archetypes have a few conveniences since they are
>>> integrated with apache/beam repo. For example, updates/testing are done at
>>> the same time a corresponding change to the main repo is done (like library
>>> version updates), they are released when the SDK is released.
>>> >> >>> >>>>>>>>>
>>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
>>> starter repo containing all the starters or one per language or one per
>>> build system?
>>> >> >>> >>>>>>>>>
>>> >> >>> >>>>>>>>> When should updates to the starter happen?
>>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>>> release manager owns it)?
>>> >> >>> >>>>>>>>>
>>> >> >>> >>>>>>>>>
>>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>>> dcavazos@google.com> wrote:
>>> >> >>> >>>>>>>>>>
>>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't
>>> work very well for Gradle and SBT users. I think a GitHub template might be
>>> the more flexible option, and we could have something similar for other
>>> languages as well. Having said that, we could still create a Maven
>>> archetype. If someone is familiar with that process, please let me know
>>> since I'm not too familiar with Maven and its ecosystem.
>>> >> >>> >>>>>>>>>>
>>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin
>>> down the name of the repo, create it, and move the code there. I was
>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>> What do you think?
>>> >> >>> >>>>>>>>>>
>>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>>> >> >>> >>>>>>>>>>
>>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>>> altay@google.com> wrote:
>>> >> >>> >>>>>>>>>>>
>>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on this?
>>> Do you need help?
>>> >> >>> >>>>>>>>>>>
>>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>>> bhulette@google.com> wrote:
>>> >> >>> >>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>>> >> >>> >>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already, built
>>> with Maven Archetype [1]. It's what powers the Java quickstart [2]. Could
>>> we de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>> co-locate the archetype with the GitHub template)?
>>> >> >>> >>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put this
>>> somewhere like apache/beam-java-template? I think apache repositories like
>>> beam-* are allowed.
>>> >> >>> >>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>> Brian
>>> >> >>> >>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>> >> >>> >>>>>>>>>>>> [2]
>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>> >> >>> >>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>> dcavazos@google.com> wrote:
>>> >> >>> >>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>>> >> >>> >>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>>> >> >>> >>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>> dcavazos@google.com> wrote:
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>>> project, I've been working on a GitHub template containing a minimal Beam
>>> Java pipeline for people to start with.
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>>> https://github.com/davidcavazos/beam-java
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>>> >> >>> >>>>>>>>>>>>>> Minimal test file
>>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct
>>> runner)
>>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions (around
>>> 1-2 minutes to run)
>>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build, run,
>>> test, and add other runners
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>>> template.
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> Next steps
>>> >> >>> >>>>>>>>>>>>>>
>>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy with
>>> it 🙂
>>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub account,
>>> so we need to create an Apache repo to host it
>>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to
>>> create a new Beam Java pipeline
>>>
>>

Re: Beam Java starter project template

Posted by Kerry Donny-Clark <ke...@google.com>.
Danny has extensive experience with GitHub Actions and may be able to help
out.
Kerry

On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles <ke...@apache.org> wrote:

> I'm convinced on all points. My main motivation was to keep it simple. But
> of course we should keep it simple for users, not us :-)
>
> I can take on the task of asking about the MIT license and requesting the
> repos be created. Not sure if it needs my level of privileges but I'm happy
> to do it anyhow.
>
> Kenn
>
> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com>
>> wrote:
>> >
>> > MIT is much more permissive, but I also don't have any problems
>> changing it to the Apache license. In any case, how about we create the
>> following repos?
>>
>> For these starter projects, we don't want to encumber any users of
>> these templates with any particular licensing requirements (right?)
>> and we don't even care about attribution. We want these to be pretty
>> much as close to public domain as possible. That's not what the Apache
>> licence does. (If it's even relevant, a good argument could likely be
>> made for de minimis or fair use, but I think it's best to be explicit
>> about this.) Perhaps this'd be a good question for Apache legal?
>>
>> > apache/beam-starter-java
>> > apache/beam-starter-python
>> > apache/beam-starter-go
>> > apache/beam-starter-kotlin
>> > apache/beam-starter-scala
>> >
>> > We'll start by populating the Java one which is the most pressing one
>> and the one that is ready, but the rest should be simpler.
>> >
>> > +David Huntsperger, tl;dr: these are minimal starter projects for every
>> language. Once we have Java, Python and Go, it might be a good idea to
>> change the quickstarts to use these instead of the word count. There is
>> already a dedicated word count walkthrough so I think that is already
>> covered.
>> >
>> > If we all agree on the repo names, who can help us create them?
>> >
>> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>> >>
>> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
>> wrote:
>> >> >
>> >> > Agree with Luke here. "Just git clone and go" is a big part of it.
>> >> >
>> >> > But also the question "I simply don't know what one would put in a
>> Python repo, other than a bare setup.py that lists a dependency on
>> apache_beam" is answered by David's initial email and his repo, namely:
>> >> >
>> >> >  - GitHub Actions configuration
>> >> >  - README.md
>> >> >  - example that already runs
>> >>
>> >> OK, fair enough.
>> >>
>> >> >  - LICENSE (notably you've got it as MIT but to be part of Apache
>> software it needs to be ASL2)
>> >>
>> >> On the topic of licence, it's a bit tricky because one doesn't want to
>> >> bind the users of such a template as being a derivative work of a
>> >> too-restrictive licence. The licence of the template itself should
>> >> generally be very permissive.
>> >>
>> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
>> >> >>
>> >> >> I think for consistency it makes sense for users to be told to
>> check out the git repo for the language of their choice and run it. Some repos
>> will have more or less than others when it comes to the setup necessary.
>> >> >>
>> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <
>> robertwb@google.com> wrote:
>> >> >>>
>> >> >>> +1 for doing this for Java, as setting up a project there is quite
>> >> >>> complicated. I simply don't know what one would put in a Python
>> repo
>> >> >>> other than a bare setup.py that lists a dependency on
>> >> >>> apache_beam. We don't have recommendations on file layout, etc.
>> more
>> >> >>> than that (though there's plenty of generic advice to be found out
>> >> >>> there on the topic). I have a hunch Go is similar, and JavaScript
>> >> >>> would be as well (npm install apache-beam and your package.json
>> file
>> >> >>> gets updated).
>> >> >>>
>> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com>
>> wrote:
>> >> >>> >
>> >> >>> > There are several examples already within the Beam repo found in:
>> >> >>> > https://github.com/apache/beam/tree/master/examples
>> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>> >> >>> >
>> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>> >> >>> >
>> >> >>> >
>> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
>> sachinag@google.com> wrote:
>> >> >>> >>
>> >> >>> >> I'd love to do something other than Wordcount just for
>> novelty/freshness but agreed with the suggestion that having an example in
>> each quickstart would be ideal.
>> >> >>> >>
>> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
>> dhuntsperger@google.com> wrote:
>> >> >>> >>>
>> >> >>> >>> + 1 to a separate repo for each language.
>> >> >>> >>>
>> >> >>> >>> Would it make sense to include the Wordcount example in each
>> repo? I know that makes the repos less minimal, but we could rewrite the
>> quickstarts around these repos instead of the current Wordcount examples.
>> Or maybe we don't need to use the Wordcount example in the quickstarts...
>> >> >>> >>>
>> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
>> dcavazos@google.com> wrote:
>> >> >>> >>>>
>> >> >>> >>>> I agree with dropping the archetypes. Less maintenance is
>> preferable, and the github repos are more flexible and maintainable.
>> >> >>> >>>>
>> >> >>> >>>> How about we create:
>> >> >>> >>>>
>> >> >>> >>>> apache/beam-starter-java
>> >> >>> >>>> apache/beam-starter-python
>> >> >>> >>>> apache/beam-starter-go
>> >> >>> >>>>
>> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer having
>> repos for all languages. It makes sense for consistency as well.
>> >> >>> >>>>
>> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com>
>> wrote:
>> >> >>> >>>>>
>> >> >>> >>>>> As long as we have tags so that people can pull out a
>> specific version of the examples that coincides with a specific SDK version
>> then we could drop the archetypes.
>> >> >>> >>>>>
>> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
>> bhulette@google.com> wrote:
>> >> >>> >>>>>>
>> >> >>> >>>>>> > Being such minimal examples, I don't expect them to break
>> commonly, but I think it would be good to make sure tests aren't failing
>> when a release is published.
>> >> >>> >>>>>>
>> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
>> breakage after the release. Agree we should verify RCs (document as part of
>> the release process), or even better, add automation to verify the repo
>> against snapshots. The automation could be nice to have anyway since it
>> provides an example for users to follow if they want to test against
>> snapshots and report issues to us sooner.
>> >> >>> >>>>>>
>> >> >>> >>>>>>
>> >> >>> >>>>>> If we move forward with this can we drop the archetype?
>> >> >>> >>>>>>
>> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com>
>> wrote:
>> >> >>> >>>>>>>
>> >> >>> >>>>>>> Sounds reasonable.
>> >> >>> >>>>>>>
>> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
>> dcavazos@google.com> wrote:
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> I personally like the idea of a separate repo since we
>> can see what a truly minimal project looks like. Having it in the main repo
>> would inherit build file configurations and other settings that would be
>> different from a clean project, so it could be non-trivial to adapt. Also
>> as its own repo, it's easier to clone and modify, or create an instance of
>> the template.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> Dependabot can take care of updating the Beam version and
>> other dependencies automatically. Testing is already set up via GitHub
>> actions for every pull request, so it would automatically be tested as soon
>> as there is a new dependency version available.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to break
>> commonly, but I think it would be good to make sure tests aren't failing
>> when a release is published.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> I'm okay with having one repo per language, and having
>> all the build systems we want to support for them, as long as we document
>> which files are for which build system. That way there are fewer repos to
>> maintain.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
>> lcwik@google.com> wrote:
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than the
>> archetypes, but the archetypes have a few conveniences since they are
>> integrated with the apache/beam repo. For example, updates/testing are done at
>> the same time a corresponding change to the main repo is done (like library
>> version updates), and they are released when the SDK is released.
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
>> starter repo containing all the starters or one per language or one per
>> build system?
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>> When should updates to the starter happen?
>> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g.
>> release manager owns it)?
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
>> dcavazos@google.com> wrote:
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't work
>> very well for Gradle and SBT users. I think a GitHub template might be the
>> more flexible option, and we could have something similar for other
>> languages as well. Having said that, we could still create a Maven
>> archetype. If someone is familiar with that process, please let me know
>> since I'm not too familiar with Maven and its ecosystem.
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin down
>> the name of the repo, create it, and move the code there. I was thinking
>> either `apache/beam-java-template` or `apache/beam-java-starter`. What do
>> you think?
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
>> altay@google.com> wrote:
>> >> >>> >>>>>>>>>>>
>> >> >>> >>>>>>>>>>> This is great David. Was there any progress on this?
>> Do you need help?
>> >> >>> >>>>>>>>>>>
>> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
>> bhulette@google.com> wrote:
>> >> >>> >>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>> This is cool, thanks!
>> >> >>> >>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already, built
>> with Maven Archetype [1]. It's what powers the Java quickstart [2]. Could
>> we de-dupe these (e.g. reference the GitHub template in the quickstart, or
>> co-locate the archetype with the GitHub template)?
>> >> >>> >>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put this
>> somewhere like apache/beam-java-template? I think apache repositories like
>> beam-* are allowed.
>> >> >>> >>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>> Brian
>> >> >>> >>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>> >> >>> >>>>>>>>>>>> [2]
>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>> >> >>> >>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>> dcavazos@google.com> wrote:
>> >> >>> >>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>> +Ahmet Altay
>> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
>> >> >>> >>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>> >> >>> >>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>> dcavazos@google.com> wrote:
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> Hi Beam community!
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java
>> project, I've been working on a GitHub template containing a minimal Beam
>> Java pipeline for people to start with.
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
>> https://github.com/davidcavazos/beam-java
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>> >> >>> >>>>>>>>>>>>>> Minimal test file
>> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct
>> runner)
>> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions (around
>> 1-2 minutes to run)
>> >> >>> >>>>>>>>>>>>>> README with instructions on how to build, run,
>> test, and add other runners
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
>> template.
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> Next steps
>> >> >>> >>>>>>>>>>>>>>
>> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy with
>> it 🙂
>> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub account,
>> so we need to create an Apache repo to host it
>> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to
>> create a new Beam Java pipeline
>>
>

Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
I'm convinced on all points. My main motivation was to keep it simple. But
of course we should keep it simple for users, not us :-)

I can take on the task of asking about the MIT license and requesting the repos
be created. Not sure if it needs my level of privileges but I'm happy to do
it anyhow.

Kenn

On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw <ro...@google.com> wrote:

> On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com> wrote:
> >
> > MIT is much more permissive, but I also don't have any problems changing
> it to the Apache license. In any case, how about we create the following repos?
>
> For these starter projects, we don't want to encumber any users of
> these templates with any particular licensing requirements (right?)
> and we don't even care about attribution. We want these to be pretty
> much as close to public domain as possible. That's not what the Apache
> licence does. (If it's even relevant, a good argument could likely be
> made for de minimis or fair use, but I think it's best to be explicit
> about this.) Perhaps this'd be a good question for Apache legal?
>
> > apache/beam-starter-java
> > apache/beam-starter-python
> > apache/beam-starter-go
> > apache/beam-starter-kotlin
> > apache/beam-starter-scala
> >
> > We'll start by populating the Java one which is the most pressing one
> and the one that is ready, but the rest should be simpler.
> >
> > +David Huntsperger, tl;dr: these are minimal starter projects for every
> language. Once we have Java, Python and Go, it might be a good idea to
> change the quickstarts to use these instead of the word count. There is
> already a dedicated word count walkthrough so I think that is already
> covered.
> >
> > If we all agree on the repo names, who can help us create them?
> >
> > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <ro...@google.com>
> wrote:
> >>
> >> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org>
> wrote:
> >> >
> >> > Agree with Luke here. "Just git clone and go" is a big part of it.
> >> >
> >> > But also the question "I simply don't know what one would put in a
> Python repo, other than a bare setup.py that lists a dependency on
> apache_beam" is answered by David's initial email and his repo, namely:
> >> >
> >> >  - GitHub Actions configuration
> >> >  - README.md
> >> >  - example that already runs
> >>
> >> OK, fair enough.
> >>
> >> >  - LICENSE (notably you've got it as MIT but to be part of Apache
> software it needs to be ASL2)
> >>
> >> On the topic of licence, it's a bit tricky because one doesn't want to
> >> bind the users of such a template as being a derivative work of a
> >> too-restrictive licence. The licence of the template itself should
> >> generally be very permissive.
> >>
> >> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
> >> >>
> >> >> I think for consistency it makes sense for users to be told to
> check out the git repo for the language of their choice and run it. Some repos
> will have more or less than others when it comes to the setup necessary.
> >> >>
> >> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <ro...@google.com>
> wrote:
> >> >>>
> >> >>> +1 for doing this for Java, as setting up a project there is quite
> >> >>> complicated. I simply don't know what one would put in a Python repo,
> >> >>> other than a bare setup.py that lists a dependency on
> >> >>> apache_beam. We don't have recommendations on file layout, etc. beyond
> >> >>> that (though there's plenty of generic advice to be found out
> >> >>> there on the topic). I have a hunch Go is similar, and JavaScript
> >> >>> would be as well (npm install apache-beam and your package.json file
> >> >>> gets updated).
> >> >>>
> >> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com> wrote:
> >> >>> >
> >> >>> > There are several examples already within the Beam repo found in:
> >> >>> > https://github.com/apache/beam/tree/master/examples
> >> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
> >> >>> >
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
> >> >>> >
> >> >>> >
> >> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <
> sachinag@google.com> wrote:
> >> >>> >>
> >> >>> >> I'd love to do something other than Wordcount just for
> novelty/freshness but agreed with the suggestion that having an example in
> each quickstart would be ideal.
> >> >>> >>
> >> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
> dhuntsperger@google.com> wrote:
> >> >>> >>>
> >> >>> >>> + 1 to a separate repo for each language.
> >> >>> >>>
> >> >>> >>> Would it make sense to include the Wordcount example in each
> repo? I know that makes the repos less minimal, but we could rewrite the
> quickstarts around these repos instead of the current Wordcount examples.
> Or maybe we don't need to use the Wordcount example in the quickstarts...
> >> >>> >>>
> >> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <
> dcavazos@google.com> wrote:
> >> >>> >>>>
> >> >>> >>>> I agree with dropping the archetypes. Less maintenance is
> preferable, and the github repos are more flexible and maintainable.
> >> >>> >>>>
> >> >>> >>>> How about we create:
> >> >>> >>>>
> >> >>> >>>> apache/beam-starter-java
> >> >>> >>>> apache/beam-starter-python
> >> >>> >>>> apache/beam-starter-go
> >> >>> >>>>
> >> >>> >>>> During our OKR planning, +Keith Malvetti would prefer having
> repos for all languages. It makes sense for consistency as well.
> >> >>> >>>>
> >> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com>
> wrote:
> >> >>> >>>>>
> >> >>> >>>>> As long as we have tags so that people can pull out a
> specific version of the examples that coincides with a specific SDK version
> then we could drop the archetypes.
> >> >>> >>>>>
> >> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <
> bhulette@google.com> wrote:
> >> >>> >>>>>>
> >> >>> >>>>>> > Being such minimal examples, I don't expect them to break
> commonly, but I think it would be good to make sure tests aren't failing
> when a release is published.
> >> >>> >>>>>>
> >> >>> >>>>>> Yeah it would be very unfortunate if we discovered a
> breakage after the release. Agree we should verify RCs (document as part of
> the release process), or even better, add automation to verify the repo
> against snapshots. The automation could be nice to have anyway since it
> provides an example for users to follow if they want to test against
> snapshots and report issues to us sooner.
> >> >>> >>>>>>
> >> >>> >>>>>>
> >> >>> >>>>>> If we move forward with this can we drop the archetype?
> >> >>> >>>>>>
> >> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com>
> wrote:
> >> >>> >>>>>>>
> >> >>> >>>>>>> Sounds reasonable.
> >> >>> >>>>>>>
> >> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <
> dcavazos@google.com> wrote:
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> I personally like the idea of a separate repo since we can
> see what a truly minimal project looks like. Having it in the main repo would
> inherit build file configurations and other settings that would be
> different from a clean project, so it could be non-trivial to adapt. Also
> as its own repo, it's easier to clone and modify, or create an instance of
> the template.
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> Dependabot can take care of updating the Beam version and
> other dependencies automatically. Testing is already set up via GitHub
> actions for every pull request, so it would automatically be tested as soon
> as there is a new dependency version available.
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> Being such minimal examples, I don't expect them to break
> commonly, but I think it would be good to make sure tests aren't failing
> when a release is published.
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> I'm okay with having one repo per language, and having all
> the build systems we want to support for them, as long as we document which
> files are for which build system. That way there are fewer repos to maintain.
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <
> lcwik@google.com> wrote:
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> The GitHub repo is definitely more flexible than the
> archetypes, but the archetypes have a few conveniences since they are
> integrated with the apache/beam repo. For example, updates/testing are done at
> the same time a corresponding change to the main repo is done (like library
> version updates), and they are released when the SDK is released.
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> Should these be part of the main repo, or a single
> starter repo containing all the starters or one per language or one per
> build system?
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> When should updates to the starter happen?
> >> >>> >>>>>>>>> How as a community do we get them to happen (e.g. release
> manager owns it)?
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <
> dcavazos@google.com> wrote:
> >> >>> >>>>>>>>>>
> >> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't work
> very well for Gradle and SBT users. I think a GitHub template might be the
> more flexible option, and we could have something similar for other
> languages as well. Having said that, we could still create a Maven
> archetype. If someone is familiar with that process, please let me know
> since I'm not too familiar with Maven and its ecosystem.
> >> >>> >>>>>>>>>>
> >> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin down
> the name of the repo, create it, and move the code there. I was thinking
> either `apache/beam-java-template` or `apache/beam-java-starter`. What do
> you think?
> >> >>> >>>>>>>>>>
> >> >>> >>>>>>>>>> What would be the next steps on creating the repo?
> >> >>> >>>>>>>>>>
> >> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <
> altay@google.com> wrote:
> >> >>> >>>>>>>>>>>
> >> >>> >>>>>>>>>>> This is great David. Was there any progress on this? Do
> you need help?
> >> >>> >>>>>>>>>>>
> >> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <
> bhulette@google.com> wrote:
> >> >>> >>>>>>>>>>>>
> >> >>> >>>>>>>>>>>> This is cool, thanks!
> >> >>> >>>>>>>>>>>>
> >> >>> >>>>>>>>>>>> We do have a template in apache/beam already, built
> with Maven Archetype [1]. It's what powers the Java quickstart [2]. Could
> we de-dupe these (e.g. reference the GitHub template in the quickstart, or
> co-locate the archetype with the GitHub template)?
> >> >>> >>>>>>>>>>>>
> >> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put this
> somewhere like apache/beam-java-template? I think apache repositories like
> beam-* are allowed.
> >> >>> >>>>>>>>>>>>
> >> >>> >>>>>>>>>>>> Brian
> >> >>> >>>>>>>>>>>>
> >> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
> >> >>> >>>>>>>>>>>> [2]
> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
> >> >>> >>>>>>>>>>>>
> >> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
> dcavazos@google.com> wrote:
> >> >>> >>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>> +Ahmet Altay
> >> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
> >> >>> >>>>>>>>>>>>> +Kenneth Knowles
> >> >>> >>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
> >> >>> >>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
> dcavazos@google.com> wrote:
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> Hi Beam community!
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java project,
> I've been working on a GitHub template containing a minimal Beam Java
> pipeline for people to start with.
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> Link to the GitHub template:
> https://github.com/davidcavazos/beam-java
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
> >> >>> >>>>>>>>>>>>>> Minimal test file
> >> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct
> runner)
> >> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions (around
> 1-2 minutes to run)
> >> >>> >>>>>>>>>>>>>> README with instructions on how to build, run, test,
> and add other runners
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a
> template.
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> Next steps
> >> >>> >>>>>>>>>>>>>>
> >> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy with
> it 🙂
> >> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub account, so
> we need to create an Apache repo to host it
> >> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to
> create a new Beam Java pipeline
>

Re: Beam Java starter project template

Posted by Robert Bradshaw <ro...@google.com>.
On Wed, Feb 2, 2022 at 10:12 AM David Cavazos <dc...@google.com> wrote:
>
> MIT is much more permissive, but I also don't have any problems changing it to the Apache license. In any case, how about we create the following repos?

For these starter projects, we don't want to encumber any users of
these templates with any particular licensing requirements (right?)
and we don't even care about attribution. We want these to be pretty
much as close to public domain as possible. That's not what the Apache
licence does. (If it's even relevant, a good argument could likely be
made for de minimis or fair use, but I think it's best to be explicit
about this.) Perhaps this'd be a good question for Apache legal?

> apache/beam-starter-java
> apache/beam-starter-python
> apache/beam-starter-go
> apache/beam-starter-kotlin
> apache/beam-starter-scala
>
> We'll start by populating the Java one which is the most pressing one and the one that is ready, but the rest should be simpler.
>
> +David Huntsperger, tl;dr: these are minimal starter projects for every language. Once we have Java, Python and Go, it might be a good idea to change the quickstarts to use these instead of the word count. There is already a dedicated word count walkthrough so I think that is already covered.
>
> If we all agree on the repo names, who can help us create them?
>
> On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <ro...@google.com> wrote:
>>
>> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org> wrote:
>> >
>> > Agree with Luke here. "Just git clone and go" is a big part of it.
>> >
>> > But also the question "I simply don't know what one would put in a Python repo, other than a bare setup.py that lists a dependency on apache_beam" is answered by David's initial email and his repo, namely:
>> >
>> >  - GitHub Actions configuration
>> >  - README.md
>> >  - example that already runs
>>
>> OK, fair enough.
>>
>> >  - LICENSE (notably you've got it as MIT but to be part of Apache software it needs to be ASL2)
>>
>> On the topic of licence, it's a bit tricky because one doesn't want to
>> bind the users of such a template as being a derivative work of a
>> too-restrictive licence. The licence of the template itself should
>> generally be very permissive.
>>
>> > On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:
>> >>
>> >> I think for consistency it makes sense for users to be told to check out the git repo for the language of their choice and run it. Some repos will have more or less than others when it comes to necessary setup.
>> >>
>> >> On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <ro...@google.com> wrote:
>> >>>
>> >>> +1 for doing this for Java, as setting up a project there is quite
>> >>> complicated. I simply don't know what one would put in a Python repo
>> >>> than, other than a bare setup.py that lists a dependency on
>> >>> apache_beam. We don't have recommendations on file layout, etc. more
>> >>> than that (though there's plenty of generic advice to be found out
>> >>> there on the topic). I have a hunch go is similar, and javascript
>> >>> would be as well (npm install apache-beam and your package.json file
>> >>> gets updated).
>> >>>
>> >>> On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com> wrote:
>> >>> >
>> >>> > There are several examples already within the Beam repo found in:
>> >>> > https://github.com/apache/beam/tree/master/examples
>> >>> > https://github.com/apache/beam/tree/master/sdks/go/examples
>> >>> > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples
>> >>> >
>> >>> >
>> >>> > On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <sa...@google.com> wrote:
>> >>> >>
>> >>> >> I'd love to do something other than Wordcount just for novelty/freshness but agreed with the suggestion that having an example in each quickstart would be ideal.
>> >>> >>
>> >>> >> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <dh...@google.com> wrote:
>> >>> >>>
>> >>> >>> + 1 to a separate repo for each language.
>> >>> >>>
>> >>> >>> Would it make sense to include the Wordcount example in each repo? I know that makes the repos less minimal, but we could rewrite the quickstarts around these repos instead of the current Wordcount examples. Or maybe we don't need to use the Wordcount example in the quickstarts...
>> >>> >>>
>> >>> >>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <dc...@google.com> wrote:
>> >>> >>>>
>> >>> >>>> I agree with dropping the archetypes. Less maintenance is preferable, and the github repos are more flexible and maintainable.
>> >>> >>>>
>> >>> >>>> How about we create:
>> >>> >>>>
>> >>> >>>> apache/beam-starter-java
>> >>> >>>> apache/beam-starter-python
>> >>> >>>> apache/beam-starter-go
>> >>> >>>>
>> >>> >>>> During our OKR planning, +Keith Malvetti would prefer having repos for all languages. It makes sense for consistency as well.
>> >>> >>>>
>> >>> >>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com> wrote:
>> >>> >>>>>
>> >>> >>>>> As long as we have tags so that people can pull out a specific version of the examples that coincides with a specific SDK version then we could drop the archetypes.
>> >>> >>>>>
>> >>> >>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bh...@google.com> wrote:
>> >>> >>>>>>
>> >>> >>>>>> > Being such minimal examples, I don't expect them to break commonly, but I think it would be good to make sure tests aren't failing when a release is published.
>> >>> >>>>>>
>> >>> >>>>>> Yeah it would be very unfortunate if we discovered a breakage after the release. Agree we should verify RCs (document as part of the release process), or even better, add automation to verify the repo against snapshots. The automation could be nice to have anyway since it provides an example for users to follow if they want to test against snapshots and report issues to us sooner.
>> >>> >>>>>>
>> >>> >>>>>>
>> >>> >>>>>> If we move forward with this can we drop the archetype?
>> >>> >>>>>>
>> >>> >>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>> >>> >>>>>>>
>> >>> >>>>>>> Sounds reasonable.
>> >>> >>>>>>>
>> >>> >>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com> wrote:
>> >>> >>>>>>>>
>> >>> >>>>>>>> I personally like the idea of a separate repo since we can see what a truly minimal project looks like. Having it in the main repo would inherit build file configurations and other settings that would be different from a clean project, so it could be non-trivial to adapt. Also as its own repo, it's easier to clone and modify, or create an instance of the template.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Dependabot can take care of updating the Beam version and other dependencies automatically. Testing is already set up via GitHub actions for every pull request, so it would automatically be tested as soon as there is a new dependency version available.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Being such minimal examples, I don't expect them to break commonly, but I think it would be good to make sure tests aren't failing when a release is published.
>> >>> >>>>>>>>
>> >>> >>>>>>>> I'm okay with having one repo per language, and having all the build systems we want to support for them, as long as we document which files are for which build system. That way there are fewer repos to maintain.
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> The github repo is definitely more flexible than the archetypes but the archetypes have a few conveniences since they are integrated with the apache/beam repo. For example, updates/testing are done at the same time a corresponding change to the main repo is done (like library version updates), and they are released when the SDK is released.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Should these be part of the main repo, or a single starter repo containing all the starters or one per language or one per build system?
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> When should updates to the starter happen?
>> >>> >>>>>>>>> How as a community do we get them to happen (e.g. release manager owns it)?
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com> wrote:
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> We could do the Maven archetype, but that wouldn't work very well for Gradle and SBT users. I think a GitHub template might be the more flexible option, and we could have something similar for other languages as well. Having said that, we could still create a Maven archetype. If someone is familiar with that process, please let me know since I'm not too familiar with Maven and its ecosystem.
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> @Ahmet Altay I think right now we only need to pin down the name of the repo, create it, and move the code there. I was thinking either `apache/beam-java-template` or `apache/beam-java-starter`. What do you think?
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> What would be the next steps on creating the repo?
>> >>> >>>>>>>>>>
>> >>> >>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> This is great David. Was there any progress on this? Do you need help?
>> >>> >>>>>>>>>>>
>> >>> >>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com> wrote:
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> This is cool, thanks!
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> We do have a template in apache/beam already, built with Maven Archetype [1]. It's what powers the Java quickstart [2]. Could we de-dupe these (e.g. reference the GitHub template in the quickstart, or co-locate the archetype with the GitHub template)?
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> As far as creating an Apache repo, would we put this somewhere like apache/beam-java-template? I think apache repositories like beam-* are allowed.
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> Brian
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>> >>> >>>>>>>>>>>> [2] https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>> >>> >>>>>>>>>>>>
>> >>> >>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com> wrote:
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> +Ahmet Altay
>> >>> >>>>>>>>>>>>> +Valentyn Tymofieiev
>> >>> >>>>>>>>>>>>> +Kenneth Knowles
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> Please feel free to include anyone else!
>> >>> >>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <dc...@google.com> wrote:
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> Hi Beam community!
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> To make it easier to create a new Beam Java project, I've been working on a GitHub template containing a minimal Beam Java pipeline for people to start with.
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> Link to the GitHub template: https://github.com/davidcavazos/beam-java
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> So far, here's what the template contains:
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> Minimal "Hello World" Beam pipeline
>> >>> >>>>>>>>>>>>>> Minimal test file
>> >>> >>>>>>>>>>>>>> Build files for Gradle, sbt, and Maven (Direct runner)
>> >>> >>>>>>>>>>>>>> Continuous integration via GitHub actions (around 1-2 minutes to run)
>> >>> >>>>>>>>>>>>>> README with instructions on how to build, run, test, and add other runners
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> It's easy to create a new GitHub repo from a template.
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> Next steps
>> >>> >>>>>>>>>>>>>>
>> >>> >>>>>>>>>>>>>> Some reviewers to make sure everyone is happy with it 🙂
>> >>> >>>>>>>>>>>>>> Right now it lives in my personal GitHub account, so we need to create an Apache repo to host it
>> >>> >>>>>>>>>>>>>> Update/create docs with instructions on how to create a new Beam Java pipeline

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
MIT is much more permissive, but I also don't have any problems changing it
to Apache license. In any case, how about we create the following repos?

   - apache/beam-starter-java
   - apache/beam-starter-python
   - apache/beam-starter-go
   - apache/beam-starter-kotlin
   - apache/beam-starter-scala

We'll start by populating the Java one which is the most pressing one and
the one that is ready, but the rest should be simpler.

+David Huntsperger <dh...@google.com>, tldr; these are minimal
starter projects for every language. Once we have Java, Python and Go, it
might be a good idea to change the quickstarts to use these instead of the
word count. There is already a dedicated word count walkthrough so I think
that is already covered.

If we all agree on the repo names, who can help us create them?

On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <ro...@google.com>
wrote:

> On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org> wrote:
> >
> > Agree with Luke here. "Just git clone and go" is a big part of it.
> >
> > But also the answer to "I simply don't know what one would put in a
> > Python repo than, other than a bare setup.py that lists a dependency on
> > apache_beam" is answered by David's initial email and his repo, namely:
> >
> >  - GitHub Actions configuration
> >  - README.md
> >  - example that already runs
>
> OK, fair enough.
>
> >  - LICENSE (notably you've got it as MIT but to be part of Apache
> > software it needs to be ASL2)
>
> On the topic of licence, it's a bit tricky because one doesn't want to
> bind the users of such a template as being a derivative work of a
> too-restrictive licence. The licence of the template itself should
> generally be very permissive.
>

Re: Beam Java starter project template

Posted by Robert Bradshaw <ro...@google.com>.
On Tue, Jan 18, 2022 at 6:17 AM Kenneth Knowles <ke...@apache.org> wrote:
>
> Agree with Luke here. "Just git clone and go" is a big part of it.
>
> But also the answer to "I simply don't know what one would put in a Python repo than, other than a bare setup.py that lists a dependency on apache_beam" is answered by David's initial email and his repo, namely:
>
>  - GitHub Actions configuration
>  - README.md
>  - example that already runs

OK, fair enough.

>  - LICENSE (notably you've got it as MIT but to be part of Apache software it needs to be ASL2)

On the topic of licence, it's a bit tricky because one doesn't want to
bind the users of such a template as being a derivative work of a
too-restrictive licence. The licence of the template itself should
generally be very permissive.


Re: Beam Java starter project template

Posted by Kenneth Knowles <ke...@apache.org>.
Agree with Luke here. "Just git clone and go" is a big part of it.

But also the answer to "I simply don't know what one would put in a Python
repo than, other than a bare setup.py that lists a dependency on
apache_beam" is answered by David's initial email and his repo, namely:

 - GitHub Actions configuration
 - README.md
 - example that already runs
 - LICENSE (notably you've got it as MIT but to be part of Apache software
it needs to be ASL2)
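
The list above can be read as a checklist for a starter repo. As a rough
sketch only, the snippet below checks a directory for those pieces. The
concrete file names (the workflow path and main.py) are assumptions made for
illustration; the thread names the kinds of files, not their paths.

```python
from pathlib import Path

# Hypothetical file names: the thread lists the kinds of files a starter
# repo should contain, not their exact paths.
STARTER_FILES = [
    ".github/workflows/test.yaml",  # GitHub Actions configuration
    "README.md",
    "main.py",                      # example that already runs
    "LICENSE",
]


def missing_starter_files(root, expected=STARTER_FILES):
    """Return the expected files that do not exist under root, in order."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Report what the current directory lacks, one entry per line.
    for name in missing_starter_files("."):
        print(f"missing: {name}")
```

Run from the root of a candidate starter repo, this prints one line per
missing piece and nothing when all four are present.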

Kenn

On Fri, Jan 14, 2022 at 2:34 PM Luke Cwik <lc...@google.com> wrote:

> I think for consistency it makes sense to users to be told to checkout
> this git repo for the language of your choice and run. Some repos will have
> more/less than others when it comes to setup necessary.

Re: Beam Java starter project template

Posted by Luke Cwik <lc...@google.com>.
I think for consistency it makes sense for users to be told to check out the
git repo for the language of their choice and run it. Some repos will need
more or less setup than others.

On Fri, Jan 14, 2022 at 2:26 PM Robert Bradshaw <ro...@google.com> wrote:

> +1 for doing this for Java, as setting up a project there is quite
> complicated. I simply don't know what one would put in a Python repo,
> other than a bare setup.py that lists a dependency on apache_beam. We
> don't have recommendations on file layout, etc. beyond that (though
> there's plenty of generic advice to be found out there on the topic). I
> have a hunch Go is similar, and JavaScript would be as well (npm install
> apache-beam and your package.json file gets updated).

Re: Beam Java starter project template

Posted by Robert Bradshaw <ro...@google.com>.
+1 for doing this for Java, as setting up a project there is quite
complicated. I simply don't know what one would put in a Python repo,
other than a bare setup.py that lists a dependency on apache_beam. We
don't have recommendations on file layout, etc. beyond that (though
there's plenty of generic advice to be found out there on the topic). I
have a hunch Go is similar, and JavaScript would be as well (npm install
apache-beam and your package.json file gets updated).
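
For reference, the bare setup.py described above could look roughly like the
sketch below. The project name, version, and module name are placeholders
chosen for illustration and are not taken from any existing repo. Note that
the PyPI distribution is named apache-beam, while the import name is
apache_beam.

```python
# setup.py: hypothetical minimal packaging file for a Beam Python project.
# All metadata values below are placeholders, not from a real repository.
from setuptools import setup

setup(
    name="my-beam-starter",            # placeholder project name
    version="0.1.0",                   # placeholder version
    py_modules=["main"],               # assumes the pipeline lives in main.py
    install_requires=["apache-beam"],  # the single dependency discussed above
)
```

With a file like this in place, running "pip install ." in the project
directory would pull in the Beam SDK along with the project itself.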

On Fri, Jan 14, 2022 at 2:17 PM Luke Cwik <lc...@google.com> wrote:
>
> There are several examples already within the Beam repo found in:
> https://github.com/apache/beam/tree/master/examples
> https://github.com/apache/beam/tree/master/sdks/go/examples
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples

Re: Beam Java starter project template

Posted by Luke Cwik <lc...@google.com>.
There are several examples already within the Beam repo found in:
https://github.com/apache/beam/tree/master/examples
https://github.com/apache/beam/tree/master/sdks/go/examples
https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples


On Fri, Jan 14, 2022 at 11:07 AM Sachin Agarwal <sa...@google.com> wrote:

> I'd love to do something other than Wordcount just for
> novelty/freshness but agreed with the suggestion that having an example in
> each quickstart would be ideal.
>
> On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <
> dhuntsperger@google.com> wrote:
>
>> +1 to a separate repo for each language.
>>
>> Would it make sense to include the Wordcount example in each repo? I know
>> that makes the repos less minimal, but we could rewrite the quickstarts
>> around these repos instead of the current Wordcount examples. Or maybe we
>> don't need to use the Wordcount example in the quickstarts...
>>
>> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> I agree with dropping the archetypes. Less maintenance is preferable,
>>> and the GitHub repos are more flexible and maintainable.
>>>
>>> How about we create:
>>>
>>>    - apache/beam-starter-java
>>>    - apache/beam-starter-python
>>>    - apache/beam-starter-go
>>>
>>> During our OKR planning, +Keith Malvetti <km...@google.com> would
>>> prefer having repos for all languages. It makes sense for consistency as
>>> well.
>>>
>>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> As long as we have tags so that people can pull out a specific version
>>>> of the examples that coincides with a specific SDK version then we could
>>>> drop the archetypes.
>>>>
>>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bh...@google.com>
>>>> wrote:
>>>>
>>>>> > Being such minimal examples, I don't expect them to break commonly,
>>>>> but I think it would be good to make sure tests aren't failing when a
>>>>> release is published.
>>>>>
>>>>> Yeah it would be very unfortunate if we discovered a breakage after
>>>>> the release. Agree we should verify RCs (document as part of the release
>>>>> process), or even better, add automation to verify the repo against
>>>>> snapshots. The automation could be nice to have anyway since it provides an
>>>>> example for users to follow if they want to test against snapshots and
>>>>> report issues to us sooner.
>>>>>
>>>>>
>>>>> If we move forward with this can we drop the archetype?
>>>>>
>>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> Sounds reasonable.
>>>>>>
>>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I personally like the idea of a separate repo since we can see what a
>>>>>>> truly minimal project looks like. Having it in the main repo would inherit
>>>>>>> build file configurations and other settings that would be different from a
>>>>>>> clean project, so it could be non-trivial to adapt. Also, as its own repo,
>>>>>>> it's easier to clone and modify, or create an instance of the template.
>>>>>>>
>>>>>>> Dependabot
>>>>>>> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
>>>>>>> can take care of updating the Beam version and other dependencies
>>>>>>> automatically. Testing is already set up via GitHub actions for every pull
>>>>>>> request, so it would automatically be tested as soon as there is a new
>>>>>>> dependency version available.
>>>>>>>
>>>>>>> Being such minimal examples, I don't expect them to break commonly,
>>>>>>> but I think it would be good to make sure tests aren't failing when a
>>>>>>> release is published.
>>>>>>>
>>>>>>> I'm okay with having one repo per language, and having all the build
>>>>>>> systems we want to support for them, as long as we document which files
>>>>>>> are for which build system. That way there are fewer repos to maintain.
>>>>>>>
>>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>>>>>
>>>>>>>> The GitHub repo is definitely more flexible than the archetypes, but
>>>>>>>> the archetypes have a few conveniences since they are integrated with
>>>>>>>> the apache/beam repo. For example, updates/testing are done at the same
>>>>>>>> time a corresponding change to the main repo is done (like library
>>>>>>>> version updates), and they are released when the SDK is released.
>>>>>>>>
>>>>>>>> Should these be part of the main repo, or a single starter repo
>>>>>>>> containing all the starters or one per language or one per build system?
>>>>>>>>
>>>>>>>> When should updates to the starter happen?
>>>>>>>> How as a community do we get them to happen (e.g. release manager
>>>>>>>> owns it)?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> We could do the Maven archetype, but that wouldn't work very well
>>>>>>>>> for Gradle and SBT users. I think a GitHub template might be the more
>>>>>>>>> flexible option, and we could have something similar for other languages as
>>>>>>>>> well. Having said that, we could still create a Maven archetype. If someone
>>>>>>>>> is familiar with that process, please let me know since I'm not too
>>>>>>>>> familiar with Maven and its ecosystem.
>>>>>>>>>
>>>>>>>>> @Ahmet Altay <al...@google.com> I think right now we only need to
>>>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> What would be the next steps on creating the repo?
>>>>>>>>>
>>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> This is great David. Was there any progress on this? Do you need
>>>>>>>>>> help?
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>>
>>>>>>>>>>> We do have a template in apache/beam already, built with Maven
>>>>>>>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>>>>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>>>>>>>> co-locate the archetype with the GitHub template)?
>>>>>>>>>>>
>>>>>>>>>>> As far as creating an Apache repo, would we put this somewhere
>>>>>>>>>>> like apache/beam-java-template? I think apache repositories like beam-* are
>>>>>>>>>>> allowed.
>>>>>>>>>>>
>>>>>>>>>>> Brian
>>>>>>>>>>>
>>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>>>>>>> [2]
>>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +Ahmet Altay <al...@google.com>
>>>>>>>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>>>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>>>>>>>
>>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>>
>>>>>>>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>>>>>>>> people to start with.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Link to the GitHub template*:
>>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>>
>>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>>    - Minimal test file
>>>>>>>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes
>>>>>>>>>>>>>    to run)
>>>>>>>>>>>>>    - README with instructions on how to build, run, test, and
>>>>>>>>>>>>>    add other runners
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>>>>>>>> .
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Next steps*
>>>>>>>>>>>>>
>>>>>>>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>>>>>>>    - Right now it lives in my personal GitHub account, so we
>>>>>>>>>>>>>    need to create an Apache repo to host it
>>>>>>>>>>>>>    - Update/create docs with instructions on how to create a
>>>>>>>>>>>>>    new Beam Java pipeline
>>>>>>>>>>>>>
>>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by Sachin Agarwal <sa...@google.com>.
I'd love to do something other than WordCount, just for
novelty/freshness, but I agree with the suggestion that having an example in
each quickstart would be ideal.

On Fri, Jan 14, 2022 at 11:06 AM David Huntsperger <dh...@google.com>
wrote:

> +1 to a separate repo for each language.
>
> Would it make sense to include the WordCount example in each repo? I know
> that makes the repos less minimal, but we could rewrite the quickstarts
> around these repos instead of the current WordCount examples. Or maybe we
> don't need to use the WordCount example in the quickstarts...
>
> On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <dc...@google.com> wrote:
>
>> I agree with dropping the archetypes. Less maintenance is preferable, and
>> the GitHub repos are more flexible and maintainable.
>>
>> How about we create:
>>
>>    - apache/beam-starter-java
>>    - apache/beam-starter-python
>>    - apache/beam-starter-go
>>
>> During our OKR planning, +Keith Malvetti <km...@google.com> would
>> prefer having repos for all languages. It makes sense for consistency as
>> well.
>>
>> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com> wrote:
>>
>>> As long as we have tags so that people can pull out a specific version
>>> of the examples that coincides with a specific SDK version then we could
>>> drop the archetypes.
>>>
>>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bh...@google.com>
>>> wrote:
>>>
>>>> > Being such minimal examples, I don't expect them to break commonly,
>>>> > but I think it would be good to make sure tests aren't failing when a
>>>> > release is published.
>>>>
>>>> Yeah it would be very unfortunate if we discovered a breakage after the
>>>> release. Agree we should verify RCs (document as part of the release
>>>> process), or even better, add automation to verify the repo against
>>>> snapshots. The automation could be nice to have anyway since it provides an
>>>> example for users to follow if they want to test against snapshots and
>>>> report issues to us sooner.
>>>>
>>>>
>>>> If we move forward with this can we drop the archetype?
>>>>
>>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>>>>
>>>>> Sounds reasonable.
>>>>>
>>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> I personally like the idea of a separate repo since we can see what a
>>>>>> truly minimal project looks like. Having it in the main repo would inherit
>>>>>> build file configurations and other settings that would be different from a
>>>>>> clean project, so it could be non-trivial to adapt. Also as its own repo,
>>>>>> it's easier to clone and modify, or create an instance of the template.
>>>>>>
>>>>>> Dependabot
>>>>>> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
>>>>>> can take care of updating the Beam version and other dependencies
>>>>>> automatically. Testing is already set up via GitHub Actions for every pull
>>>>>> request, so it would automatically be tested as soon as there is a new
>>>>>> dependency version available.
>>>>>>
>>>>>> Being such minimal examples, I don't expect them to break commonly,
>>>>>> but I think it would be good to make sure tests aren't failing when a
>>>>>> release is published.
>>>>>>
>>>>>> I'm okay with having one repo per language, and having all the build
>>>>>> systems we want to support for them, as long as we document which files are
>>>>>> for which build system. That way there are fewer repos to maintain.
>>>>>>
>>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>>>>
>>>>>>> The GitHub repo is definitely more flexible than the archetypes, but
>>>>>>> the archetypes have a few conveniences since they are integrated with
>>>>>>> the apache/beam repo. For example, updates/testing are done at the same
>>>>>>> time as a corresponding change to the main repo (like library version
>>>>>>> updates), and they are released when the SDK is released.
>>>>>>>
>>>>>>> Should these be part of the main repo, or a single starter repo
>>>>>>> containing all the starters or one per language or one per build system?
>>>>>>>
>>>>>>> When should updates to the starter happen?
>>>>>>> How as a community do we get them to happen (e.g. release manager
>>>>>>> owns it)?
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> We could do the Maven archetype, but that wouldn't work very well
>>>>>>>> for Gradle and SBT users. I think a GitHub template might be the more
>>>>>>>> flexible option, and we could have something similar for other languages as
>>>>>>>> well. Having said that, we could still create a Maven archetype. If someone
>>>>>>>> is familiar with that process, please let me know since I'm not too
>>>>>>>> familiar with Maven and its ecosystem.
>>>>>>>>
>>>>>>>> @Ahmet Altay <al...@google.com> I think right now we only need to
>>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>>> What do you think?
>>>>>>>>
>>>>>>>> What would be the next steps on creating the repo?
>>>>>>>>
>>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> This is great David. Was there any progress on this? Do you need
>>>>>>>>> help?
>>>>>>>>>
>>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> This is cool, thanks!
>>>>>>>>>>
>>>>>>>>>> We do have a template in apache/beam already, built with Maven
>>>>>>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>>>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>>>>>>> co-locate the archetype with the GitHub template)?
>>>>>>>>>>
>>>>>>>>>> As far as creating an Apache repo, would we put this somewhere
>>>>>>>>>> like apache/beam-java-template? I think apache repositories like beam-* are
>>>>>>>>>> allowed.
>>>>>>>>>>
>>>>>>>>>> Brian
>>>>>>>>>>
>>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>>>>>> [2]
>>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> +Ahmet Altay <al...@google.com>
>>>>>>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>>>>>>
>>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>>
>>>>>>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>>>>>>> people to start with.
>>>>>>>>>>>>
>>>>>>>>>>>> *Link to the GitHub template*:
>>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>>
>>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>>>>
>>>>>>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>>>>>>    - Minimal test file
>>>>>>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes
>>>>>>>>>>>>    to run)
>>>>>>>>>>>>    - README with instructions on how to build, run, test, and
>>>>>>>>>>>>    add other runners
>>>>>>>>>>>>
>>>>>>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>>>>>>> .
>>>>>>>>>>>>
>>>>>>>>>>>> *Next steps*
>>>>>>>>>>>>
>>>>>>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>>>>>>    - Right now it lives in my personal GitHub account, so we
>>>>>>>>>>>>    need to create an Apache repo to host it
>>>>>>>>>>>>    - Update/create docs with instructions on how to create a
>>>>>>>>>>>>    new Beam Java pipeline
>>>>>>>>>>>>
>>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Huntsperger <dh...@google.com>.
+1 to a separate repo for each language.

Would it make sense to include the WordCount example in each repo? I know
that makes the repos less minimal, but we could rewrite the quickstarts
around these repos instead of the current WordCount examples. Or maybe we
don't need to use the WordCount example in the quickstarts...

On Wed, Jan 12, 2022 at 1:54 PM David Cavazos <dc...@google.com> wrote:

> I agree with dropping the archetypes. Less maintenance is preferable, and
> the GitHub repos are more flexible and maintainable.
>
> How about we create:
>
>    - apache/beam-starter-java
>    - apache/beam-starter-python
>    - apache/beam-starter-go
>
> During our OKR planning, +Keith Malvetti <km...@google.com> would
> prefer having repos for all languages. It makes sense for consistency as
> well.
>
> On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com> wrote:
>
>> As long as we have tags so that people can pull out a specific version of
>> the examples that coincides with a specific SDK version then we could drop
>> the archetypes.
>>
>> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bh...@google.com>
>> wrote:
>>
>>> > Being such minimal examples, I don't expect them to break commonly,
>>> > but I think it would be good to make sure tests aren't failing when a
>>> > release is published.
>>>
>>> Yeah it would be very unfortunate if we discovered a breakage after the
>>> release. Agree we should verify RCs (document as part of the release
>>> process), or even better, add automation to verify the repo against
>>> snapshots. The automation could be nice to have anyway since it provides an
>>> example for users to follow if they want to test against snapshots and
>>> report issues to us sooner.
>>>
>>>
>>> If we move forward with this can we drop the archetype?
>>>
>>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> Sounds reasonable.
>>>>
>>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> I personally like the idea of a separate repo since we can see what a
>>>>> truly minimal project looks like. Having it in the main repo would inherit
>>>>> build file configurations and other settings that would be different from a
>>>>> clean project, so it could be non-trivial to adapt. Also as its own repo,
>>>>> it's easier to clone and modify, or create an instance of the template.
>>>>>
>>>>> Dependabot
>>>>> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
>>>>> can take care of updating the Beam version and other dependencies
>>>>> automatically. Testing is already set up via GitHub Actions for every pull
>>>>> request, so it would automatically be tested as soon as there is a new
>>>>> dependency version available.
>>>>>
>>>>> Being such minimal examples, I don't expect them to break commonly,
>>>>> but I think it would be good to make sure tests aren't failing when a
>>>>> release is published.
>>>>>
>>>>> I'm okay with having one repo per language, and having all the build
>>>>> systems we want to support for them, as long as we document which files are
>>>>> for which build system. That way there are fewer repos to maintain.
>>>>>
>>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> The GitHub repo is definitely more flexible than the archetypes, but
>>>>>> the archetypes have a few conveniences since they are integrated with
>>>>>> the apache/beam repo. For example, updates/testing are done at the same
>>>>>> time as a corresponding change to the main repo (like library version
>>>>>> updates), and they are released when the SDK is released.
>>>>>>
>>>>>> Should these be part of the main repo, or a single starter repo
>>>>>> containing all the starters or one per language or one per build system?
>>>>>>
>>>>>> When should updates to the starter happen?
>>>>>> How as a community do we get them to happen (e.g. release manager
>>>>>> owns it)?
>>>>>>
>>>>>>
>>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> We could do the Maven archetype, but that wouldn't work very well
>>>>>>> for Gradle and SBT users. I think a GitHub template might be the more
>>>>>>> flexible option, and we could have something similar for other languages as
>>>>>>> well. Having said that, we could still create a Maven archetype. If someone
>>>>>>> is familiar with that process, please let me know since I'm not too
>>>>>>> familiar with Maven and its ecosystem.
>>>>>>>
>>>>>>> @Ahmet Altay <al...@google.com> I think right now we only need to
>>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>>> What do you think?
>>>>>>>
>>>>>>> What would be the next steps on creating the repo?
>>>>>>>
>>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> This is great David. Was there any progress on this? Do you need
>>>>>>>> help?
>>>>>>>>
>>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> This is cool, thanks!
>>>>>>>>>
>>>>>>>>> We do have a template in apache/beam already, built with Maven
>>>>>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>>>>>> co-locate the archetype with the GitHub template)?
>>>>>>>>>
>>>>>>>>> As far as creating an Apache repo, would we put this somewhere
>>>>>>>>> like apache/beam-java-template? I think apache repositories like beam-* are
>>>>>>>>> allowed.
>>>>>>>>>
>>>>>>>>> Brian
>>>>>>>>>
>>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>>>>> [2]
>>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>>
>>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> +Ahmet Altay <al...@google.com>
>>>>>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>>>>>
>>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Beam community!
>>>>>>>>>>>
>>>>>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>>>>>> people to start with.
>>>>>>>>>>>
>>>>>>>>>>> *Link to the GitHub template*:
>>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>>
>>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>>>
>>>>>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>>>>>    - Minimal test file
>>>>>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes to
>>>>>>>>>>>    run)
>>>>>>>>>>>    - README with instructions on how to build, run, test, and
>>>>>>>>>>>    add other runners
>>>>>>>>>>>
>>>>>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>> *Next steps*
>>>>>>>>>>>
>>>>>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>>>>>    - Right now it lives in my personal GitHub account, so we
>>>>>>>>>>>    need to create an Apache repo to host it
>>>>>>>>>>>    - Update/create docs with instructions on how to create a
>>>>>>>>>>>    new Beam Java pipeline
>>>>>>>>>>>
>>>>>>>>>>>

Re: Beam Java starter project template

Posted by David Cavazos <dc...@google.com>.
I agree with dropping the archetypes. Less maintenance is preferable, and
the GitHub repos are more flexible and maintainable.

How about we create:

   - apache/beam-starter-java
   - apache/beam-starter-python
   - apache/beam-starter-go

During our OKR planning, +Keith Malvetti <km...@google.com> would prefer
having repos for all languages. It makes sense for consistency as well.

On Mon, Jan 10, 2022 at 5:14 PM Luke Cwik <lc...@google.com> wrote:

> As long as we have tags so that people can pull out a specific version of
> the examples that coincides with a specific SDK version then we could drop
> the archetypes.
>
> On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bh...@google.com> wrote:
>
>> > Being such minimal examples, I don't expect them to break commonly, but
>> > I think it would be good to make sure tests aren't failing when a release
>> > is published.
>>
>> Yeah it would be very unfortunate if we discovered a breakage after the
>> release. Agree we should verify RCs (document as part of the release
>> process), or even better, add automation to verify the repo against
>> snapshots. The automation could be nice to have anyway since it provides an
>> example for users to follow if they want to test against snapshots and
>> report issues to us sooner.
>>
>>
>> If we move forward with this can we drop the archetype?
>>
>> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>>
>>> Sounds reasonable.
>>>
>>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> I personally like the idea of a separate repo since we can see what a
>>>> truly minimal project looks like. Having it in the main repo would inherit
>>>> build file configurations and other settings that would be different from a
>>>> clean project, so it could be non-trivial to adapt. Also as its own repo,
>>>> it's easier to clone and modify, or create an instance of the template.
>>>>
>>>> Dependabot
>>>> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
>>>> can take care of updating the Beam version and other dependencies
>>>> automatically. Testing is already set up via GitHub Actions for every pull
>>>> request, so it would automatically be tested as soon as there is a new
>>>> dependency version available.
>>>>
>>>> Being such minimal examples, I don't expect them to break commonly, but
>>>> I think it would be good to make sure tests aren't failing when a release
>>>> is published.
>>>>
>>>> I'm okay with having one repo per language, and having all the build
>>>> systems we want to support for them, as long as we document which files are
>>>> for which build system. That way there are fewer repos to maintain.
>>>>
>>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>>
>>>>> The GitHub repo is definitely more flexible than the archetypes, but
>>>>> the archetypes have a few conveniences since they are integrated with
>>>>> the apache/beam repo. For example, updates/testing are done at the same
>>>>> time as a corresponding change to the main repo (like library version
>>>>> updates), and they are released when the SDK is released.
>>>>>
>>>>> Should these be part of the main repo, or a single starter repo
>>>>> containing all the starters or one per language or one per build system?
>>>>>
>>>>> When should updates to the starter happen?
>>>>> How as a community do we get them to happen (e.g. release manager owns
>>>>> it)?
>>>>>
>>>>>
>>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> We could do the Maven archetype, but that wouldn't work very well for
>>>>>> Gradle and SBT users. I think a GitHub template might be the more flexible
>>>>>> option, and we could have something similar for other languages as well.
>>>>>> Having said that, we could still create a Maven archetype. If someone is
>>>>>> familiar with that process, please let me know since I'm not too familiar
>>>>>> with Maven and its ecosystem.
>>>>>>
>>>>>> @Ahmet Altay <al...@google.com> I think right now we only need to
>>>>>> pin down the name of the repo, create it, and move the code there. I was
>>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>>> What do you think?
>>>>>>
>>>>>> What would be the next steps on creating the repo?
>>>>>>
>>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> This is great David. Was there any progress on this? Do you need
>>>>>>> help?
>>>>>>>
>>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> This is cool, thanks!
>>>>>>>>
>>>>>>>> We do have a template in apache/beam already, built with Maven
>>>>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>>>>> co-locate the archetype with the GitHub template)?
>>>>>>>>
>>>>>>>> As far as creating an Apache repo, would we put this somewhere like
>>>>>>>> apache/beam-java-template? I think apache repositories like beam-* are
>>>>>>>> allowed.
>>>>>>>>
>>>>>>>> Brian
>>>>>>>>
>>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>>>> [2]
>>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>>
>>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +Ahmet Altay <al...@google.com>
>>>>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>>>>
>>>>>>>>> Please feel free to include anyone else!
>>>>>>>>>
>>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <
>>>>>>>>> dcavazos@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Beam community!
>>>>>>>>>>
>>>>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>>>>> people to start with.
>>>>>>>>>>
>>>>>>>>>> *Link to the GitHub template*:
>>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>>
>>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>>
>>>>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>>>>    - Minimal test file
>>>>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes to
>>>>>>>>>>    run)
>>>>>>>>>>    - README with instructions on how to build, run, test, and
>>>>>>>>>>    add other runners
>>>>>>>>>>
>>>>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>>>>> .
>>>>>>>>>>
>>>>>>>>>> *Next steps*
>>>>>>>>>>
>>>>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>>>>    - Right now it lives in my personal GitHub account, so we
>>>>>>>>>>    need to create an Apache repo to host it
>>>>>>>>>>    - Update/create docs with instructions on how to create a new
>>>>>>>>>>    Beam Java pipeline
>>>>>>>>>>
>>>>>>>>>>

Re: Beam Java starter project template

Posted by Luke Cwik <lc...@google.com>.
As long as we have tags so that people can pull out a specific version of
the examples that coincides with a specific SDK version then we could drop
the archetypes.
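
To make the tagging idea concrete, here is a minimal sketch in Python of how
someone could pick the starter tag that goes with a given SDK version. It
assumes a hypothetical tag scheme of `v<SDK version>` (for example `v2.35.0`);
the thread has not settled on any naming scheme, and the function names are
made up for illustration only.

```python
def parse_version(text):
    # Turn "v2.35.0" or "2.35.0" into a comparable tuple such as (2, 35, 0).
    return tuple(int(part) for part in text.lstrip("v").split("."))


def starter_tag_for_sdk(sdk_version, tags):
    # Hypothetical helper: return the newest tag that is not newer than the
    # requested SDK version, or None when no tag is old enough.
    wanted = parse_version(sdk_version)
    candidates = [tag for tag in tags if parse_version(tag) <= wanted]
    return max(candidates, key=parse_version) if candidates else None


# Example tag list; real starter repos may use a different scheme.
tags = ["v2.33.0", "v2.34.0", "v2.35.0"]
print(starter_tag_for_sdk("2.34.0", tags))  # v2.34.0
```

With a tag in hand, a user could run `git checkout <tag>` in their clone to
get the examples as they were for that SDK version.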

On Mon, Jan 10, 2022 at 4:09 PM Brian Hulette <bh...@google.com> wrote:

> > Being such minimal examples, I don't expect them to break commonly, but
> > I think it would be good to make sure tests aren't failing when a release
> > is published.
>
> Yeah it would be very unfortunate if we discovered a breakage after the
> release. Agree we should verify RCs (document as part of the release
> process), or even better, add automation to verify the repo against
> snapshots. The automation could be nice to have anyway since it provides an
> example for users to follow if they want to test against snapshots and
> report issues to us sooner.
>
>
> If we move forward with this can we drop the archetype?
>
> On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:
>
>> Sounds reasonable.
>>
>> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> I personally like the idea of a separate repo since we can see what a
>>> truly minimal project looks like. Having it in the main repo would inherit
>>> build file configurations and other settings that would be different from a
>>> clean project, so it could be non-trivial to adapt. Also as its own repo,
>>> it's easier to clone and modify, or create an instance of the template.
>>>
>>> Dependabot
>>> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
>>> can take care of updating the Beam version and other dependencies
>>> automatically. Testing is already set up via GitHub Actions for every pull
>>> request, so it would automatically be tested as soon as there is a new
>>> dependency version available.
>>>
>>> Being such minimal examples, I don't expect them to break commonly, but
>>> I think it would be good to make sure tests aren't failing when a release
>>> is published.
>>>
>>> I'm okay with having one repo per language, and having all the build
>>> systems we want to support for them, as long as we document which files are
>>> for which build system. That way there are fewer repos to maintain.
>>>
>>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> The GitHub repo is definitely more flexible than the archetypes, but the
>>>> archetypes have a few conveniences since they are integrated with the
>>>> apache/beam repo. For example, updates/testing are done at the same time
>>>> as a corresponding change to the main repo (like library version
>>>> updates), and they are released when the SDK is released.
>>>>
>>>> Should these be part of the main repo, or a single starter repo
>>>> containing all the starters or one per language or one per build system?
>>>>
>>>> When should updates to the starter happen?
>>>> How as a community do we get them to happen (e.g. release manager owns
>>>> it)?
>>>>
>>>>
>>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>>>> wrote:
>>>>
>>>>> We could do the Maven archetype, but that wouldn't work very well for
>>>>> Gradle and SBT users. I think a GitHub template might be the more flexible
>>>>> option, and we could have something similar for other languages as well.
>>>>> Having said that, we could still create a Maven archetype. If someone is
>>>>> familiar with that process, please let me know since I'm not too familiar
>>>>> with Maven and its ecosystem.
>>>>>
>>>>> @Ahmet Altay <al...@google.com> I think right now we only need to pin
>>>>> down the name of the repo, create it, and move the code there. I was
>>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>>> What do you think?
>>>>>
>>>>> What would be the next steps on creating the repo?
>>>>>
>>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> This is great David. Was there any progress on this? Do you need help?
>>>>>>
>>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> This is cool, thanks!
>>>>>>>
>>>>>>> We do have a template in apache/beam already, built with Maven
>>>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>>>> co-locate the archetype with the GitHub template)?
>>>>>>>
>>>>>>> As far as creating an Apache repo, would we put this somewhere like
>>>>>>> apache/beam-java-template? I think apache repositories like beam-* are
>>>>>>> allowed.
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>>> [2]
>>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>>
>>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +Ahmet Altay <al...@google.com>
>>>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>>>
>>>>>>>> Please feel free to include anyone else!
>>>>>>>>
>>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <dc...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Beam community!
>>>>>>>>>
>>>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>>>> people to start with.
>>>>>>>>>
>>>>>>>>> *Link to the GitHub template*:
>>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>>
>>>>>>>>> So far, here's what the template contains:
>>>>>>>>>
>>>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>>>    - Minimal test file
>>>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes to
>>>>>>>>>    run)
>>>>>>>>>    - README with instructions on how to build, run, test, and add
>>>>>>>>>    other runners
>>>>>>>>>
>>>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>> *Next steps*
>>>>>>>>>
>>>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>>>    - Right now it lives in my personal GitHub account, so we need
>>>>>>>>>    to create an Apache repo to host it
>>>>>>>>>    - Update/create docs with instructions on how to create a new
>>>>>>>>>    Beam Java pipeline
>>>>>>>>>
>>>>>>>>>

Re: Beam Java starter project template

Posted by Brian Hulette <bh...@google.com>.
> Being such minimal examples, I don't expect them to break commonly, but I
> think it would be good to make sure tests aren't failing when a release is
> published.

Yeah, it would be very unfortunate if we discovered a breakage after the
release. I agree we should verify RCs (and document that as part of the
release process) or, even better, add automation to verify the repo against
snapshots. The automation would be nice to have anyway, since it gives users
an example to follow if they want to test against snapshots and report
issues to us sooner.
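
To illustrate the snapshot automation idea, here is a rough sketch of a
scheduled GitHub Actions workflow. It is not taken from the template repo:
the file name, the `beamVersion` Gradle property, the presence of a Gradle
wrapper, and the `X.Y.Z-SNAPSHOT` placeholder are all assumptions, and the
build file would also need to resolve artifacts from the Apache snapshots
repository for this to work.

```yaml
# .github/workflows/beam-snapshot.yml (hypothetical file name)
name: Test against Beam snapshot

on:
  schedule:
    - cron: "0 6 * * *"  # once a day at 06:00 UTC
  workflow_dispatch:      # also allow manual runs

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "11"
      # Assumption: build.gradle reads a `beamVersion` property and declares
      # the Apache snapshots repository. Replace the placeholder version.
      - run: ./gradlew test -PbeamVersion=X.Y.Z-SNAPSHOT
```

A failing scheduled run would then flag a breakage before the release is
published, and users could copy the same workflow to test their own
pipelines against snapshots.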


If we move forward with this, can we drop the archetype?

On Fri, Jan 7, 2022 at 3:54 PM Luke Cwik <lc...@google.com> wrote:

> Sounds reasonable.
>
> On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com> wrote:
>
>> I personally like the idea of a separate repo since we can see what a true
>> minimal project looks like. In the main repo, it would inherit build file
>> configurations and other settings that differ from a clean project, so it
>> could be non-trivial to adapt. Also, as its own repo, it's easier to clone
>> and modify, or to create an instance of the template.
>>
>> Dependabot
>> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
>> can take care of updating the Beam version and other dependencies
>> automatically. Testing is already set up via GitHub Actions for every pull
>> request, so it would automatically be tested as soon as there is a new
>> dependency version available.
>>
>> Being such minimal examples, I don't expect them to break commonly, but I
>> think it would be good to make sure tests aren't failing when a release is
>> published.
>>
>> I'm okay with having one repo per language, each covering all the build
>> systems we want to support, as long as we document which files are for
>> which build system. That way there are fewer repos to maintain.
>>
>> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>>
>>> The GitHub repo is definitely more flexible than the archetypes, but the
>>> archetypes have a few conveniences since they are integrated with the
>>> apache/beam repo. For example, updates and testing are done at the same
>>> time as the corresponding change to the main repo (like library version
>>> updates), and they are released when the SDK is released.
>>>
>>> Should these be part of the main repo, or a single starter repo
>>> containing all the starters or one per language or one per build system?
>>>
>>> When should updates to the starter happen?
>>> How as a community do we get them to happen (e.g. release manager owns
>>> it)?
>>>
>>>
>>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>>> wrote:
>>>
>>>> We could do the Maven archetype, but that wouldn't work very well for
>>>> Gradle and SBT users. I think a GitHub template might be the more flexible
>>>> option, and we could have something similar for other languages as well.
>>>> Having said that, we could still create a Maven archetype. If someone is
>>>> familiar with that process, please let me know since I'm not too familiar
>>>> with Maven and its ecosystem.
>>>>
>>>> @Ahmet Altay <al...@google.com> I think right now we only need to pin
>>>> down the name of the repo, create it, and move the code there. I was
>>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>>> What do you think?
>>>>
>>>> What would be the next steps on creating the repo?
>>>>
>>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> This is great David. Was there any progress on this? Do you need help?
>>>>>
>>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>>> wrote:
>>>>>
>>>>>> This is cool, thanks!
>>>>>>
>>>>>> We do have a template in apache/beam already, built with Maven
>>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>>> co-locate the archetype with the GitHub template)?
>>>>>>
>>>>>> As far as creating an Apache repo, would we put this somewhere like
>>>>>> apache/beam-java-template? I think apache repositories like beam-* are
>>>>>> allowed.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>>> [2]
>>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>>
>>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +Ahmet Altay <al...@google.com>
>>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>>
>>>>>>> Please feel free to include anyone else!
>>>>>>>
>>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <dc...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Beam community!
>>>>>>>>
>>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>>> people to start with.
>>>>>>>>
>>>>>>>> *Link to the GitHub template*:
>>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>>
>>>>>>>> So far, here's what the template contains:
>>>>>>>>
>>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>>    - Minimal test file
>>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes to
>>>>>>>>    run)
>>>>>>>>    - README with instructions on how to build, run, test, and add
>>>>>>>>    other runners
>>>>>>>>
>>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>>> .
>>>>>>>>
>>>>>>>> *Next steps*
>>>>>>>>
>>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>>    - Right now it lives in my personal GitHub account, so we need
>>>>>>>>    to create an Apache repo to host it
>>>>>>>>    - Update/create docs with instructions on how to create a new
>>>>>>>>    Beam Java pipeline
>>>>>>>>
>>>>>>>>

Re: Beam Java starter project template

Posted by Luke Cwik <lc...@google.com>.
Sounds reasonable.

On Wed, Jan 5, 2022 at 12:47 PM David Cavazos <dc...@google.com> wrote:

> I personally like the idea of a separate repo since we can see what a true
> minimal project looks like. In the main repo, it would inherit build file
> configurations and other settings that differ from a clean project, so it
> could be non-trivial to adapt. Also, as its own repo, it's easier to clone
> and modify, or to create an instance of the template.
>
> Dependabot
> <https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/about-dependabot-version-updates>
> can take care of updating the Beam version and other dependencies
> automatically. Testing is already set up via GitHub Actions for every pull
> request, so it would automatically be tested as soon as there is a new
> dependency version available.
>
> Being such minimal examples, I don't expect them to break commonly, but I
> think it would be good to make sure tests aren't failing when a release is
> published.
>
> I'm okay with having one repo per language, each covering all the build
> systems we want to support, as long as we document which files are for
> which build system. That way there are fewer repos to maintain.
>
> On Mon, Dec 13, 2021 at 9:25 AM Luke Cwik <lc...@google.com> wrote:
>
>> The GitHub repo is definitely more flexible than the archetypes, but the
>> archetypes have a few conveniences since they are integrated with the
>> apache/beam repo. For example, updates and testing are done at the same
>> time as the corresponding change to the main repo (like library version
>> updates), and they are released when the SDK is released.
>>
>> Should these be part of the main repo, or a single starter repo
>> containing all the starters or one per language or one per build system?
>>
>> When should updates to the starter happen?
>> How as a community do we get them to happen (e.g. release manager owns
>> it)?
>>
>>
>> On Sun, Dec 12, 2021 at 4:06 PM David Cavazos <dc...@google.com>
>> wrote:
>>
>>> We could do the Maven archetype, but that wouldn't work very well for
>>> Gradle and SBT users. I think a GitHub template might be the more flexible
>>> option, and we could have something similar for other languages as well.
>>> Having said that, we could still create a Maven archetype. If someone is
>>> familiar with that process, please let me know since I'm not too familiar
>>> with Maven and its ecosystem.
>>>
>>> @Ahmet Altay <al...@google.com> I think right now we only need to pin
>>> down the name of the repo, create it, and move the code there. I was
>>> thinking either `apache/beam-java-template` or `apache/beam-java-starter`.
>>> What do you think?
>>>
>>> What would be the next steps on creating the repo?
>>>
>>> On Thu, Dec 9, 2021 at 11:09 AM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> This is great David. Was there any progress on this? Do you need help?
>>>>
>>>> On Wed, Dec 1, 2021 at 3:54 PM Brian Hulette <bh...@google.com>
>>>> wrote:
>>>>
>>>>> This is cool, thanks!
>>>>>
>>>>> We do have a template in apache/beam already, built with Maven
>>>>> Archetype [1]. It's what powers the Java quickstart [2]. Could we
>>>>> de-dupe these (e.g. reference the GitHub template in the quickstart, or
>>>>> co-locate the archetype with the GitHub template)?
>>>>>
>>>>> As far as creating an Apache repo, would we put this somewhere like
>>>>> apache/beam-java-template? I think apache repositories like beam-* are
>>>>> allowed.
>>>>>
>>>>> Brian
>>>>>
>>>>> [1] https://maven.apache.org/archetype/index.html
>>>>> [2]
>>>>> https://beam.apache.org/get-started/quickstart-java/#get-the-example-code
>>>>>
>>>>> On Wed, Dec 1, 2021 at 11:30 AM David Cavazos <dc...@google.com>
>>>>> wrote:
>>>>>
>>>>>> +Ahmet Altay <al...@google.com>
>>>>>> +Valentyn Tymofieiev <va...@google.com>
>>>>>> +Kenneth Knowles <kl...@google.com>
>>>>>>
>>>>>> Please feel free to include anyone else!
>>>>>>
>>>>>> On Mon, Oct 25, 2021 at 11:31 AM David Cavazos <dc...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Beam community!
>>>>>>>
>>>>>>> To make it easier to create a new Beam Java project, I've been
>>>>>>> working on a GitHub template containing a minimal Beam Java pipeline for
>>>>>>> people to start with.
>>>>>>>
>>>>>>> *Link to the GitHub template*:
>>>>>>> https://github.com/davidcavazos/beam-java
>>>>>>>
>>>>>>> So far, here's what the template contains:
>>>>>>>
>>>>>>>    - Minimal "Hello World" Beam pipeline
>>>>>>>    - Minimal test file
>>>>>>>    - Build files for Gradle, sbt, and Maven (Direct runner)
>>>>>>>    - Continuous integration via GitHub Actions
>>>>>>>    <https://github.com/features/actions> (around 1-2 minutes to run)
>>>>>>>    - README with instructions on how to build, run, test, and add
>>>>>>>    other runners
>>>>>>>
>>>>>>> It's easy to create a new GitHub repo from a template
>>>>>>> <https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template>
>>>>>>> .
>>>>>>>
>>>>>>> *Next steps*
>>>>>>>
>>>>>>>    - Some reviewers to make sure everyone is happy with it 🙂
>>>>>>>    - Right now it lives in my personal GitHub account, so we need
>>>>>>>    to create an Apache repo to host it
>>>>>>>    - Update/create docs with instructions on how to create a new
>>>>>>>    Beam Java pipeline
>>>>>>>
>>>>>>>