Posted to dev@beam.apache.org by Austin Bennett <wh...@gmail.com> on 2018/10/26 04:25:42 UTC

Growing Beam -- A call for ideas? What is missing? What would be good to see?

Hi Beam Devs and Users,

I am trying to get a sense from the community of the sorts of things we
think would be useful to build the community. I am thinking not from the
angle of specific code/implementation/functionality, but from a
user/usability angle. I want to dive in and make real contributions to the
code, too, but I know I also have the interest and skills to help with
education and community aspects, hence my focus on this.

I had previously suggested a sort of cookbook of focused and curated
examples (code and explanation) to help people get started with Beam, get
up and running, and accomplish something worthwhile quickly. That seems one
way to help grow our user base (and maybe our future dev base, after those
users become enamored), and it did get some positive feedback when first
put out there.

There are many other areas where it can be valuable to feature others
sharing their successes with Beam, or little tips. Pablo's Awesome Beam is
one example of such a collection (https://github.com/pabloem/awesome-beam).
We could even centralize a general place to find any/all Beam
blogs/shared-code/writeups/etc.

Certainly there is a place for all sorts of contributions and resources.
What do people on these lists think would be particularly useful?  I am
trying to get a clearer sense of where we think efforts might be best
focused.

Please share anything (even semi-)related!?

Thanks,
Austin


P.S.  I realize that those following this list are rather self-selecting as
well, so this might not be the best forum to figure out what new/novice
users need, but I would like to hear what everyone else here thinks could
be useful.

Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

Posted by Gleb Kanterov <gl...@spotify.com>.
I'm a Scio contributor, and I have a lot of experience with Scala. However,
I would advise against using Scala. There are several problems with
maintaining Scala libraries:

- you have to build different artifacts for each Scala version
- artifacts depend on the Scala standard library
- this becomes an even bigger problem with the Scala 3 migration
- Scala is a very complex language and requires a lot of discipline

Because of these issues, you can see how much time it takes to upgrade
Spark, or libraries from the Twitter ecosystem, to a newer Scala version.
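
The first two points can be made concrete with a minimal build.sbt sketch.
This is an illustration only, not taken from the thread: the project name
and all version numbers are assumptions.

```scala
// build.sbt: illustrative sketch only; project name and versions are assumptions
name := "dpasf"

// Each listed Scala version yields its own artifact (dpasf_2.11, dpasf_2.12)
// when publishing with `sbt +publish`.
crossScalaVersions := Seq("2.11.12", "2.12.7")
scalaVersion := crossScalaVersions.value.head

// `%%` appends the Scala binary version to the dependency's artifact name,
// so every Scala dependency must also be published for that Scala version.
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.5" % Test
```

Every entry in crossScalaVersions multiplies the build and release work,
and a library can only move to a new Scala version once all of its Scala
dependencies have been published for that version.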

Gleb

On Sun, Oct 28, 2018 at 10:10 PM Kenneth Knowles <ke...@apache.org> wrote:

> Porting to Scio is not necessary. I expect you can use Scala main() + Java
> SDK + Scala DPASF no problem. Then Scio users can use it also, of course,
> and so can Java SDK users.
>
> Doesn't Scala compile to jars that are somewhat usable from Java? I've
> only ever gone the other way, but I thought it was somewhat both ways. That
> would mean Java main() + Java SDK + Scala DPASF is also viable. But there's
> still the matter of supporting Scala in our build system & codebase.
> Personally SGTM since I know Scala well but I wouldn't want to have code
> that only a couple people are comfortable modifying*. From a whole-project
> perspective I would yield to the broader Beam dev community on this issue.
>
> I took a quick look at your Scala and it does look like it would mostly
> port to Java quite easily, just a bit more boilerplate. Do you know of
> particular things that might be hard?
>
> Kenn
>
> *our Jenkins and Gradle groovy scripts are further from Java than Scala
> and we seem to be doing OK-but-not-great as far as everyone feeling OK to
> modify them

-- 
Cheers,
Gleb

Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

Posted by Kenneth Knowles <ke...@apache.org>.
Porting to Scio is not necessary. I expect you can use Scala main() + Java
SDK + Scala DPASF no problem. Then Scio users can use it also, of course,
and so can Java SDK users.

Doesn't Scala compile to jars that are somewhat usable from Java? I've only
ever gone the other way, but I thought it was somewhat both ways. That
would mean Java main() + Java SDK + Scala DPASF is also viable. But there's
still the matter of supporting Scala in our build system & codebase.
Personally SGTM since I know Scala well but I wouldn't want to have code
that only a couple people are comfortable modifying*. From a whole-project
perspective I would yield to the broader Beam dev community on this issue.

I took a quick look at your Scala and it does look like it would mostly
port to Java quite easily, just a bit more boilerplate. Do you know of
particular things that might be hard?

Kenn

*Our Jenkins and Gradle Groovy scripts are further from Java than Scala is,
and we seem to be doing OK-but-not-great as far as everyone feeling OK to
modify them.


On Sat, Oct 27, 2018 at 1:43 AM David Morávek <da...@gmail.com>
wrote:

> Hello Alejandro,
>
> +1 for java implementation, even though this would probably require more
> effort from your side
>
> The main problem with Scio is that it lives outside the Beam code base and
> depends on a specific version of the Beam SDK. The sketching extension (and
> any other module in the Beam code base), on the other hand, uses a Beam SDK
> that is built from source (the current checkout), which Scio might not be
> compatible with.
>
> D.

Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

Posted by David Morávek <da...@gmail.com>.
Hello Alejandro,

+1 for a Java implementation, even though this would probably require more
effort on your side.

The main problem with Scio is that it lives outside the Beam code base and
depends on a specific version of the Beam SDK. The sketching extension (and
any other module in the Beam code base), on the other hand, uses a Beam SDK
that is built from source (the current checkout), which Scio might not be
compatible with.

D.

On Sat, Oct 27, 2018 at 8:26 AM Alejandro <al...@gmail.com> wrote:

> Hello,
>
> Although this is not exactly what you intended, I am also looking to
> contribute to Beam, but from a code perspective.
>
> I've been discussing with some Beam members, like Austin and Lukasz
> (CCed), how to integrate https://github.com/elbaulp/DPASF into Beam.
>
> It seems the best place for these algorithms is
> https://github.com/apache/beam/tree/master/sdks/java/extensions/sketching,
> but right now I lack the Beam knowledge that would allow me to implement it.
> So I am looking for someone who could help me get started. Should I write
> wrappers that interface with my Scala code using
> https://github.com/spotify/scio, or reimplement everything in Java?
>
> Cheers.
>
> --
> elbauldelprogramador.com
>

Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

Posted by Alejandro <al...@gmail.com>.
Hello,

Although this is not exactly what you intended, I am also looking to
contribute to Beam, but from a code perspective.

I've been discussing with some Beam members, like Austin and Lukasz
(CCed), how to integrate https://github.com/elbaulp/DPASF into Beam.

It seems the best place for these algorithms is
https://github.com/apache/beam/tree/master/sdks/java/extensions/sketching,
but right now I lack the Beam knowledge that would allow me to implement it.
So I am looking for someone who could help me get started. Should I write
wrappers that interface with my Scala code using
https://github.com/spotify/scio, or reimplement everything in Java?

Cheers.

On 10/26/2018 11:55 PM, Rose Nguyen wrote:
> I've heard of many people referring to the Medium posts related to Beam
> for step-by-step tutorials. 
> 
> https://medium.com/tag/apache-beam/latest
> 
> -- 
> Rose Thị Nguyễn

-- 
elbauldelprogramador.com

Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

Posted by Rose Nguyen <rt...@google.com>.
I've heard of many people referring to the Medium posts related to Beam for
step-by-step tutorials.

https://medium.com/tag/apache-beam/latest

On Thu, Oct 25, 2018 at 9:25 PM Austin Bennett <wh...@gmail.com>
wrote:

> Hi Beam Devs and Users,
>
> Trying to get a sense from the community on the sorts of things we think
> would be useful to build the community (I am thinking not from an angle of
> specific code/implementation/functionality, but from a user/usability -- I
> want to dive in and make real contributions with the code, too, but know I
> also have the interest and skills to help with education and community
> aspects, hence my focus on this).
>
> I had previously suggested a sort of cookbook for focused and curated
> examples (code and explanation) to help people get started, on-boarding,
> using Beam to aid getting up and running and accomplishing something
> worthwhile (and quickly), that seems one way to help grow our user base
> (and maybe future dev base after those users become enamored), which
> did get some positive feedback when first put out there.
>
> There are many other areas where featuring others sharing successes from
> having used Beam or little tips can be valuable, Pablo's Awesome Beam is
> one example of such a collection: https://github.com/pabloem/awesome-beam
> or even centralizing a general place to find any/all Beam
> blogs/shared-code/writeups/etc.
>
> Certainly there is a place for all sorts of contributions and resources.
> What do people on these lists think would be particularly useful?  Trying
> to get a more focused sense of where we think efforts might be best
> focused.
>
> Please share anything (even semi-)related!?
>
> Thanks,
> Austin
>
>
> P.S.  I realize that those following this list are rather self-selecting
> as well, so this might not be the best forum to figure out what new/novice
> users need, but I would like to hear what everyone else here thinks could
> be useful.
>


-- 
Rose Thị Nguyễn

Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

Posted by Maximilian Michels <mx...@apache.org>.
Hi Austin,

Great initiative. I think there are already some materials out there but 
they are not consolidated:

Cookbook with examples: 
https://github.com/apache/beam/tree/master/examples/java/src/main/java/org/apache/beam/examples/cookbook

An interactive tutorial would be a great addition; perhaps the examples
also need an update to reflect more typical use cases.

Documentation resources: https://beam.apache.org/documentation/resources/

IMHO there are three types of resources that would be useful:

1) Learning to write Beam pipelines
2) Beam Success Stories
3) Contributing to Beam

1) and 3) could use an overhaul. We clearly lack 2): not that there are
no success stories, but they have not been collected yet.

Cheers,
Max

On 26.10.18 06:25, Austin Bennett wrote:
> Hi Beam Devs and Users,
> 
> Trying to get a sense from the community on the sorts of things we think 
> would be useful to build the community (I am thinking not from an angle 
> of specific code/implementation/functionality, but from a user/usability 
> -- I want to dive in and make real contributions with the code, too, but 
> know I also have the interest and skills to help with education and 
> community aspects, hence my focus on this).
> 
> I had previously suggested a sort of cookbook for focused and curated 
> examples (code and explanation) to help people get started, on-boarding,
> using Beam to aid getting up and running and accomplishing something
> worthwhile (and quickly), that seems one way to help grow our user base
> (and maybe future dev base after those users become enamored),
> which did get some positive feedback when first put out there.
> 
> There are many other areas where featuring others sharing successes from 
> having used Beam or little tips can be valuable, Pablo's Awesome Beam is 
> one example of such a collection: 
> https://github.com/pabloem/awesome-beam or even centralizing a general 
> place to find any/all Beam blogs/shared-code/writeups/etc.
> 
> Certainly there is a place for all sorts of contributions and 
> resources.  What do people on these lists think would be particularly 
> useful?  Trying to get a more focused sense of where we think efforts 
> might be best focused.
> 
> Please share anything (even semi-)related!?
> 
> Thanks,
> Austin
> 
> 
> P.S.  I realize that those following this list are rather self-selecting
> as well, so this might not be the best forum to figure out what 
> new/novice users need, but I would like to hear what everyone else here 
> thinks could be useful.