Posted to dev@beam.apache.org by Manu Zhang <ow...@gmail.com> on 2016/04/20 10:22:46 UTC

add use cases to capability matrix

Guys,

Do you think it would be valuable to add real-world use cases to the
capability matrix <http://beam.incubator.apache.org/capability-matrix/>?
Then we would know why a particular capability is needed and which ones
should be prioritized for runner implementations.
I found some examples in the Dataflow paper (Section 3.3)
<http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf>,
and another reference is http://www.vldb.org/pvldb/vol8/p2040-Kejariwal.pdf.

Thanks,
Manu Zhang

Re: add use cases to capability matrix

Posted by Manu Zhang <ow...@gmail.com>.
Thanks JB and Davor.

More advanced Beam features (e.g. sessions, retractions) may look strange
to users, who ask me where those features would be used. It would be great
to have a use case page with code snippets side by side with each use case
(like what you did with the Game example on Cloud Dataflow).

Another question is whether the Beam use cases are valid beyond Google,
since most companies don't operate at the same (or even a similar) scale.

Sorry, these questions may be more suitable for the user list.

Thanks,
Manu

On Thu, Apr 21, 2016 at 5:49 AM Davor Bonaci <da...@google.com.invalid>
wrote:

> I would like to avoid complicating the capability matrix itself with such
> details. Hopefully, user documentation for each of these features would
> (eventually) give insight into what you could use them for, and we could
> cross-link to that. For now, you can refer to the Dataflow SDK
> documentation to get some of this information [1]. (We'll have that ported
> over to Beam soon.)
>
> To answer your specific question about priority, you should probably
> prioritize "what" over "where" over "when" over "how" parts. That said, it
> is probably fine to advance to the next category once you have figured out
> the first few bullets in the current category.
>
> [1] https://cloud.google.com/dataflow/model/programming-model
>
> On Wed, Apr 20, 2016 at 2:11 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
> > Hi Manu,
> >
> > Generally speaking, we have to add a complete getting-started guide
> > with "real" use cases to illustrate Beam usage.
> >
> > I'm preparing a website PR about this (with the overview of IOs,
> > DSLs/SDKs, runners, etc. that we discussed earlier).
> >
> > Regards
> > JB
> >
> >
> > On 04/20/2016 10:22 AM, Manu Zhang wrote:
> >
> >> Guys,
> >>
> >> Do you think it's valuable to add real world use cases to capability
> >> matrix
> >> <http://beam.incubator.apache.org/capability-matrix/> ?
> >> Then, we could know why a particular capability is needed and which
> >> should be prioritized for runner implementations.
> >> I found some examples in the Dataflow paper (3.3)
> >> <http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf>
> >> and another reference is
> >> http://www.vldb.org/pvldb/vol8/p2040-Kejariwal.pdf.
> >>
> >> Thanks,
> >> Manu Zhang
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>

Re: add use cases to capability matrix

Posted by Davor Bonaci <da...@google.com.INVALID>.
I would like to avoid complicating the capability matrix itself with such
details. Hopefully, user documentation for each of these features would
(eventually) give insight into what you could use them for, and we could
cross-link to that. For now, you can refer to the Dataflow SDK
documentation to get some of this information [1]. (We'll have that ported
over to Beam soon.)

To answer your specific question about priority, you should probably
prioritize "what" over "where" over "when" over "how" parts. That said, it
is probably fine to advance to the next category once you have figured out
the first few bullets in the current category.

[1] https://cloud.google.com/dataflow/model/programming-model

On Wed, Apr 20, 2016 at 2:11 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> Hi Manu,
>
> Generally speaking, we have to add a complete getting-started guide with
> "real" use cases to illustrate Beam usage.
>
> I'm preparing a website PR about this (with the overview of IOs,
> DSLs/SDKs, runners, etc. that we discussed earlier).
>
> Regards
> JB
>
>
> On 04/20/2016 10:22 AM, Manu Zhang wrote:
>
>> Guys,
>>
>> Do you think it's valuable to add real world use cases to capability
>> matrix
>> <http://beam.incubator.apache.org/capability-matrix/> ?
>> Then, we could know why a particular capability is needed and which should
>> be prioritized for runner implementations.
>> I found some examples in the Dataflow paper (3.3)
>> <http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf>
>> and another reference is
>> http://www.vldb.org/pvldb/vol8/p2040-Kejariwal.pdf.
>>
>> Thanks,
>> Manu Zhang
>>
>>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: add use cases to capability matrix

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Manu,

Generally speaking, we have to add a complete getting-started guide with
"real" use cases to illustrate Beam usage.

I'm preparing a website PR about this (with the overview of IOs,
DSLs/SDKs, runners, etc. that we discussed earlier).

Regards
JB

On 04/20/2016 10:22 AM, Manu Zhang wrote:
> Guys,
>
> Do you think it's valuable to add real world use cases to capability matrix
> <http://beam.incubator.apache.org/capability-matrix/> ?
> Then, we could know why a particular capability is needed and which should
> be prioritized for runner implementations.
> I found some examples in the Dataflow paper (3.3)
> <http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf>
> and another reference is http://www.vldb.org/pvldb/vol8/p2040-Kejariwal.pdf.
>
> Thanks,
> Manu Zhang
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com