Posted to dev@beam.apache.org by Ismaël Mejía <ie...@gmail.com> on 2018/09/26 21:16:27 UTC

Modular IO presentation at Apachecon

Hello, today Eugene and I gave a talk about modular APIs for IO
at ApacheCon. This talk introduces some common patterns that we have
found while creating IO connectors and also presents recent ideas like
dynamic destinations and sequential writes, among others, using FileIO
as a use case.

In case you want to take a look, here is a copy of the slides; we
will probably add this to the IO authoring documentation too.

https://s.apache.org/beam-modular-io-talk

Re: Modular IO presentation at Apachecon

Posted by Chamikara Jayalath <ch...@google.com>.
Thanks, it was a great talk. Modular and composable IO FTW!

On Thu, Sep 27, 2018 at 1:30 AM Juan Carlos Garcia <jc...@gmail.com>
wrote:

> Im really looking forward for a way to monitor the results(like which
> batch of elements were written per destination if possible 🙆 ) of an IO
> Module in a consistent way.
>
> Nice presentation.
>
> Thomas Weise <th...@apache.org> schrieb am Do., 27. Sep. 2018, 06:35:
>
>> Thanks for sharing. I'm looking forward to see the recording of the talk
>> (hopefully!).
>>
>> This will be very helpful for Beam users. IO still is typically the
>> unexpectedly hard and time consuming part of authoring pipelines.
>>
>>
>> On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <am...@google.com> wrote:
>>
>>> Thanks for the slides.
>>> Really enjoyed the talk in person, especially the concept that IO is a
>>> transformation, and a source or sink are not special and the splittable
>>> DoFn explanation.
>>>
>>> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> Hello, today Eugene and me did a talk about about modular APIs for IO
>>>> at ApacheCon. This talk introduces some common patterns that we have
>>>> found while creating IO connectors and also presents recent ideas like
>>>> dynamic destinations, sequential writes among others using FileIO as a
>>>> use case.
>>>>
>>>> In case you guys want to take a look, here is a copy of the slides, we
>>>> will probably add this to the IO authoring documentation too.
>>>>
>>>> https://s.apache.org/beam-modular-io-talk
>>>>
>>>

Re: Modular IO presentation at Apachecon

Posted by Juan Carlos Garcia <jc...@gmail.com>.
I'm really looking forward to a way to monitor the results (like which batch
of elements was written per destination, if possible 🙆) of an IO module
in a consistent way.

Nice presentation.

Thomas Weise <th...@apache.org> schrieb am Do., 27. Sep. 2018, 06:35:

> Thanks for sharing. I'm looking forward to see the recording of the talk
> (hopefully!).
>
> This will be very helpful for Beam users. IO still is typically the
> unexpectedly hard and time consuming part of authoring pipelines.
>
>
> On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <am...@google.com> wrote:
>
>> Thanks for the slides.
>> Really enjoyed the talk in person, especially the concept that IO is a
>> transformation, and a source or sink are not special and the splittable
>> DoFn explanation.
>>
>> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> Hello, today Eugene and me did a talk about about modular APIs for IO
>>> at ApacheCon. This talk introduces some common patterns that we have
>>> found while creating IO connectors and also presents recent ideas like
>>> dynamic destinations, sequential writes among others using FileIO as a
>>> use case.
>>>
>>> In case you guys want to take a look, here is a copy of the slides, we
>>> will probably add this to the IO authoring documentation too.
>>>
>>> https://s.apache.org/beam-modular-io-talk
>>>
>>

Re: Modular IO presentation at Apachecon

Posted by Alexey Romanenko <ar...@gmail.com>.
Interesting talk; I had a chance to see it in person.
Many thanks to Ismael and Eugene!

> On 27 Sep 2018, at 21:34, Eugene Kirpichov <ki...@google.com> wrote:
> 
> Thanks Ismael and everyone else! Unfortunately I do not believe that this session was recorded on video :(
> Juan - yes, this is some of the important future work, and I think it's not hard to add to many connectors; contributions would be welcome.
> In terms of a "per-key" Wait transform, yeah, that definitely needs to be figured out too. The presentation considers only the non-per-key case but I think it should not be hard to add a per-key one. If you need to do something directly with the results, you can use Combine.perKey().
> 
> On Thu, Sep 27, 2018 at 10:10 AM Pablo Estrada <pabloem@google.com> wrote:
> I'll take this chance to plug in my little directory of Beam tools/materials: https://github.com/pabloem/awesome-beam
> 
> Please feel free to send PRs : )
> 
> 
> On Wed, Sep 26, 2018 at 10:29 PM Ankur Goenka <goenka@google.com> wrote:
> Thanks for sharing. Great slides and looking for the recorded session.
> 
> Do we have a central location where we link all the beam presentations for discoverability?
> 
> On Wed, Sep 26, 2018 at 9:35 PM Thomas Weise <thw@apache.org> wrote:
> Thanks for sharing. I'm looking forward to see the recording of the talk (hopefully!).
> 
> This will be very helpful for Beam users. IO still is typically the unexpectedly hard and time consuming part of authoring pipelines.
> 
> 
> On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <amyrvold@google.com> wrote:
> Thanks for the slides.
> Really enjoyed the talk in person, especially the concept that IO is a transformation, and a source or sink are not special and the splittable DoFn explanation.
> 
> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <iemejia@gmail.com> wrote:
> Hello, today Eugene and me did a talk about about modular APIs for IO
> at ApacheCon. This talk introduces some common patterns that we have
> found while creating IO connectors and also presents recent ideas like
> dynamic destinations, sequential writes among others using FileIO as a
> use case.
> 
> In case you guys want to take a look, here is a copy of the slides, we
> will probably add this to the IO authoring documentation too.
> 
> https://s.apache.org/beam-modular-io-talk


Re: Modular IO presentation at Apachecon

Posted by Eugene Kirpichov <ki...@google.com>.
Thanks Ismael and everyone else! Unfortunately I do not believe that this
session was recorded on video :(
Juan - yes, this is some of the important future work, and I think it's not
hard to add to many connectors; contributions would be welcome.
In terms of a "per-key" Wait transform, yeah, that definitely needs to be
figured out too. The presentation considers only the non-per-key case but I
think it should not be hard to add a per-key one. If you need to do
something directly with the results, you can use Combine.perKey().

On Thu, Sep 27, 2018 at 10:10 AM Pablo Estrada <pa...@google.com> wrote:

> I'll take this chance to plug in my little directory of Beam
> tools/materials: https://github.com/pabloem/awesome-beam
>
> Please feel free to send PRs : )
>
>
> On Wed, Sep 26, 2018 at 10:29 PM Ankur Goenka <go...@google.com> wrote:
>
>> Thanks for sharing. Great slides and looking for the recorded session.
>>
>> Do we have a central location where we link all the beam presentations
>> for discoverability?
>>
>> On Wed, Sep 26, 2018 at 9:35 PM Thomas Weise <th...@apache.org> wrote:
>>
>>> Thanks for sharing. I'm looking forward to see the recording of the talk
>>> (hopefully!).
>>>
>>> This will be very helpful for Beam users. IO still is typically the
>>> unexpectedly hard and time consuming part of authoring pipelines.
>>>
>>>
>>> On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <am...@google.com>
>>> wrote:
>>>
>>>> Thanks for the slides.
>>>> Really enjoyed the talk in person, especially the concept that IO is a
>>>> transformation, and a source or sink are not special and the splittable
>>>> DoFn explanation.
>>>>
>>>> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>>>
>>>>> Hello, today Eugene and me did a talk about about modular APIs for IO
>>>>> at ApacheCon. This talk introduces some common patterns that we have
>>>>> found while creating IO connectors and also presents recent ideas like
>>>>> dynamic destinations, sequential writes among others using FileIO as a
>>>>> use case.
>>>>>
>>>>> In case you guys want to take a look, here is a copy of the slides, we
>>>>> will probably add this to the IO authoring documentation too.
>>>>>
>>>>> https://s.apache.org/beam-modular-io-talk
>>>>>
>>>>

Re: Modular IO presentation at Apachecon

Posted by Pablo Estrada <pa...@google.com>.
I'll take this chance to plug my little directory of Beam
tools/materials: https://github.com/pabloem/awesome-beam

Please feel free to send PRs : )

On Wed, Sep 26, 2018 at 10:29 PM Ankur Goenka <go...@google.com> wrote:

> Thanks for sharing. Great slides and looking for the recorded session.
>
> Do we have a central location where we link all the beam presentations for
> discoverability?
>
> On Wed, Sep 26, 2018 at 9:35 PM Thomas Weise <th...@apache.org> wrote:
>
>> Thanks for sharing. I'm looking forward to see the recording of the talk
>> (hopefully!).
>>
>> This will be very helpful for Beam users. IO still is typically the
>> unexpectedly hard and time consuming part of authoring pipelines.
>>
>>
>> On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <am...@google.com> wrote:
>>
>>> Thanks for the slides.
>>> Really enjoyed the talk in person, especially the concept that IO is a
>>> transformation, and a source or sink are not special and the splittable
>>> DoFn explanation.
>>>
>>> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> Hello, today Eugene and me did a talk about about modular APIs for IO
>>>> at ApacheCon. This talk introduces some common patterns that we have
>>>> found while creating IO connectors and also presents recent ideas like
>>>> dynamic destinations, sequential writes among others using FileIO as a
>>>> use case.
>>>>
>>>> In case you guys want to take a look, here is a copy of the slides, we
>>>> will probably add this to the IO authoring documentation too.
>>>>
>>>> https://s.apache.org/beam-modular-io-talk
>>>>
>>>

Re: Modular IO presentation at Apachecon

Posted by Ankur Goenka <go...@google.com>.
Thanks for sharing. Great slides, and I'm looking forward to the recorded
session.

Do we have a central location where we link all the Beam presentations for
discoverability?

On Wed, Sep 26, 2018 at 9:35 PM Thomas Weise <th...@apache.org> wrote:

> Thanks for sharing. I'm looking forward to see the recording of the talk
> (hopefully!).
>
> This will be very helpful for Beam users. IO still is typically the
> unexpectedly hard and time consuming part of authoring pipelines.
>
>
> On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <am...@google.com> wrote:
>
>> Thanks for the slides.
>> Really enjoyed the talk in person, especially the concept that IO is a
>> transformation, and a source or sink are not special and the splittable
>> DoFn explanation.
>>
>> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> Hello, today Eugene and me did a talk about about modular APIs for IO
>>> at ApacheCon. This talk introduces some common patterns that we have
>>> found while creating IO connectors and also presents recent ideas like
>>> dynamic destinations, sequential writes among others using FileIO as a
>>> use case.
>>>
>>> In case you guys want to take a look, here is a copy of the slides, we
>>> will probably add this to the IO authoring documentation too.
>>>
>>> https://s.apache.org/beam-modular-io-talk
>>>
>>

Re: Modular IO presentation at Apachecon

Posted by Thomas Weise <th...@apache.org>.
Thanks for sharing. I'm looking forward to seeing the recording of the talk
(hopefully!).

This will be very helpful for Beam users. IO is still typically the
unexpectedly hard and time-consuming part of authoring pipelines.


On Wed, Sep 26, 2018 at 2:48 PM Alan Myrvold <am...@google.com> wrote:

> Thanks for the slides.
> Really enjoyed the talk in person, especially the concept that IO is a
> transformation, and a source or sink are not special and the splittable
> DoFn explanation.
>
> On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Hello, today Eugene and me did a talk about about modular APIs for IO
>> at ApacheCon. This talk introduces some common patterns that we have
>> found while creating IO connectors and also presents recent ideas like
>> dynamic destinations, sequential writes among others using FileIO as a
>> use case.
>>
>> In case you guys want to take a look, here is a copy of the slides, we
>> will probably add this to the IO authoring documentation too.
>>
>> https://s.apache.org/beam-modular-io-talk
>>
>

Re: Modular IO presentation at Apachecon

Posted by Alan Myrvold <am...@google.com>.
Thanks for the slides.
Really enjoyed the talk in person, especially the concept that IO is a
transformation and that a source or sink is not special, and the
splittable DoFn explanation.

On Wed, Sep 26, 2018 at 2:17 PM Ismaël Mejía <ie...@gmail.com> wrote:

> Hello, today Eugene and me did a talk about about modular APIs for IO
> at ApacheCon. This talk introduces some common patterns that we have
> found while creating IO connectors and also presents recent ideas like
> dynamic destinations, sequential writes among others using FileIO as a
> use case.
>
> In case you guys want to take a look, here is a copy of the slides, we
> will probably add this to the IO authoring documentation too.
>
> https://s.apache.org/beam-modular-io-talk
>
