Posted to user@beam.apache.org by Eleanore Jin <el...@gmail.com> on 2020/07/07 18:02:48 UTC

Beam supports Flink Async IO operator

Hi community,

I cannot find any documentation on Beam support for Flink's async I/O operator
(https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html).
Is it not supported right now?

Thanks a lot!
Eleanore

Re: Beam supports Flink Async IO operator

Posted by Luke Cwik <lc...@google.com>.
That is correct.

On Mon, Jul 13, 2020 at 4:33 PM Eleanore Jin <el...@gmail.com> wrote:

> Hi Kaymak,
>
> Sorry for the late reply, and thanks for sharing the blog post; I went
> through it.
>
> Here is my understanding:
>
> Timely processing can buffer data and send it to the external system in
> batches, but for it to work similarly to Flink's async I/O operator, the
> external system must also be able to accept input data in bulk and return
> the responses synchronously. Otherwise, it would still be like making
> multiple sync calls to the external system and getting the responses back
> one by one.
>
> Thanks a lot for sharing!
>
> Best,
> Eleanore

Re: Beam supports Flink Async IO operator

Posted by Eleanore Jin <el...@gmail.com>.
Hi Kaymak,

Sorry for the late reply, and thanks for sharing the blog post; I went
through it.

Here is my understanding:

Timely processing can buffer data and send it to the external system in
batches, but for it to work similarly to Flink's async I/O operator, the
external system must also be able to accept input data in bulk and return
the responses synchronously. Otherwise, it would still be like making
multiple sync calls to the external system and getting the responses back
one by one.
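
The difference between the two cases can be made concrete with some rough
arithmetic. The sketch below is plain Python, not Beam or Flink code. The
names `round_trips` and `blocking_time_s` are hypothetical, and it assumes
(purely for illustration) that a bulk request takes the same time as a
single request and that calls are issued one after another.

```python
import math


def round_trips(num_elements, batch_size, supports_bulk):
    """Number of blocking calls needed to send all elements.

    With a bulk endpoint, one call carries a whole batch; without one,
    every element still needs its own call even if it was buffered.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if supports_bulk:
        return math.ceil(num_elements / batch_size)
    return num_elements


def blocking_time_s(num_elements, batch_size, supports_bulk, latency_s=0.15):
    """Total time spent waiting if the calls are made sequentially.

    Assumption: every call, bulk or not, takes latency_s seconds.
    """
    return round_trips(num_elements, batch_size, supports_bulk) * latency_s
```

Under these assumptions, 1000 elements in batches of 50 need 20 calls
(about 3 seconds of waiting) when the service accepts bulk input, and 1000
calls (about 150 seconds) when it does not.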

Thanks a lot for sharing!

Best,
Eleanore

On Thu, Jul 9, 2020 at 1:56 AM Kaymak, Tobias <to...@ricardo.ch>
wrote:

> Hi Eleanore,
>
> Maybe batched RPC is what you are looking for?
> https://beam.apache.org/blog/timely-processing/

Re: Beam supports Flink Async IO operator

Posted by "Kaymak, Tobias" <to...@ricardo.ch>.
Hi Eleanore,

Maybe batched RPC is what you are looking for?
https://beam.apache.org/blog/timely-processing/

On Wed, Jul 8, 2020 at 6:20 PM Eleanore Jin <el...@gmail.com> wrote:

> Thanks Luke and Max for the information.
>
> Our use case is that, inside a DoFn, we need to call external services to
> trigger some other flows. The calls are REST-based sync calls, and each
> takes 150 milliseconds or more to return. We are using Flink as the runner,
> and I came across Flink's Async I/O operator. I am trying to figure out
> whether this is the right approach and whether Beam provides a similar
> concept.
>
> Thanks!
> Eleanore

Re: Beam supports Flink Async IO operator

Posted by Eleanore Jin <el...@gmail.com>.
Thanks Luke and Max for the information.

Our use case is that, inside a DoFn, we need to call external services to
trigger some other flows. The calls are REST-based sync calls, and each
takes 150 milliseconds or more to return. We are using Flink as the runner,
and I came across Flink's Async I/O operator. I am trying to figure out
whether this is the right approach and whether Beam provides a similar
concept.
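
For a sense of scale: at 150 ms per blocking call, a single thread completes
at most about 6 to 7 calls per second (1 / 0.15 s is roughly 6.7). A minimal
sketch of overlapping such calls with a thread pool follows, in plain Python
using only the standard library. It is not Flink's Async I/O operator and
not a Beam API. `call_service` is a hypothetical stand-in that sleeps
instead of issuing a REST request, and the worker count of 8 is an arbitrary
assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_service(element, latency_s=0.15):
    """Hypothetical stand-in for a blocking REST call (sleeps, no network)."""
    time.sleep(latency_s)
    return f"ok:{element}"


def process_sequentially(elements, latency_s=0.15):
    """One call at a time: total wait grows linearly with the element count."""
    return [call_service(e, latency_s) for e in elements]


def process_concurrently(elements, max_workers=8, latency_s=0.15):
    """Overlap the waits by running up to max_workers calls at once.

    Executor.map returns results in input order, so the output lines up
    with the input even though the calls run in parallel.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda e: call_service(e, latency_s), elements))
```

Both functions return the same results in the same order; only the time
spent waiting differs, and only if the service tolerates that many
concurrent requests.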

Thanks!
Eleanore

On Wed, Jul 8, 2020 at 2:55 AM Maximilian Michels <mx...@apache.org> wrote:

> Just to clarify: We could make the AsyncIO operator also available in
> Beam, but the operator has to be represented by a concept in Beam.
> Otherwise, there is no way to know when to produce it as part of the
> translation.

Re: Beam supports Flink Async IO operator

Posted by Maximilian Michels <mx...@apache.org>.
Just to clarify: We could make the AsyncIO operator also available in
Beam, but the operator has to be represented by a concept in Beam.
Otherwise, there is no way to know when to produce it as part of the
translation.

On 08.07.20 11:53, Maximilian Michels wrote:
> Flink's AsyncIO operator is useful for processing I/O-bound operations,
> e.g. sending network requests. As Luke mentioned, it is not available
> in Beam.
> 
> -Max

Re: Beam supports Flink Async IO operator

Posted by Maximilian Michels <mx...@apache.org>.
Flink's AsyncIO operator is useful for processing I/O-bound operations,
e.g. sending network requests. As Luke mentioned, it is not available
in Beam.

-Max

On 07.07.20 22:11, Luke Cwik wrote:
> Beam is a layer that sits on top of execution engines like Flink and
> provides its own programming model, so native operators like Flink's
> async I/O operator are not exposed.
> 
> Most people use a DoFn to do all their IO and sometimes will compose it 
> with another transform such as GroupIntoBatches[1] to simplify their 
> implementation.
> 
> Why do you need async?
> 
> 1: 
> https://beam.apache.org/documentation/transforms/java/aggregation/groupintobatches/

Re: Beam supports Flink Async IO operator

Posted by Luke Cwik <lc...@google.com>.
Beam is a layer that sits on top of execution engines like Flink and
provides its own programming model, so native operators like Flink's async
I/O operator are not exposed.

Most people use a DoFn to do all their IO and sometimes will compose it
with another transform such as GroupIntoBatches[1] to simplify their
implementation.
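
The batching idea can be sketched without Beam. The code below is plain
Python and only mimics the shape of the approach; it does not use Beam's
GroupIntoBatches transform, which works on keyed elements and batches per
key. `group_into_batches`, `call_service_bulk`, and `process` are
hypothetical names, and the bulk endpoint is assumed to exist.

```python
from itertools import islice


def group_into_batches(elements, batch_size):
    """Yield consecutive lists of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(elements)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def call_service_bulk(batch):
    """Hypothetical bulk endpoint: one request carries a whole batch."""
    return [f"ok:{element}" for element in batch]


def process(elements, batch_size):
    """Make one blocking call per batch instead of one per element."""
    results = []
    for batch in group_into_batches(elements, batch_size):
        results.extend(call_service_bulk(batch))
    return results
```

For example, `list(group_into_batches(range(7), 3))` yields the batches
`[0, 1, 2]`, `[3, 4, 5]`, and `[6]`, so seven elements need three calls.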

Why do you need async?

1:
https://beam.apache.org/documentation/transforms/java/aggregation/groupintobatches/


On Tue, Jul 7, 2020 at 11:03 AM Eleanore Jin <el...@gmail.com> wrote:

> Hi community,
>
> I cannot find any documentation on Beam support for Flink's async I/O
> operator
> (https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html).
> Is it not supported right now?
>
> Thanks a lot!
> Eleanore
>