Posted to user@flink.apache.org by xie wei <ji...@googlemail.com> on 2017/08/22 06:31:10 UTC

how is data partitioned and distributed for connected streams

Hello Flink,

assume there are two finite streams: stream1 (s1) has only one event and
stream2 (s2) has 100 events. The parallelism is 2.
Then I call stream1.connect(stream2).map().
How is the data partitioned and distributed to the CoMap instances? Is the
event from s1 available in only one of the CoMap instances?
Thank you!

Best regards
Wei

Re: how is data partitioned and distributed for connected streams

Posted by Till Rohrmann <tr...@apache.org>.
Hi,

if all operators have the same parallelism, then there will be a pointwise
connection. This means all elements arriving at s1_x and s2_x will be
forwarded to s3_x, with _x denoting the parallel subtask and s3 the CoMap
operator. Thus, to answer your second question, the single s1 element will
only be present at one subtask of the CoMap operator, namely the one whose
index matches the s1 parallel subtask it comes from.
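
To make the pointwise forwarding concrete, here is a minimal plain-Java
sketch that does not use the Flink API at all. It only models which inputs
each CoMap subtask would see. The class and method names are hypothetical,
and the rule that element i of a stream lands in source subtask i % parallelism
is an assumption for illustration; how a real source splits its data may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: not Flink code, all names are hypothetical.
class PointwiseSketch {

    // Assumption: element i of a finite stream ends up in source subtask
    // i % parallelism. A real source may distribute elements differently.
    static List<List<String>> partition(List<String> events, int parallelism) {
        List<List<String>> parts = new ArrayList<>();
        for (int x = 0; x < parallelism; x++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < events.size(); i++) {
            parts.get(i % parallelism).add(events.get(i));
        }
        return parts;
    }

    // Pointwise connection: CoMap subtask x receives only what arrives
    // at s1_x and s2_x, nothing from any other subtask.
    static List<List<String>> connectPointwise(List<List<String>> s1,
                                               List<List<String>> s2) {
        List<List<String>> coMap = new ArrayList<>();
        for (int x = 0; x < s1.size(); x++) {
            List<String> inputs = new ArrayList<>(s1.get(x));
            inputs.addAll(s2.get(x));
            coMap.add(inputs);
        }
        return coMap;
    }

    public static void main(String[] args) {
        int parallelism = 2;

        List<String> stream1 = new ArrayList<>();
        stream1.add("s1-event");

        List<String> stream2 = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            stream2.add("s2-event-" + i);
        }

        List<List<String>> coMap = connectPointwise(
                partition(stream1, parallelism),
                partition(stream2, parallelism));

        for (int x = 0; x < parallelism; x++) {
            System.out.println("CoMap subtask " + x
                    + " sees s1 event: " + coMap.get(x).contains("s1-event")
                    + ", total inputs: " + coMap.get(x).size());
        }
    }
}
```

Under these assumptions, only subtask 0 sees the s1 event, while the 100 s2
events are split between both subtasks.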

Cheers,
Till
