Posted to user@flink.apache.org by Avi Levi <av...@bluevoyant.com> on 2019/11/25 08:12:19 UTC

Idiomatic way to split pipeline

Hi,
I want to split the output of one of the operators into two pipelines. Since
the *split* method is deprecated, what is the idiomatic way to do that
without duplicating the operator?

[image: Screen Shot 2019-11-25 at 10.05.38.png]

Re: Idiomatic way to split pipeline

Posted by Robert Metzger <rm...@apache.org>.
Hi Avi,
can you post the exception with the stack trace here as well?

On Sun, Dec 1, 2019 at 10:03 AM Avi Levi <av...@bluevoyant.com> wrote:

> Thanks Arvid,
> The problem is that I get an exception about a non-unique uid on the
> *stream*.

Re: Idiomatic way to split pipeline

Posted by Avi Levi <av...@bluevoyant.com>.
Thanks Arvid,
The problem is that I get an exception about a non-unique uid on the
*stream*.

On Thu, Nov 28, 2019 at 2:45 PM Arvid Heise <ar...@ververica.com> wrote:

> Hi Avi,
>
> It seems to me that you do not really need any split feature. As far as
> I can see in your picture, you want to apply two different windows to the
> same input data.
>
> In that case you simply use two different subgraphs.
>
> stream = ...
>
> stream1 = stream.window(...).....addSink(<sink1>)
>
> stream2 = stream.window(...).....addSink(<sink2>)
>
> In Flink, you can compose arbitrary directed acyclic graphs, so consuming
> the output of one operator in several downstream operators is completely
> normal.
>
> Best,
>
> Arvid

Re: Idiomatic way to split pipeline

Posted by Arvid Heise <ar...@ververica.com>.
Hi Avi,

It seems to me that you do not really need any split feature. As far as
I can see in your picture, you want to apply two different windows to the
same input data.

In that case you simply use two different subgraphs.

stream = ...

stream1 = stream.window(...).....addSink(<sink1>)

stream2 = stream.window(...).....addSink(<sink2>)

In Flink, you can compose arbitrary directed acyclic graphs, so consuming
the output of one operator in several downstream operators is completely
normal.
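
A minimal Java sketch of the two-subgraph layout described above, assuming the
Flink 1.x DataStream API. The socket source, the trim step, the window sizes,
the dedup-style reduce, the print() sinks and all uid strings are placeholders
chosen for illustration and are not taken from the job in the picture. The
shared operator is defined once and gets a single uid; every downstream
operator gets its own, since Flink requires uids to be unique within a job.
Newer Flink releases may expect java.time.Duration instead of the Time class
used here.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TwoWindowsOnOneStream {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The shared upstream operator, defined exactly once (hypothetical source and map).
        DataStream<String> stream = env
                .socketTextStream("localhost", 9999)
                .map(line -> line.trim())
                .uid("trim-lines");

        // Subgraph 1: one copy of each distinct line per 10-second window.
        stream.keyBy(line -> line)
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .reduce((first, second) -> first)
                .uid("distinct-10s")
                .print("10s"); // stands in for <sink1>

        // Subgraph 2: same input stream, different window size, different uid.
        stream.keyBy(line -> line)
                .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
                .reduce((first, second) -> first)
                .uid("distinct-1m")
                .print("1m"); // stands in for <sink2>

        env.execute("two-windows-on-one-stream");
    }
}
```

Both subgraphs receive every record emitted by the shared operator; nothing is
routed or filtered between them. To try the sketch locally, a tool such as
`nc -lk 9999` can feed lines into the socket source.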

Best,

Arvid

On Mon, Nov 25, 2019 at 10:50 AM Avi Levi <av...@bluevoyant.com> wrote:

> Thanks, I'll check it out.

Re: Idiomatic way to split pipeline

Posted by Avi Levi <av...@bluevoyant.com>.
Thanks, I'll check it out.

On Mon, Nov 25, 2019 at 11:46 AM vino yang <ya...@gmail.com> wrote:

> Hi Avi,
>
> The side output feature provides a superset of split's functionality, so
> anything that can be implemented via split can also be implemented via
> side output.[1]
>
> Best,
> Vino
>
> [1]:
> https://stackoverflow.com/questions/51440677/apache-flink-whats-the-difference-between-side-outputs-and-split-in-the-data

Re: Idiomatic way to split pipeline

Posted by vino yang <ya...@gmail.com>.
Hi Avi,

The side output feature provides a superset of split's functionality, so
anything that can be implemented via split can also be implemented via
side output.[1]

Best,
Vino

[1]:
https://stackoverflow.com/questions/51440677/apache-flink-whats-the-difference-between-side-outputs-and-split-in-the-data

On Mon, Nov 25, 2019 at 5:32 PM Avi Levi <av...@bluevoyant.com> wrote:

> Thank you for your quick reply, I appreciate it. But this is not exactly
> "side output" per se; it is simple splitting. IIUC, side outputs are
> more for splitting the records by something that differentiates them
> (lateness, value, etc.). I thought there was a more idiomatic way, but if
> this is it, then I will go with that.

Re: Idiomatic way to split pipeline

Posted by Avi Levi <av...@bluevoyant.com>.
Thank you for your quick reply, I appreciate it. But this is not exactly
"side output" per se; it is simple splitting. IIUC, side outputs are
more for splitting the records by something that differentiates them
(lateness, value, etc.). I thought there was a more idiomatic way, but if
this is it, then I will go with that.

On Mon, Nov 25, 2019 at 10:42 AM vino yang <ya...@gmail.com> wrote:

> Hi Avi,
>
> As the documentation of DataStream#split says, you can use the "side output"
> feature to replace it.[1]
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html
>
> Best,
> Vino

Re: Idiomatic way to split pipeline

Posted by vino yang <ya...@gmail.com>.
Hi Avi,

As the documentation of DataStream#split says, you can use the "side output"
feature to replace it.[1]

[1]:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html
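
For reference, a minimal Java sketch of replacing split with a side output,
assuming the Flink 1.x DataStream API. The even/odd predicate, the "odd" tag
name and the fromElements input are hypothetical and only illustrate the
routing; they do not come from the original job.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SplitWithSideOutput {
    // Declared as an anonymous subclass so Flink can capture the element type.
    private static final OutputTag<Integer> ODD = new OutputTag<Integer>("odd") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical input standing in for the operator whose output is split.
        DataStream<Integer> numbers = env.fromElements(1, 2, 3, 4, 5, 6);

        // Even values go to the main output, odd values to the side output.
        SingleOutputStreamOperator<Integer> evens = numbers.process(
                new ProcessFunction<Integer, Integer>() {
                    @Override
                    public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                        if (value % 2 == 0) {
                            out.collect(value);
                        } else {
                            ctx.output(ODD, value);
                        }
                    }
                });

        // The side output is read from the operator that emitted it.
        DataStream<Integer> odds = evens.getSideOutput(ODD);

        evens.print("even");
        odds.print("odd");

        env.execute("split-with-side-output");
    }
}
```

Each record ends up in exactly one of the two streams here, which is the
routing behaviour split used to provide. A record could also be sent to both
outputs by calling out.collect and ctx.output for the same value.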

Best,
Vino

On Mon, Nov 25, 2019 at 4:12 PM Avi Levi <av...@bluevoyant.com> wrote:

> Hi,
> I want to split the output of one of the operators into two pipelines. Since
> the *split* method is deprecated, what is the idiomatic way to do that
> without duplicating the operator?
>
> [image: Screen Shot 2019-11-25 at 10.05.38.png]