You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Shailesh Jain <sh...@stellapps.com> on 2017/12/05 10:58:14 UTC

Share state across operators

Hi,

Is it possible to share state across operators in Flink?

I have CoFlatMap operator which maintains a ListState and returns a
DataStream. And downstream there is a KafkaSink operator for the same
DataStream which needs to access the ListState.

Thanks,
Shailesh

Re: Share state across operators

Posted by Shailesh Jain <sh...@stellapps.com>.
Thanks, Timo.

Either<L, R> works for me.

On Tue, Dec 5, 2017 at 4:55 PM, Timo Walther <tw...@apache.org> wrote:

> Hi Shailesh,
>
> sharing state across operators is not possible. However, you could emit
> the state (or parts of it) as a stream element to downstream operators by
> having a function that emits a type like "Either<MyElement,List<MyState
> >>".
>
> Another option would be to use side outputs to send state to downstream
> operators [0].
>
> Maybe you can tell use a bit more about what you want to achieve?
>
> Regards,
> Timo
>
> [0] https://ci.apache.org/projects/flink/flink-docs-release-1.3/
> dev/stream/side_output.html
>
>
> Am 12/5/17 um 11:58 AM schrieb Shailesh Jain:
>
> Hi,
>>
>> Is it possible to share state across operators in Flink?
>>
>> I have CoFlatMap operator which maintains a ListState and returns a
>> DataStream. And downstream there is a KafkaSink operator for the same
>> DataStream which needs to access the ListState.
>>
>> Thanks,
>> Shailesh
>>
>
>
>

Re: Share state across operators

Posted by "m@xi" <ma...@gmail.com>.
Thank a lot Timo!

Best,
Max



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Share state across operators

Posted by Timo Walther <tw...@apache.org>.
Hi Max,

I would go with the Either<A, B> approach if you want to ensure that the 
initital state and the first element arrive in the right order. 
Performance-wise there should not be a big different between both 
approaches. The side outputs are more meant for have a side channel 
beside the main stream that is not keyed anymore (e.g. for late elements 
that need a special treatment next to the main processing).

Regards,
Timo

Am 12.03.18 um 10:17 schrieb m@xi:
> Hey Flinker!
>
> Anyone? Anybody? Someone with experience or any idea on the question above?
>
> Best,
> Max
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/



Re: Share state across operators

Posted by "m@xi" <ma...@gmail.com>.
Hey Flinker!

Anyone? Anybody? Someone with experience or any idea on the question above?

Best,
Max



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Share state across operators

Posted by "m@xi" <ma...@gmail.com>.
Hey Timo!

I am using Java for my implementation and I have found this article [1] in
stackoverflow for simulating the Either<A,B> in Java. 

Now, for my case, I have a coordinator instance (parallelism = 1) that needs
to both distribute incoming tuples in a specific way, but also needs to
redistribute some state (previously distributed tuples) that maintains to
other downstream operators.

I am wondering whether I should use SideOutputs or Either<A,B> to succeed
the aforementioned goal. Nevertheless, I am wondering whether there are any
differences regarding efficiency/effectiveness of the above 2 proposed
workarounds to share state.

Also, in the case of Either<A,B>, I am first doing all my distribution of Bs
(parts of the state) and afterwards in the same .processElement() function I
am distributing the newly arrived tuples to the downstream operators. So,
given that my coordinator's parallelism is 1, I assume that the order of my
collect()-s is preserved, meaning the first the state redistribution will
get settled in the downstream operators, and afterwards the processing of
new tuples will happen. Is that correct? (I do not no if inside the network
buffers things get twisted somehow).

Thanks a lot in advance!

Best,
Max

[1] --
https://stackoverflow.com/questions/9975836/how-can-i-simulate-haskells-either-a-b-in-java




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Share state across operators

Posted by Timo Walther <tw...@apache.org>.
Hi Shailesh,

sharing state across operators is not possible. However, you could emit 
the state (or parts of it) as a stream element to downstream operators 
by having a function that emits a type like 
"Either<MyElement,List<MyState>>".

Another option would be to use side outputs to send state to downstream 
operators [0].

Maybe you can tell use a bit more about what you want to achieve?

Regards,
Timo

[0] 
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/side_output.html


Am 12/5/17 um 11:58 AM schrieb Shailesh Jain:
> Hi,
>
> Is it possible to share state across operators in Flink?
>
> I have CoFlatMap operator which maintains a ListState and returns a 
> DataStream. And downstream there is a KafkaSink operator for the same 
> DataStream which needs to access the ListState.
>
> Thanks,
> Shailesh