You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by madan <ma...@gmail.com> on 2017/12/18 12:15:31 UTC

About serialization - When and What

Hi,

I am trying to understand serialization part when a environment is
executed. Taking a simple environment for ex.,
env.addSource(...).keyBy(...).sum(...).addSink(...).execute. I would like
to understand when and where serialization happens and what are all
serialized, operators,functions etc.,

Can anyone please give some information or point me at proper documentation.

-- 
Thank you,
Madan.

Re: About serialization - When and What

Posted by Timo Walther <tw...@apache.org>.
Hi Madan,

serialization happens at different positions with different mechanisms.

For records that are travelling in the stream the serializer that is 
defined by the type information is used (print 
env.addSource().getType()). However, whether records are serialized or 
not depends on the so-called object reuse mode [0] and if repartitioning 
is involved. By default, records are serialized between all operators 
because Flink Functions could mutually modify objects if a user does not 
pay special attention to it. If you know what you are doing, you can 
enable object reuse. In this case, serialization happens only for 
repartitioning, e.g. always after a keyBy().

For keeping state (like in sum()), a record must be serialized during a 
checkpoint or for writing it to the RocksDB state backend.

Instances of Functions are serialized using Java serialization (to ship 
member variables etc.).

I hope we can improve this part in the documentation in the near future.

Regards,
Timo


[0] 
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/execution_configuration.html

Am 12/18/17 um 1:15 PM schrieb madan:
> Hi,
>
> I am trying to understand serialization part when a environment is 
> executed. Taking a simple environment for ex., 
> env.addSource(...).keyBy(...).sum(...).addSink(...).execute. I would 
> like to understand when and where serialization happens and what are 
> all serialized, operators,functions etc.,
>
> Can anyone please give some information or point me at proper 
> documentation.
>
> -- 
> Thank you,
> Madan.