You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by "Hicken, Jan" <Ja...@ottogroup.com> on 2017/11/09 12:57:09 UTC

Serialization in Operator Chaining

Hi folks,

I have a question regarding the serialization in Flink's operator
chaining:

Consider these two map functions: Map1<String, T> and Map2<T, String>

As I haven't disabled operator chaining in the environment, these two
functions will be chained into one operator when executing my job.

The thing is, that the serialization for objects of type T is quite
expensive and I'd like to avoid that as much as possible. Does Flink
actually serialize these objects under the hood even if the functions
run in the same operator? If so, is it possible to disable the
serialization somehow?

Kind regards,
Jan

Re: Serialization in Operator Chaining

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,

If you use the DataSet API, there will be no serialisation between operations in a chain. If you use the DataStream API, there will be serialisation by default but you can disable that using executionEnv.getConfig().enableObjectReuse().

Hope that helps,
Aljoscha

> On 9. Nov 2017, at 13:57, Hicken, Jan <Ja...@ottogroup.com> wrote:
> 
> Hi folks,
> 
> I have a question regarding the serialization in Flink's operator
> chaining:
> 
> Consider these two map functions: Map1<String, T> and Map2<T, String>
> 
> As I haven't disabled operator chaining in the environment, these two
> functions will be chained into one operator when executing my job.
> 
> The thing is, that the serialization for objects of type T is quite
> expensive and I'd like to avoid that as much as possible. Does Flink
> actually serialize these objects under the hood even if the functions
> run in the same operator? If so, is it possible to disable the
> serialization somehow?
> 
> Kind regards,
> Jan