Posted to user@flink.apache.org by gerardg <ge...@talaia.io> on 2018/08/17 15:29:37 UTC

Override CaseClassSerializer with custom serializer

Hello,

I can't seem to override the CaseClassSerializer with my custom
serializer. I'm using env.getConfig.addDefaultKryoSerializer() to add the
custom serializer, but I don't see it being used. I guess that is because
Flink only uses Kryo-based serializers if it can't find a Flink serializer?

Is it then worth replacing the CaseClassSerializer with a custom
serializer? (When I profile the job, the CaseClassSerializer (de)serialize
methods show up as the most used, so I wanted to give it a try.) If so,
how can I do it?

Thanks,

Gerard

Re: Override CaseClassSerializer with custom serializer

Posted by Gerard Garcia <ge...@talaia.io>.
Hi Timo,

I see. Yes, we already use the "Object Reuse" option. It was a nice
performance improvement when we first enabled it!

I guess another option we can try is to somehow make things "easier" for
Flink so it can chain operators together. Most of them are not chained; I
think that is because they take a control stream as input together with
the main stream. I'll need to check that and see if we can re-architect
them.

Thanks,

Gerard

On Fri, Aug 17, 2018 at 11:21 PM Timo Walther <tw...@apache.org> wrote:

> Hi Gerard,
>
> You are correct: Kryo serializers are only used when no built-in Flink
> serializer is available.
>
> Actually, the tuple and case class serializers are among the most
> performant serializers in Flink (due to their fixed length and lack of
> null support). If you really want to reduce the serialization overhead,
> you could look into the object reuse mode. We had this topic on the
> mailing list recently; I will just copy the answer here:
>
> If you want to improve the performance of a collect() between operators,
> you could also enable object reuse. You can read more about this in
> [1] (section "Issue 2: Object Reuse"), but make sure your implementation
> is correct, because an operator could modify the objects of following
> operators.
>
> I hope this helps.
>
> Regards,
> Timo
>
> [1] https://data-artisans.com/blog/curious-case-broken-benchmark-revisiting-apache-flink-vs-databricks-runtime

Re: Override CaseClassSerializer with custom serializer

Posted by Timo Walther <tw...@apache.org>.
Hi Gerard,

You are correct: Kryo serializers are only used when no built-in Flink
serializer is available.

Actually, the tuple and case class serializers are among the most
performant serializers in Flink (due to their fixed length and lack of
null support). If you really want to reduce the serialization overhead,
you could look into the object reuse mode. We had this topic on the
mailing list recently; I will just copy the answer here:

If you want to improve the performance of a collect() between operators,
you could also enable object reuse. You can read more about this in
[1] (section "Issue 2: Object Reuse"), but make sure your implementation
is correct, because an operator could modify the objects of following
operators.
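
Object reuse is switched on with env.getConfig.enableObjectReuse(). To make
the warning above concrete, here is a minimal plain-Scala sketch of the
hazard. It does not use any Flink classes: Record, doubler, reader,
runWithCopies and runWithReuse are hypothetical names invented for this
illustration, and the two "run" functions only model the difference between
handing each operator its own copy and handing both the same instance.

```scala
// Plain-Scala model of two downstream operators fed by the same upstream record.
// Nothing here is Flink API; all names are hypothetical and for illustration only.
object ObjectReuseSketch {
  // A mutable record, standing in for a type whose fields can be updated in place.
  final class Record(var value: Int) {
    def copyRecord(): Record = new Record(value)
  }

  // First downstream operator: mutates its input in place and returns the new value.
  def doubler(r: Record): Int = { r.value *= 2; r.value }

  // Second downstream operator: only reads its input.
  def reader(r: Record): Int = r.value

  // Models the copying hand-off: each operator receives its own copy of the record.
  def runWithCopies(r: Record): (Int, Int) =
    (doubler(r.copyRecord()), reader(r.copyRecord()))

  // Models object reuse: both operators receive the very same instance,
  // so the mutation done by doubler is visible to reader.
  def runWithReuse(r: Record): (Int, Int) =
    (doubler(r), reader(r))

  def main(args: Array[String]): Unit = {
    println(runWithCopies(new Record(21))) // reader still sees the original value
    println(runWithReuse(new Record(21)))  // reader sees the value changed by doubler
  }
}
```

In this model the reader only gets a wrong value when another operator
mutates the shared record, which is why immutable records (such as case
classes with val fields that are never modified in place) are the safer
choice when object reuse is enabled.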

I hope this helps.

Regards,
Timo

[1] 
https://data-artisans.com/blog/curious-case-broken-benchmark-revisiting-apache-flink-vs-databricks-runtime
