Posted to issues@flink.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/08/26 11:00:00 UTC

[jira] [Updated] (FLINK-19052) Performance issue with PojoSerializer

     [ https://issues.apache.org/jira/browse/FLINK-19052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-19052:
-----------------------------------
    Labels: pull-request-available  (was: )

> Performance issue with PojoSerializer
> -------------------------------------
>
>                 Key: FLINK-19052
>                 URL: https://issues.apache.org/jira/browse/FLINK-19052
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Type Serialization System
>    Affects Versions: 1.11.1
>         Environment: Flink 1.12 master on 26.08.2020
>            Reporter: Roman Grebennikov
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2020-08-26-10-46-19-800.png, image-2020-08-26-10-49-59-400.png
>
>
> Currently PojoSerializer.createInstance() uses a reflective call to create a class instance. Because this method is called for every stream element during deserialization, the reflection overhead can become noticeable in serialization-bound cases when:
>  # The POJO class is small, so the instantiation cost is noticeable.
>  # The job does not perform heavy CPU-bound event processing.
> See this flamegraph built for the flink-benchmarks/SerializationFrameworkMiniBenchmarks.serializerPojo benchmark:
> !image-2020-08-26-10-46-19-800.png!
> The Reflection.getCallerClass method consumes a lot of CPU, mostly performing a security check on whether we are allowed to make this reflective call.
>  
> There is no real reason to perform this check for every deserialized event, so to speed things up we can cache the constructor using a MethodHandle, so that the check is performed only once. With this small fix, getCallerClass is gone from the flamegraph:
>  
> !image-2020-08-26-10-49-59-400.png!
>  
> The benchmark result:
> {noformat}
> serializerPojo thrpt 100 487.706 ± 30.480 ops/ms
> serializerPojo thrpt 100 569.828 ± 8.815 ops/ms{noformat}
> That is about a 15% increase in throughput.
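
The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual Flink patch: the names CachedInstantiator and MyPojo are hypothetical, and it assumes the POJO has an accessible no-argument constructor. The constructor lookup (and its access check) happens once, when the helper is built; each later createInstance() call only invokes the cached handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical small POJO standing in for a stream element type. */
class MyPojo {
    public int id;

    public MyPojo() {}
}

/** Hypothetical helper (not Flink's PojoSerializer) that caches a constructor handle. */
final class CachedInstantiator<T> {
    // Resolved once; the access check is done at lookup time, not per call.
    private final MethodHandle constructor;

    CachedInstantiator(Class<T> clazz) {
        try {
            // A no-arg constructor has the method type "() -> void".
            this.constructor = MethodHandles.lookup()
                    .findConstructor(clazz, MethodType.methodType(void.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "No accessible no-arg constructor for " + clazz.getName(), e);
        }
    }

    /** Creates a fresh instance through the cached handle. */
    @SuppressWarnings("unchecked")
    T createInstance() {
        try {
            return (T) constructor.invoke();
        } catch (Throwable t) {
            throw new IllegalStateException("Cannot instantiate class", t);
        }
    }
}
```

In a serializer, such a helper would presumably be created once per serializer instance and reused for every element, so the per-element cost is only the handle invocation. Whether the lookup object used here has sufficient access to arbitrary user POJO classes is an assumption that would need checking in the real code.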



--
This message was sent by Atlassian Jira
(v8.3.4#803005)