Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/07/07 10:41:11 UTC
[jira] [Commented] (FLINK-3599) GSoC: Code Generation in Serializers
[ https://issues.apache.org/jira/browse/FLINK-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365935#comment-15365935 ]
ASF GitHub Bot commented on FLINK-3599:
---------------------------------------
GitHub user Xazax-hun opened a pull request:
https://github.com/apache/flink/pull/2211
[WIP][FLINK-3599] Code generation for PojoSerializer and PojoComparator
The current implementation of the serializers can be a performance bottleneck in some scenarios. These performance problems were also reported on the mailing list recently [1].
For example, the PojoSerializer accesses fields through reflection, which is slow [2].
For the complete proposal see [3].
This pull request implements code generation support for PojoComparators and PojoSerializers. On my machine, I measured a performance improvement of about 10% for the WordCountPojo example. This pull request does not yet implement distribution of the generated code to the task managers.
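To illustrate the difference between reflective and generated field access, here is a minimal sketch. It is not the code produced by this pull request: the WordCount POJO, the FieldReader interface, and both reader classes are hypothetical names chosen for the example, and the "generated" class is written by hand to stand in for what a generator could emit for one concrete type.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

public class FieldAccessSketch {

    /** Hypothetical POJO, loosely modelled on a word-count record. */
    public static class WordCount {
        public String word;
        public int count;

        public WordCount() {}

        public WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    /** Hypothetical interface: extracts a POJO's field values in a fixed order. */
    public interface FieldReader<T> {
        Object[] read(T value);
    }

    /** Reflection-based access: one Field.get call per field and per record. */
    public static final class ReflectiveReader<T> implements FieldReader<T> {
        private final Field[] fields;

        public ReflectiveReader(Class<T> clazz, String... names) {
            fields = new Field[names.length];
            try {
                for (int i = 0; i < names.length; i++) {
                    fields[i] = clazz.getDeclaredField(names[i]);
                    fields[i].setAccessible(true);
                }
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException(e);
            }
        }

        @Override
        public Object[] read(T value) {
            Object[] out = new Object[fields.length];
            try {
                for (int i = 0; i < fields.length; i++) {
                    out[i] = fields[i].get(value);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            return out;
        }
    }

    /** Hand-written stand-in for a class a generator could emit for WordCount. */
    public static final class GeneratedWordCountReader implements FieldReader<WordCount> {
        @Override
        public Object[] read(WordCount value) {
            // Direct field access: no reflective lookup at runtime.
            return new Object[] {value.word, value.count};
        }
    }

    public static void main(String[] args) {
        WordCount wc = new WordCount("flink", 3);
        Object[] viaReflection =
                new ReflectiveReader<>(WordCount.class, "word", "count").read(wc);
        Object[] viaGenerated = new GeneratedWordCountReader().read(wc);
        // Both readers should yield the same field values.
        System.out.println(Arrays.equals(viaReflection, viaGenerated));
    }
}
```

Both readers return the same values; the type-specific one simply avoids the per-field reflective call, which is the kind of overhead code generation is meant to remove.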
[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Tuple-performance-and-the-curious-JIT-compiler-td10666.html
[2] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java#L369
[3] https://docs.google.com/document/d/1VC8lCeErx9kI5lCMPiUn625PO0rxR-iKlVqtt3hkVnk
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Xazax-hun/flink serializer_codegen
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2211.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2211
----
commit 6263ebe496ed7a0ac9ca9df35ffcdb8633519944
Author: Gabor Horvath <xa...@gmail.com>
Date: 2016-04-17T13:40:33Z
Implement PojoSerializer and PojoComparator generators.
commit be698b44453f10add284db1c5dee24f719a87902
Author: Gabor Horvath <xa...@gmail.com>
Date: 2016-07-03T13:58:41Z
Migrate code generation templates from string literals to files.
commit d8c63a1749a439907ef6bfbdb2da1962df7b61d3
Author: Gabor Horvath <xa...@gmail.com>
Date: 2016-07-06T11:23:29Z
Fix a bunch of test failures.
----
> GSoC: Code Generation in Serializers
> ------------------------------------
>
> Key: FLINK-3599
> URL: https://issues.apache.org/jira/browse/FLINK-3599
> Project: Flink
> Issue Type: Improvement
> Components: Type Serialization System
> Reporter: Márton Balassi
> Assignee: Gabor Horvath
> Labels: gsoc2016, mentor
>
> The current implementation of the serializers can be a
> performance bottleneck in some scenarios. These performance problems were
> also reported on the mailing list recently [1].
> E.g. the PojoSerializer uses reflection for accessing the fields, which is slow [2].
> For the complete proposal see [3].
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Tuple-performance-and-the-curious-JIT-compiler-td10666.html
> [2] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java#L369
> [3] https://docs.google.com/document/d/1VC8lCeErx9kI5lCMPiUn625PO0rxR-iKlVqtt3hkVnk