Posted to issues@flink.apache.org by "Ufuk Celebi (JIRA)" <ji...@apache.org> on 2015/10/20 21:53:27 UTC

[jira] [Commented] (FLINK-2882) Improve performance of string conversions

    [ https://issues.apache.org/jira/browse/FLINK-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965649#comment-14965649 ] 

Ufuk Celebi commented on FLINK-2882:
------------------------------------

Regarding {{toShortString}}: As far as I know, this is used exclusively in the network stack, and there only for debugging, as part of the {{toString}} methods of result partitions, which require two IDs to identify their source/target (and that gets very long even as hex strings). Can you tell what fraction of the calls come from {{toShortString}}? We could also consider removing that variant, as it was mostly useful in the early days while debugging the network stack.

In general, I'm wondering why the ID {{toString}} methods are called so often. Can you share the top 10 or so stack traces leading to them? And what log level are you using?

In any case, both your suggestions sound reasonable to me.

> Improve performance of string conversions
> -----------------------------------------
>
>                 Key: FLINK-2882
>                 URL: https://issues.apache.org/jira/browse/FLINK-2882
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.10
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>
> {{AbstractID.toString()}} and {{AbstractID.toShortString()}} call {{StringUtils.byteToHexString(...)}} which uses a StringBuilder to convert from binary to hex. This is a hotspot when scaling the number of workers.
> While testing on my single node with parallelism=512, jvisualvm reports 600,000 calls taking 13.4 seconds. Improving {{StringUtils.byteToHexString(...)}} reduces the time to 1.3 seconds. Additionally, memoizing the string values in {{AbstractID}} reduces the time to 350 ms and the number of calls to {{StringUtils.byteToHexString(...)}} to ~1000.
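
For readers of the archive, the two suggestions in the description could look roughly like the sketch below. It is not the patch attached to this issue: {{HexSketch}}, {{bytesToHex}} and {{SketchId}} are hypothetical names, and the real {{AbstractID}} may store its bytes differently. The sketch assumes the ID is immutable, which is what makes caching the string safe.

```java
class HexSketch {
    // Lookup table: one array read per nibble instead of per-byte formatting.
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Converts bytes to lowercase hex using one preallocated char array. */
    static String bytesToHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;          // treat the byte as unsigned
            out[2 * i] = HEX[v >>> 4];        // high nibble
            out[2 * i + 1] = HEX[v & 0x0F];   // low nibble
        }
        return new String(out);
    }

    /** Hypothetical immutable ID; a stand-in for illustration, not Flink's AbstractID. */
    static final class SketchId {
        private final byte[] bytes;
        // Memoized hex form. A race between threads is harmless here because
        // every thread computes an equal, immutable String.
        private String cached;

        SketchId(byte[] bytes) {
            this.bytes = bytes.clone();       // defensive copy keeps the ID immutable
        }

        @Override
        public String toString() {
            String s = cached;
            if (s == null) {
                s = bytesToHex(bytes);
                cached = s;
            }
            return s;
        }
    }
}
```

With this shape, repeated {{toString}} calls on the same ID return the cached string, so the conversion runs roughly once per ID rather than once per call.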



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)