Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2016/02/13 17:25:18 UTC

[jira] [Comment Edited] (TEZ-3115) Shuffle string handling adds significant memory overhead

    [ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146041#comment-15146041 ] 

Jonathan Eagles edited comment on TEZ-3115 at 2/13/16 4:25 PM:
---------------------------------------------------------------

Set up some of the obvious initial plumbing to see this idea in practice. I left the fundamental MapHost id as hostIdentifier (host:port). If we want to intern the host strings, we will need to split the hostIdentifier and store the host and port separately. I'll do some runs to measure where we are at this point and use patch 1 as a checkpoint.
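
For illustration only, here is a minimal sketch of what splitting the hostIdentifier and interning the host part could look like. The class name HostPortSketch, its methods, and the pool map are hypothetical and are not taken from the patch; the sketch assumes the identifier is always of the form host:port.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not the TEZ-3115 patch: splits "host:port" and
// keeps one shared String instance per distinct host name.
public final class HostPortSketch {
    // Assumed application-level pool; String.intern() would be an alternative.
    private static final ConcurrentMap<String, String> HOST_POOL =
        new ConcurrentHashMap<>();

    private final String host;
    private final int port;

    private HostPortSketch(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Parses "host:port", storing the port as an int and the host as a
    // pooled String so repeated hosts share a single object.
    public static HostPortSketch parse(String hostIdentifier) {
        int idx = hostIdentifier.lastIndexOf(':');
        if (idx < 0) {
            throw new IllegalArgumentException(
                "Expected host:port but got: " + hostIdentifier);
        }
        String host = hostIdentifier.substring(0, idx);
        int port = Integer.parseInt(hostIdentifier.substring(idx + 1));
        // putIfAbsent returns the existing pooled instance, or null if
        // this host was just added to the pool.
        String pooled = HOST_POOL.putIfAbsent(host, host);
        return new HostPortSketch(pooled != null ? pooled : host, port);
    }

    public String getHost() {
        return host;
    }

    public int getPort() {
        return port;
    }
}
```

With something along these lines, many map outputs fetched from the same node would reference one host String rather than each holding its own host:port copy. Whether that saves enough memory to matter is what the measurement runs would need to show.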


was (Author: jeagles):
Set up some of the obvious initial plumbing to see this idea in practice. I left the fundamental mapout id as hostIdentifier (host:port). If we want to intern the host strings, we will want to break the hostIdentifier up and store them separately. I'll do some runs to measure where we are at this point and use patch 1 as a checkpoint.

> Shuffle string handling adds significant memory overhead
> --------------------------------------------------------
>
>                 Key: TEZ-3115
>                 URL: https://issues.apache.org/jira/browse/TEZ-3115
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>         Attachments: TEZ-3115.1.patch
>
>
> While investigating the OOM heap dump from TEZ-3114 I noticed that the ShuffleManager and other shuffle-related objects were holding onto many strings that added up to over a hundred megabytes of memory.


