Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2016/04/01 23:27:25 UTC

[jira] [Commented] (TEZ-3195) TezMerger OOM: unreserve called while memory still held

    [ https://issues.apache.org/jira/browse/TEZ-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222362#comment-15222362 ] 

Jonathan Eagles commented on TEZ-3195:
--------------------------------------

This patch has been verified to make the byte buffers unreachable in the heap, but they are still not always garbage collected. I am still tracking down that second issue; in the meantime, feedback on whether the patch breaks any API assumptions would be welcome.

> TezMerger OOM: unreserve called while memory still held
> -------------------------------------------------------
>
>                 Key: TEZ-3195
>                 URL: https://issues.apache.org/jira/browse/TEZ-3195
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3195.1-branch-0.7.patch, TEZ-3195.1.patch
>
>
> When the reader is closed in MergeQueue#adjustPriorityQueue, unreserve is called while the byte buffer is still referenced in several places in the code. In the case below, the Fetcher was trying to fetch a nearly 100 MB map output, which exposed this race condition.
> {noformat}
> Caused by: java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
> 	at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput.<init>(MapOutput.java:75)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput.createMemoryMapOutput(MapOutput.java:124)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.unconditionalReserve(MergeManager.java:437)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.reserve(MergeManager.java:427)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyMapOutput(FetcherOrderedGrouped.java:481)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:286)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:176)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:191)
> {noformat}
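
For readers unfamiliar with the failure mode, here is a minimal sketch of the ordering problem described above. It is not Tez code: Budget, Segment, tryCreate and close are hypothetical stand-ins, and the sketch only assumes a byte-count budget that is decremented by unreserve. The point is that the accounting should be released only after the last reference to the buffer has been dropped; otherwise a fetcher can reserve the "freed" bytes while the old array is still reachable.

```java
public class ReserveSketch {
    /** Hypothetical byte-count budget; a stand-in, not Tez's MergeManager. */
    static final class Budget {
        private final long limit;
        private long used;

        Budget(long limit) { this.limit = limit; }

        /** Returns true and records the bytes if they fit under the limit. */
        synchronized boolean reserve(long bytes) {
            if (used + bytes > limit) {
                return false;
            }
            used += bytes;
            return true;
        }

        synchronized void unreserve(long bytes) { used -= bytes; }

        synchronized long used() { return used; }
    }

    /** Hypothetical in-memory segment that owns one reserved buffer. */
    static final class Segment {
        private final Budget budget;
        private final int size;
        private byte[] data;

        private Segment(Budget budget, int size) {
            this.budget = budget;
            this.size = size;
            this.data = new byte[size];
        }

        /** Reserves first, allocates second; returns null if the budget is full. */
        static Segment tryCreate(Budget budget, int size) {
            return budget.reserve(size) ? new Segment(budget, size) : null;
        }

        boolean holdsBuffer() { return data != null; }

        /** Drops the buffer reference before releasing the accounting. Idempotent. */
        synchronized void close() {
            if (data == null) {
                return;
            }
            data = null;             // make the array unreachable from this object first
            budget.unreserve(size);  // only then let other callers reserve the bytes
        }
    }

    public static void main(String[] args) {
        Budget budget = new Budget(100);
        Segment first = Segment.tryCreate(budget, 64);
        System.out.println("second fits before close: " + (Segment.tryCreate(budget, 64) != null));
        first.close();
        System.out.println("second fits after close: " + (Segment.tryCreate(budget, 64) != null));
    }
}
```

Even with this ordering, the collector decides when the heap is actually reclaimed, so an unreachable buffer may still occupy memory for a while after unreserve returns.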



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)