Posted to common-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2012/10/25 04:00:13 UTC

[jira] [Commented] (HADOOP-8942) Thundering herd of RPCs with large responses leads to OOM

    [ https://issues.apache.org/jira/browse/HADOOP-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483821#comment-13483821 ] 

Hudson commented on HADOOP-8942:
--------------------------------

Integrated in Hadoop-trunk-Commit #2925 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/2925/])
    MAPREDUCE-4730. Fix Reducer's EventFetcher to scale the map-completion requests slowly to avoid HADOOP-8942. Contributed by Jason Lowe. (Revision 1401941)

     Result = SUCCESS
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1401941
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestEventFetcher.java

                
> Thundering herd of RPCs with large responses leads to OOM
> ---------------------------------------------------------
>
>                 Key: HADOOP-8942
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8942
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.23.3
>            Reporter: Jason Lowe
>
> When a large number of clients all make calls that return large amounts of response data, the IPC server can exhaust its memory.  See MAPREDUCE-4730 for an example of this.
> There does not appear to be any flow control between the server's handler threads and the responder thread.  If a handler thread cannot write out all of the response data without blocking, it queues the remainder for the responder thread and moves on to the next call in the call queue.  With enough clients, the handler threads can overwhelm the heap by queueing response data faster than the responder thread can write it out.
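
For illustration only, the missing flow control described above could look something like the sketch below: a byte budget shared by the handler threads and the responder thread. This is not the fix that was committed and not code from Hadoop's ipc.Server; the class name ResponseBudget and its methods are hypothetical. A handler would reserve room before queueing leftover response bytes, and the responder would release that room once the bytes reach the socket.

```java
import java.util.concurrent.Semaphore;

/**
 * Hypothetical sketch, not part of Hadoop: caps the number of response
 * bytes that may sit in memory waiting for the responder thread.
 */
class ResponseBudget {
    private final int capacity;        // maximum bytes allowed in the queue
    private final Semaphore permits;   // one permit per free byte of budget

    ResponseBudget(int maxQueuedBytes) {
        this.capacity = maxQueuedBytes;
        this.permits = new Semaphore(maxQueuedBytes);
    }

    /**
     * Handler side: try to reserve room for leftover response bytes.
     * Returns false when the budget is exhausted, so the caller can
     * hold off instead of growing the heap.
     */
    boolean tryReserve(int bytes) {
        return permits.tryAcquire(bytes);
    }

    /** Responder side: return the room after the bytes were written out. */
    void release(int bytes) {
        permits.release(bytes);
    }

    /** Bytes currently reserved, i.e. queued but not yet written. */
    int queuedBytes() {
        return capacity - permits.availablePermits();
    }
}
```

With a budget like this, a handler that fails to reserve room would have to wait or stop taking new calls, which is the back-pressure the description says is absent. What the handler should do when the budget is full is a design choice this sketch leaves open.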

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira