Posted to yarn-issues@hadoop.apache.org by "Wilfred Spiegelenburg (JIRA)" <ji...@apache.org> on 2018/12/03 00:24:00 UTC
[jira] [Commented] (YARN-8789) Add BoundedQueue to AsyncDispatcher

    [ https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16706529#comment-16706529 ] 

Wilfred Spiegelenburg commented on YARN-8789:
---------------------------------------------

Hi [~belugabehr], can you explain a bit more about when you saw the issue and which release you were on? [~pbacsko] already implemented an MR-specific change in MAPREDUCE-5124, which has helped our deployments a lot. In that change we merge (coalesce) the events in the AM. That helps with event queue growth because not every event becomes a new entry. Do we still need a bounded queue for the events after that?
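
To make the coalescing idea concrete, here is a minimal sketch of how merging events can keep a queue from growing. It is not the MAPREDUCE-5124 code: the class name, the task-id key, and the status strings are all hypothetical, and it assumes that only the latest status per task matters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the MAPREDUCE-5124 implementation.
class CoalescingQueueDemo {
    // Pending events keyed by task id, kept in first-arrival order.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // A new status for a task that already has a pending event replaces
    // that event instead of adding another entry.
    synchronized void add(String taskId, String status) {
        pending.put(taskId, status);
    }

    synchronized int size() {
        return pending.size();
    }

    // Hands all pending events to the caller and empties the queue.
    synchronized List<String> drain() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : pending.entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        pending.clear();
        return out;
    }

    public static void main(String[] args) {
        CoalescingQueueDemo q = new CoalescingQueueDemo();
        q.add("task_1", "RUNNING");
        q.add("task_2", "RUNNING");
        q.add("task_1", "DONE"); // merged into the existing task_1 entry
        System.out.println(q.size());
        System.out.println(q.drain());
    }
}
```

With three updates for two tasks, the queue holds two entries rather than three. The saving only applies to events that can safely be merged, so it depends on the event type.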

The other thing that I do not understand is how your fix would really help the AM. It still needs to process all the events. You just stop adding them when the queue is full; they do not disappear. I think that if the AM cannot keep up with processing, refusing to accept more events does not solve the problem; it just pushes it further away.
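
The behaviour in question can be sketched with a plain java.util.concurrent bounded queue. This is an illustration under assumptions, not the YARN-8789 patch or the AsyncDispatcher code; the class and method names are made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not the YARN AsyncDispatcher.
class BoundedQueueDemo {

    // Enqueues up to `total` events with no consumer running and returns how
    // many were accepted before the queue was full.
    static int acceptedWithoutConsumer(int capacity, int total) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < total; i++) {
            // offer() with a timeout stands in for a blocking put(): a real
            // producer would stall here until the consumer frees a slot.
            if (!queue.offer("event-" + i, 10, TimeUnit.MILLISECONDS)) {
                break;
            }
            accepted++;
        }
        return accepted;
    }

    // Runs a producer using blocking put() against a consumer thread and
    // returns how many events the consumer processed.
    static int processedWithConsumer(int capacity, int total) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] processed = {0};
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    queue.take();
                    processed[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (int i = 0; i < total; i++) {
            queue.put(i); // blocks while the queue is full; nothing is dropped
        }
        consumer.join(); // join() makes the consumer's count visible here
        return processed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(acceptedWithoutConsumer(3, 10));
        System.out.println(processedWithConsumer(3, 10));
    }
}
```

With a capacity of 3 and 10 events, the first call accepts 3 and then stalls, while the second still processes all 10. The bound caps memory held by the queue, but the total work is unchanged and the waiting moves to whoever produces the events.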

> Add BoundedQueue to AsyncDispatcher
> -----------------------------------
>
>                 Key: YARN-8789
>                 URL: https://issues.apache.org/jira/browse/YARN-8789
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: applications
>    Affects Versions: 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: YARN-8789.1.patch, YARN-8789.10.patch, YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing with an OOM exception.  It had many thousands of Mappers and thousands of Reducers.  The logs showed that the event-queue of {{AsyncDispatcher}} had a very large number of items in it and was seemingly never decreasing.
> I started looking at the code and thought it could use some clean up, simplification, and the ability to specify a bounded queue so that any incoming events are throttled until they can be processed.  This will protect the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx


