Posted to mapreduce-issues@hadoop.apache.org by "Peter Bacsko (JIRA)" <ji...@apache.org> on 2017/11/10 15:57:00 UTC

[jira] [Comment Edited] (MAPREDUCE-5124) AM lacks flow control for task events

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247678#comment-16247678 ] 

Peter Bacsko edited comment on MAPREDUCE-5124 at 11/10/17 3:56 PM:
-------------------------------------------------------------------

Thanks Jason.

Are you sure we can't just replace the status updates? I checked the code of TaskReporter, and to me it seems that counters/fetch failures cannot be removed, only altered/increased. If you think about it, we send updates every 3 seconds anyway - so if replacing them were a problem, it would already show up on the client side, too (that is, as lost data).

I agree with your comment regarding the mapping - passing a reference in the event is a good idea.
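
To make the "replace instead of queue" idea concrete, here is a minimal sketch of a coalescing queue. It is not the attached patch and not Hadoop code: CoalescingStatusQueue, StatusUpdate and their fields are hypothetical names invented for this illustration. It assumes, as suggested above, that every status update carries cumulative values, so a newer update for the same task attempt can overwrite a pending one without losing information.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical class, not part of Hadoop or of the
// MAPREDUCE-5124 patches.
class CoalescingStatusQueue {

    // Hypothetical stand-in for a task status report. Both values are assumed
    // to be cumulative, i.e. a later update supersedes an earlier one.
    static final class StatusUpdate {
        public final String attemptId;
        public final long recordsProcessed;
        public final int fetchFailures;

        StatusUpdate(String attemptId, long recordsProcessed, int fetchFailures) {
            this.attemptId = attemptId;
            this.recordsProcessed = recordsProcessed;
            this.fetchFailures = fetchFailures;
        }
    }

    // Insertion-ordered map holding at most one pending update per attempt.
    private final Map<String, StatusUpdate> pending = new LinkedHashMap<>();

    // Adds an update. If one is already pending for the same attempt, it is
    // replaced in place and keeps its original position in the queue.
    public synchronized void offer(StatusUpdate update) {
        pending.put(update.attemptId, update);
    }

    // Removes and returns the oldest pending entry, or null if there is none.
    public synchronized StatusUpdate poll() {
        Iterator<Map.Entry<String, StatusUpdate>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        StatusUpdate next = it.next().getValue();
        it.remove();
        return next;
    }

    public synchronized int size() {
        return pending.size();
    }
}
```

With this shape the number of pending entries can never exceed the number of active task attempts, no matter how slowly the consumer drains the queue. Whether that is safe for the real status updates depends on the assumption above holding for every field they carry.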


> AM lacks flow control for task events
> -------------------------------------
>
>                 Key: MAPREDUCE-5124
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.3-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Assignee: Peter Bacsko
>         Attachments: MAPREDUCE-5124-CoalescingPOC-1.patch, MAPREDUCE-5124-CoalescingPOC2.patch, MAPREDUCE-5124-proto.2.txt, MAPREDUCE-5124-prototype.txt
>
>
> The AM does not have any flow control to limit the incoming rate of events from tasks. If the AM is unable to keep pace with the rate of incoming events for a sufficient period of time, it will eventually exhaust the heap and crash. MAPREDUCE-5043 addressed a major bottleneck for event processing, but the AM could still fall behind if it is starved for CPU and/or handling a very large job with tens of thousands of active tasks.


