Posted to yarn-issues@hadoop.apache.org by "Karthik Kambatla (JIRA)" <ji...@apache.org> on 2016/01/02 19:21:40 UTC

[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

    [ https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076629#comment-15076629 ] 

Karthik Kambatla commented on YARN-3870:
----------------------------------------

bq. Karthik Kambatla, I am thinking whether we also need some update for the response part to correlate it with the ResourceRequest ID. As the scheduling is asynchronous, AM will also need to know the relation between response and request.

bq. If the AM doesn't add an ID, the RM could add one. Or, we could have the RM add the IDs and return them to the AM for help with book keeping.

Thought more about this. Since one AllocateRequest could have multiple ResourceRequests, the protocol becomes quite complicated if the RM creates the ID instead of the AM. How about we expect the AM to set this ID? If the AM doesn't set it, we treat the requests the same way we do today (ID = -1). In the AllocateResponse, the RM could send the last received ResourceRequest. The AM could look at this ACK to see whether it has to resend any requests. The AMRMClient and MR-AM could be updated to do this.
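
To make the proposed handshake concrete, here is a minimal sketch in Python. It is not YARN code: the `request_id` and `last_received_id` fields, and the classes themselves, are hypothetical names used only for illustration. It also assumes that the AM hands out monotonically increasing IDs and that the ACK carries the ID of the last ResourceRequest the RM received, which is one possible reading of the proposal.

```python
from dataclasses import dataclass

NO_ID = -1  # a request without an AM-set ID is treated the way it is today


@dataclass
class ResourceRequest:
    resource_name: str       # host, rack, or "*"
    num_containers: int
    request_id: int = NO_ID  # hypothetical field, set by the AM


@dataclass
class AllocateResponse:
    last_received_id: int    # hypothetical ACK field, set by the RM


class ResourceManager:
    """Toy RM that only tracks the highest request ID it has seen."""

    def __init__(self):
        self.last_received_id = NO_ID

    def allocate(self, requests):
        for req in requests:
            if req.request_id != NO_ID:
                self.last_received_id = max(self.last_received_id, req.request_id)
        return AllocateResponse(self.last_received_id)


class ApplicationMaster:
    """Toy AM that assigns IDs and remembers requests until they are ACKed."""

    def __init__(self):
        self.next_id = 0
        self.unacked = {}

    def new_request(self, resource_name, num_containers):
        req = ResourceRequest(resource_name, num_containers, self.next_id)
        self.unacked[req.request_id] = req
        self.next_id += 1
        return req

    def handle_response(self, response):
        # Forget everything the RM has acknowledged; return what must be resent.
        for rid in [r for r in self.unacked if r <= response.last_received_id]:
            del self.unacked[rid]
        return list(self.unacked.values())


am, rm = ApplicationMaster(), ResourceManager()
r0 = am.new_request("host1", 1)
r1 = am.new_request("host2", 1)
response = rm.allocate([r0])           # pretend r1 never reached the RM
to_resend = am.handle_response(response)
```

In this sketch `to_resend` holds only `r1`, so the AM knows from the ACK alone which request to send again. A request built without an ID keeps `request_id == NO_ID` and never moves the ACK, matching the "same way we do today" fallback.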

> Providing raw container request information for fine scheduling
> ---------------------------------------------------------------
>
>                 Key: YARN-3870
>                 URL: https://issues.apache.org/jira/browse/YARN-3870
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, applications, capacityscheduler, fairscheduler, resourcemanager, scheduler, yarn
>            Reporter: Lei Guo
>
> Currently, when the AM sends container requests to the RM and scheduler, it expands individual container requests into host/rack/any format. For instance, if I ask for a container with preference "host1, host2, host3", and all three hosts are in the same rack rack1, then instead of sending one raw container request to the RM/scheduler with the raw preference list, the AM expands it into 5 different objects: host1, host2, host3, rack1, and any. By the time the scheduler receives this information, the raw request has already been lost. This is OK for a single container request, but it causes trouble when dealing with multiple container requests from the same application. Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> The application requests two containers with different data locality preferences:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This results in the following container request list when the client sends the request to the RM/scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for the scheduler to make the right decision without knowing the raw container requests. The situation gets worse when dealing with affinity, anti-affinity, or gang scheduling.
> We need some way to provide raw container request information for fine scheduling purposes.
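
The information loss described in the issue can be reproduced with a short Python sketch. The `expand` function and the `RACK_OF` table are illustrative stand-ins, not the actual client code; they only mimic the host/rack/any expansion for the six-host example above.

```python
from collections import Counter

# Topology from the issue description: two racks with three hosts each.
RACK_OF = {
    "host1": "rack1", "host2": "rack1", "host3": "rack1",
    "host4": "rack2", "host5": "rack2", "host6": "rack2",
}


def expand(raw_requests):
    """Collapse raw per-container host preferences into host/rack/any counts."""
    counts = Counter()
    for hosts in raw_requests:
        for host in hosts:
            counts[host] += 1
        for rack in {RACK_OF[h] for h in hosts}:
            counts[rack] += 1   # one entry per distinct rack the container touches
        counts["any"] += 1      # one "any" entry per container
    return counts


# The two containers from the issue description.
c1 = ["host1", "host2", "host4"]
c2 = ["host2", "host3", "host5"]
expanded = expand([c1, c2])

# A different pair of raw requests (host4 and host5 swapped between containers).
alternative = expand([["host1", "host2", "host5"], ["host2", "host3", "host4"]])
```

Both `expanded` and `alternative` come out as the same table of counts, so a scheduler that sees only the expanded form cannot tell which hosts were requested together for the same container.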



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)