Posted to yarn-dev@hadoop.apache.org by Navina Ramesh <nr...@linkedin.com.INVALID> on 2015/06/12 03:10:47 UTC

Ordering of ResourceRequest handling in the RM

Hi yarn-devs,

I am trying to utilize YARN’s host-affinity feature in Apache Samza (SAMZA-617).

I have a question regarding the order in which the resource requests are executed by the RM. Today, we make successive requests for x (say x = 3) containers for deploying a Samza job.
It looks like:
Request 1: (container-0, $hostX)
Request 2: (container-1, $hostY)
Request 3: (container-2, $hostZ)

When implementing the callback “onContainerAllocated” in the SamzaAppMaster, how can I associate an allocated container with its corresponding containerRequest? Is there a way in YARN to associate an allocated container to its request?

If not, is it correct to assume that the RM handles the requests in a FIFO manner and hence, the order in which the onContainerAllocated callback is invoked will be the same as the request order?

This information will be very useful for Samza to implement host-affinity in its deployment model. Please let me know.

Thanks!
Navina


Re: Ordering of ResourceRequest handling in the RM

Posted by Steve Loughran <st...@hortonworks.com>.
> On 12 Jun 2015, at 20:20, Navina Ramesh <nr...@linkedin.com.INVALID> wrote:
> 
> Hi Steve,
> Thanks for your response.
> 
> I think I should have mentioned a couple of design choices we made:
> 1. *Continuous Scheduling*  - I explored the option of requesting
> containers with relaxLocality==false. But this will add to latency and
> there is no way of revoking the resource request.

You can cancel requests. AMRMClientAsync.removeContainerRequest(request);


> An alternate approach
> was to use continuous scheduling with the FairScheduler. Instead of
> waiting on node heartbeats, the RM iterates through the nodes based on the
> latest known states. We enable this and also associate rack-level and
> node-level delays. This way, if the requested resource is not available,
> it will automatically relax locality and return any available resource.
> 
> 2. All the resource requests that I make are for running the same component
> and hence their priority is the same.
> 
>>> that's a bad strategy. What if hostY had capacity before hostX? Would
>>> you expect it to be blocked until hostX was satisfied? as hostY may not
>>> be free then
> It is fine if the hostY gets allocated before hostX. I do maintain a map
> of resource request to requested host. So, if hostY is returned, I will
> know that I expected that host for container-1. The issue is how to
> associate an allocated resource that is not in my list of preferred hosts.
> If a random hostA is allocated, then which container should I assign?
> This is where I am stuck and I am wondering if there is a way around.
> 

I couldn't find a way around this. When a container comes in on an unexpected host, it just gets an instance of that component type assigned to it.

That leaves outstanding requests in the list. I deal with that once all the containers have eventually been satisfied: at that point I know everything outstanding is satisfied, so I cancel all the remaining requests.

I'd have liked a request ID to do the mapping, but that would apparently make aggregating requests impossible, so no.
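
[Editor's sketch] The bookkeeping described in this reply could look roughly like the following. It is a minimal model under stated assumptions, not Slider's actual code: the class name ComponentAllocator and its methods are hypothetical, all instances belong to one interchangeable component type, and there is at most one request per host. In a real AM, each host returned for cancellation would be turned into a call to AMRMClientAsync.removeContainerRequest(request).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model: count allocations, cancel leftover requests once all instances are placed. */
class ComponentAllocator {
    // Hosts with a request still outstanding (assumes one request per host).
    private final Set<String> outstandingHosts = new LinkedHashSet<>();
    private final int desired;   // number of instances wanted
    private int running = 0;     // number of instances placed so far

    ComponentAllocator(List<String> requestedHosts) {
        outstandingHosts.addAll(requestedHosts);
        desired = outstandingHosts.size();
    }

    /**
     * Called for every allocated container. Any host is accepted, expected or not.
     * Returns the hosts whose requests should now be cancelled; the list stays
     * empty until the desired number of instances has been reached.
     */
    List<String> onAllocated(String host) {
        running++;
        outstandingHosts.remove(host); // no-op when the host was not asked for
        if (running < desired) {
            return Collections.emptyList();
        }
        List<String> toCancel = new ArrayList<>(outstandingHosts);
        outstandingHosts.clear();
        return toCancel;
    }
}
```

With requests for hostX, hostY and hostZ, allocations on hostY, hostA and hostX place all three instances, and the last call reports hostZ as the one request left to cancel.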

Re: Ordering of ResourceRequest handling in the RM

Posted by Navina Ramesh <nr...@linkedin.com.INVALID>.
Hi Steve,
Thanks for your response.

I think I should have mentioned a couple of design choices we made:
1. *Continuous Scheduling*  - I explored the option of requesting
containers with relaxLocality==false. But this will add to latency and
there is no way of revoking the resource request. An alternate approach
was to use continuous scheduling with the FairScheduler. Instead of
waiting on node heartbeats, the RM iterates through the nodes based on the
latest known states. We enable this and also associate rack-level and
node-level delays. This way, if the requested resource is not available,
it will automatically relax locality and return any available resource.

2. All the resource requests that I make are for running the same component
and hence their priority is the same.

>> that's a bad strategy. What if hostY had capacity before hostX? Would
>> you expect it to be blocked until hostX was satisfied? as hostY may not
>> be free then
It is fine if the hostY gets allocated before hostX. I do maintain a map
of resource request to requested host. So, if hostY is returned, I will
know that I expected that host for container-1. The issue is how to
associate an allocated resource that is not in my list of preferred hosts.
If a random hostA is allocated, then which container should I assign?
This is where I am stuck and I am wondering if there is a way around.

Thanks!
Navina

 
On 6/12/15, 1:09 AM, "Steve Loughran" <st...@hortonworks.com> wrote:

>
>> On 12 Jun 2015, at 02:10, Navina Ramesh <nr...@linkedin.com.INVALID> wrote:
>> 
>> Hi yarn-devs,
>> 
>> I am trying to utilize YARN’s host-affinity feature in Apache Samza (SAMZA-617).
>> 
>> I have a question regarding the order in which the resource requests are executed by the RM. Today, we make successive requests for x (say x = 3) containers for deploying a Samza job.
>> It looks like:
>> Request 1: (container-0, $hostX)
>> Request 2: (container-1, $hostY)
>> Request 3: (container-2, $hostZ)
>
>
> The RM satisfies each request when space comes up.
>
>> 
>> When implementing the callback “onContainerAllocated” in the SamzaAppMaster, how can I associate an allocated container with its corresponding containerRequest? Is there a way in YARN to associate an allocated container to its request?
>> 
>> If not, is it correct to assume that the RM handles the requests in a FIFO manner and hence, the order in which the onContainerAllocated callback is invoked will be the same as the request order?
>> 
>
> That's a bad strategy. What if hostY had capacity before hostX? Would you
> expect it to be blocked until hostX was satisfied? hostY may not be free
> by then.
>
>> This information will be very useful for Samza to implement
>>host-affinity in its deployment model. Please let me know.
>> 
>
> If you are explicitly requesting containers on specific hosts, with
> relaxLocality==false, then the responses you get back will be for the
> hosts you asked for. All you need is a map of (hostname->request) to look
> them back up, provided you have exactly one request per host outstanding.
>
> For Apache Slider (incubating) we use a different YARN priority for each
> component type, so we can have >1 explicit request for a host provided
> they are for different components. What we can't do is tell which request
> with relaxLocality=true has been satisfied when a response comes back for
> a host that wasn't explicitly asked for. For example, if you get an
> allocation for a container on host W, which one of the other requests is
> no longer outstanding?
>


Re: Ordering of ResourceRequest handling in the RM

Posted by Steve Loughran <st...@hortonworks.com>.
> On 12 Jun 2015, at 02:10, Navina Ramesh <nr...@linkedin.com.INVALID> wrote:
> 
> Hi yarn-devs,
> 
> I am trying to utilize YARN’s host-affinity feature in Apache Samza (SAMZA-617).
> 
> I have a question regarding the order in which the resource requests are executed by the RM. Today, we make successive requests for x (say x = 3) containers for deploying a Samza job.
> It looks like:
> Request 1: (container-0, $hostX)
> Request 2: (container-1, $hostY)
> Request 3: (container-2, $hostZ)


The RM satisfies each request when space comes up.

> 
> When implementing the callback “onContainerAllocated” in the SamzaAppMaster, how can I associate an allocated container with its corresponding containerRequest? Is there a way in YARN to associate an allocated container to its request?
> 
> If not, is it correct to assume that the RM handles the requests in a FIFO manner and hence, the order in which the onContainerAllocated callback is invoked will be the same as the request order?
> 

That's a bad strategy. What if hostY had capacity before hostX? Would you expect it to be blocked until hostX was satisfied? hostY may not be free by then.

> This information will be very useful for Samza to implement host-affinity in its deployment model. Please let me know.
> 

If you are explicitly requesting containers on specific hosts, with relaxLocality==false, then the responses you get back will be for the hosts you asked for. All you need is a map of (hostname->request) to look them back up, provided you have exactly one request per host outstanding.
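
[Editor's sketch] A minimal illustration of the (hostname->request) map, assuming strict locality and exactly one outstanding request per host. The class HostRequestTracker and its method names are hypothetical, and the request is modelled as a plain string label such as "container-0" rather than a real YARN ContainerRequest.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical tracker: one outstanding strict-locality request per host. */
class HostRequestTracker {
    // hostname -> label of the request still waiting on that host
    private final Map<String, String> pendingByHost = new HashMap<>();

    /** Records a request; a second outstanding request for the same host is rejected. */
    void request(String host, String label) {
        if (pendingByHost.putIfAbsent(host, label) != null) {
            throw new IllegalStateException("request already outstanding for " + host);
        }
    }

    /**
     * Called from the allocation callback with the host of the allocated container.
     * Returns the matching request label, or empty if that host was never requested.
     */
    Optional<String> onAllocated(String host) {
        return Optional.ofNullable(pendingByHost.remove(host));
    }

    /** Number of requests not yet satisfied. */
    int outstanding() {
        return pendingByHost.size();
    }
}
```

The lookup only works because the host uniquely identifies the request; an allocation on a host that was never asked for comes back empty, which is exactly the case the map cannot resolve.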

For Apache Slider (incubating) we use a different YARN priority for each component type, so we can have >1 explicit request for a host provided they are for different components. What we can't do is tell which request with relaxLocality=true has been satisfied when a response comes back for a host that wasn't explicitly asked for. For example, if you get an allocation for a container on host W, which one of the other requests is no longer outstanding?
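
[Editor's sketch] The priority-per-component idea can be modelled by keying outstanding requests on (priority, host) instead of host alone. This is an illustrative sketch, not Slider's implementation: ComponentRequestTracker and its methods are hypothetical names. In a real AM the two lookup values would presumably come from the allocated container, via Container.getPriority() and Container.getNodeId().getHost().

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker keyed by (priority, host), with one priority per component type. */
class ComponentRequestTracker {
    // priority -> (hostname -> label of the outstanding request)
    private final Map<Integer, Map<String, String>> pending = new HashMap<>();

    /** Records an explicit request for a host at the component's priority. */
    void request(int priority, String host, String label) {
        Map<String, String> byHost = pending.computeIfAbsent(priority, p -> new HashMap<>());
        if (byHost.putIfAbsent(host, label) != null) {
            throw new IllegalStateException(
                "request already outstanding for priority " + priority + " on " + host);
        }
    }

    /**
     * Looks up and removes the request matching an allocation.
     * Returns null when the host was not explicitly requested at that priority,
     * which is the ambiguous relaxed-locality case described above.
     */
    String onAllocated(int priority, String host) {
        Map<String, String> byHost = pending.get(priority);
        return byHost == null ? null : byHost.remove(host);
    }
}
```

Two components can now both ask for the same host without colliding, but an allocation on an unrequested host W still returns null: nothing in the response says which relaxed request it satisfied.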


Re: Ordering of ResourceRequest handling in the RM

Posted by Navina Ramesh <nr...@linkedin.com.INVALID>.

From: Navina Ramesh <nr...@linkedin.com>
Date: Thursday, June 11, 2015 at 6:10 PM
To: "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>
Cc: "kasha@cloudera.com" <ka...@cloudera.com>, "vinodkv@apache.org" <vi...@apache.org>, "Yi Pan (Data Infrastructure)" <yi...@linkedin.com>, Chris Riccomini <cr...@apache.org>
Subject: Ordering of ResourceRequest handling in the RM

Hi yarn-devs,

I am trying to utilize YARN’s host-affinity feature in Apache Samza (SAMZA-617).

I have a question regarding the order in which the resource requests are executed by the RM. Today, we make successive requests for x (say x = 3) containers for deploying a Samza job.
It looks like:
Request 1: (container-0, $hostX)
Request 2: (container-1, $hostY)
Request 3: (container-2, $hostZ)

When implementing the callback “onContainerAllocated” in the SamzaAppMaster, how can I associate an allocated container with its corresponding containerRequest? Is there a way in YARN to associate an allocated container to its request?

If not, is it correct to assume that the RM handles the requests in a FIFO manner and hence, the order in which the onContainerAllocated callback is invoked will be the same as the request order?

This information will be very useful for Samza to implement host-affinity in its deployment model. Please let me know.

Thanks!
Navina