Posted to mapreduce-dev@hadoop.apache.org by Patrick Wendell <pw...@gmail.com> on 2011/11/25 23:26:47 UTC

Scheduler Questions

Hey All,

Two questions about the MR2 scheduler code, for anyone more familiar.

- The return type of allocate() suggests that the AM will get some
instantaneous allocation of resources. Reading through the two current
schedulers, however, it looks like containers are only allocated after Node
heartbeats, and that allocate() will rarely return any useful allocation.
Is that correct or am I missing something?

- The SchedulerNode class maintains state for both a single "reserved"
container and several "allocated" containers. As far as I can tell only the
capacity scheduler uses the former, but it's not clear to me what the
reserved container is for. Is this to enforce hard limits on allocation
reserved for particular users?

- Patrick

Re: Scheduler Questions

Posted by Patrick Wendell <pw...@gmail.com>.

Thanks Arun,

That all makes sense.

One more question: what semantics are ascribed to the "priorities" given
to the scheduler? A few possibilities would be:

a) Within a request, containers should be allocated in strict order of
priority (i.e. all priority 1 containers should be allocated before
any priority 2 container). This would make sense for MR requests where
maps are assigned higher priority than reduces.

b) Across all requests from an application, containers should be
allocated in strict order of priority.

c) Across all requests to the scheduler, containers should be
allocated in strict order of priority.

Is it any of these? Or is it undefined and up to the scheduler how to
use priorities?
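
To make the three readings concrete, here is a small Python sketch that
orders the same set of requests under each one. Everything in it is
hypothetical: the tuple layout, the scope names, and the assumption that
a lower number means a higher priority (as in the priority 1 before
priority 2 example above). It is not scheduler code, and ties simply
keep their arrival order.

```python
# Each request is (application, request_id, priority).
# Assumption for illustration: a lower number means a higher priority.
requests = [
    ("app1", "r1", 2),
    ("app1", "r1", 1),
    ("app2", "r1", 1),
    ("app1", "r2", 1),
]

# Sort key for each reading of "strict order of priority".
KEYS = {
    "request": lambda r: (r[0], r[1], r[2]),   # (a) strict only within a request
    "application": lambda r: (r[0], r[2]),     # (b) strict across one application
    "global": lambda r: (r[2],),               # (c) strict across all requests
}

def allocation_order(reqs, scope):
    """Return reqs in the order they would be served under the given reading."""
    # sorted() is stable, so equal keys keep their arrival order.
    return sorted(reqs, key=KEYS[scope])

for scope in ("request", "application", "global"):
    print(scope, allocation_order(requests, scope))
```

Under (c), app2's priority 1 request is served before app1's priority 2
request; under (a) and (b) it is not, because priorities are never
compared across applications.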

- Patrick

On Sat, Nov 26, 2011 at 5:20 PM, Arun Murthy <ac...@hortonworks.com> wrote:
>> On Sat, Nov 26, 2011 at 3:56 AM, Patrick Wendell <pw...@gmail.com> wrote:
>>
>>> Hey All,
>>>
>>> Two questions about the MR2 scheduler code, for anyone more familiar.
>>>
>>> - The return type of allocate() suggests that the AM will get some
>>> instantaneous allocation of resources. Reading through the two current
>>> schedulers, however, it looks like containers are only allocated after Node
>>> heartbeats, and that allocate() will rarely return any useful allocation.
>>> Is that correct or am I missing something?
>
> Typically applications submit all resource requests up-front and then
> get allocations in subsequent calls to allocate...
>
>>>
>>> - The SchedulerNode class maintains state for both a single "reserved"
>>> container and several "allocated" containers. As far as I can tell only the
>>> capacity scheduler uses the former, but it's not clear to me what the
>>> reserved container is for. Is this to enforce hard limits on allocation
>>> reserved for particular users?
>
> Reserved containers are partial containers reserved while the
> scheduler waits for a larger resource request to be fulfilled. For
> example, if an 8G container is required and only 4G is available,
> schedulers can 'reserve' the 4G while waiting for the rest.
>
> Reservation prevents starvation of larger resource requests, but
> should be done carefully since it's essentially leaving resources
> under-utilized.
>
> The CS (capacity scheduler) 'charges' upfront for the whole reservation
> (8G in the example above) to ensure reservations are 'expensive': the
> larger the requested container, the larger the upfront cost of
> reserving it.
>
> hth,
> Arun
>

Re: Scheduler Questions

Posted by Arun Murthy <ac...@hortonworks.com>.

> On Sat, Nov 26, 2011 at 3:56 AM, Patrick Wendell <pw...@gmail.com> wrote:
>
>> Hey All,
>>
>> Two questions about the MR2 scheduler code, for anyone more familiar.
>>
>> - The return type of allocate() suggests that the AM will get some
>> instantaneous allocation of resources. Reading through the two current
>> schedulers, however, it looks like containers are only allocated after Node
>> heartbeats, and that allocate() will rarely return any useful allocation.
>> Is that correct or am I missing something?

Typically applications submit all resource requests up-front and then
get allocations in subsequent calls to allocate...
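
As a rough illustration of this request-then-poll pattern, here is a toy
Python model. ToyScheduler and its methods are invented for the example
and are not the YARN API. The only behaviour it tries to capture is that
allocate() records new requests and returns whatever earlier node
heartbeats have already assigned.

```python
from collections import deque

class ToyScheduler:
    """Toy model only; not the ResourceManager scheduler interface."""

    def __init__(self):
        self.pending = deque()  # outstanding container requests, in MB
        self.granted = []       # containers assigned since the last allocate()

    def allocate(self, new_requests):
        """AM call: record new asks and return grants accumulated so far."""
        self.pending.extend(new_requests)
        granted, self.granted = self.granted, []
        return granted

    def node_heartbeat(self, node, free_mb):
        """Node heartbeat: the only place containers are actually assigned."""
        while self.pending and self.pending[0] <= free_mb:
            size = self.pending.popleft()
            free_mb -= size
            self.granted.append((node, size))
        return free_mb

sched = ToyScheduler()
first = sched.allocate([1024, 1024, 2048])  # all asks submitted up front
sched.node_heartbeat("node-1", 2048)        # assignment happens here
second = sched.allocate([])                 # grants surface on a later call
print(first)   # nothing yet, no node has heartbeated
print(second)  # the two 1024 MB containers placed on node-1
```

In this model the first allocate() call returns an empty list, and the
containers only show up once a node heartbeat has run in between.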

>>
>> - The SchedulerNode class maintains state for both a single "reserved"
>> container and several "allocated" containers. As far as I can tell only the
>> capacity scheduler uses the former, but it's not clear to me what the
>> reserved container is for. Is this to enforce hard limits on allocation
>> reserved for particular users?

Reserved containers are partial containers reserved while the
scheduler waits for a larger resource request to be fulfilled. For
example, if an 8G container is required and only 4G is available,
schedulers can 'reserve' the 4G while waiting for the rest.

Reservation prevents starvation of larger resource requests, but
should be done carefully since it's essentially leaving resources
under-utilized.

The CS (capacity scheduler) 'charges' upfront for the whole reservation
(8G in the example above) to ensure reservations are 'expensive': the
larger the requested container, the larger the upfront cost of
reserving it.
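
The 8G example can be sketched in a few lines of Python. This is a toy
model under stated assumptions, not CapacityScheduler code: ToyNode,
ToyQueue and node_heartbeat are hypothetical names, a node holds at most
one reservation, and the queue is charged the full request size at
reservation time.

```python
class ToyNode:
    def __init__(self, free_gb):
        self.free_gb = free_gb
        self.reserved_gb = 0  # size of the single reserved request, 0 if none

class ToyQueue:
    def __init__(self):
        self.charged_gb = 0   # usage counted against the queue

def node_heartbeat(node, queue, request_gb):
    """Try to place request_gb on node; reserve the node if it does not fit."""
    if node.reserved_gb:
        # A reservation exists: convert it once enough memory is free.
        if node.free_gb >= node.reserved_gb:
            node.free_gb -= node.reserved_gb
            node.reserved_gb = 0
            return "allocated"  # already charged when the reservation was made
        return "waiting"        # free memory is held back for the reservation
    if node.free_gb >= request_gb:
        node.free_gb -= request_gb
        queue.charged_gb += request_gb
        return "allocated"
    node.reserved_gb = request_gb
    queue.charged_gb += request_gb  # the full 8G is charged, not just the free 4G
    return "reserved"

node, queue = ToyNode(free_gb=4), ToyQueue()
r1 = node_heartbeat(node, queue, 8)  # only 4G free, so the node is reserved
charged_after_reserve = queue.charged_gb
node.free_gb += 4                    # another container on the node finishes
r2 = node_heartbeat(node, queue, 8)  # the reservation becomes an allocation
```

While the reservation is outstanding, the node's 4G sits idle, which is
the under-utilization mentioned above.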

hth,
Arun

Re: Scheduler Questions

Posted by Praveen Sripati <pr...@gmail.com>.

The `yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms` parameter
defaults to 2000ms (DEFAULT_MR_AM_TO_RM_HEARTBEAT_INTERVAL_MS), so the
AM sends a heartbeat to the RM every 2000ms. The resource request is
piggybacked on each heartbeat, which seems to be an overhead. The AM
should instead send resource requests to the RM only as and when
required.
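
A toy Python timeline of that piggybacking, using simulated time rather
than real sleeps. The heartbeats() helper and the ask strings are
hypothetical and this is not the MR AM code; only the 2000ms interval
comes from the default quoted above.

```python
HEARTBEAT_INTERVAL_MS = 2000  # default interval quoted above

def heartbeats(asks_by_time_ms, duration_ms, interval_ms=HEARTBEAT_INTERVAL_MS):
    """Yield (time_ms, asks) per heartbeat; asks raised since the last beat ride along."""
    pending = sorted(asks_by_time_ms.items())
    for now in range(0, duration_ms + 1, interval_ms):
        due = [ask for t, ask in pending if t <= now]
        pending = [(t, ask) for t, ask in pending if t > now]
        yield now, due

# One ask is raised at start-up and another 4500ms in.
beats = list(heartbeats({0: "3 map containers", 4500: "1 reduce container"}, 8000))
for time_ms, asks in beats:
    print(time_ms, asks)
```

In this model most heartbeats carry no new ask, and the ask raised at
4500ms is not sent until the 6000ms heartbeat.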

Thanks,
Praveen
