Posted to yarn-dev@hadoop.apache.org by Yuzhang Han <yu...@gmail.com> on 2013/06/13 17:41:51 UTC

Container size configuration

Hi,

I am wondering if I can allocate different size of containers to the 
tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2 
= 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.

Yuzhang

Re: Container size configuration

Posted by Sandy Ryza <sa...@cloudera.com>.
You mean different memory sizes for different reducers?  If you're willing
to mess with the MR Application Master code, you should be able to do it
there.  Check out RMContainerAllocator.java and RMContainerRequestor.java.

-Sandy


On Thu, Jun 13, 2013 at 9:56 AM, Yuzhang Han <yu...@gmail.com> wrote:

>  Thank you Sandy.
>
> I am trying to do some experiments with MapReduce on YARN, where some
> reducers get larger processing tasks than the other reducers. I want to
> see how much the performance improves if I allocate more memory to the
> larger-task reducers.
>
> Anyway, if I want to manually configure the memory size for a particular
> reducer, how can I do it? In the ApplicationMaster? Or would this not work
> at all?
>
> Thank you.
>
>
>
>
>
> On 6/13/2013 12:47 PM, Sandy Ryza wrote:
>
> Hi Yuzhang,
>
>  Moving this question to the Hadoop user list.
>
>  Are you using MapReduce or writing your own YARN application?  In
> MapReduce, all maps must request the same amount of memory and all reduces
> must request the same amount of memory.  It would be trivial to do this in
> your own YARN application.
>
>  -Sandy
>
>
> On Thu, Jun 13, 2013 at 8:41 AM, Yuzhang Han <yu...@gmail.com> wrote:
>
>> Hi,
>>
>> I am wondering if I can allocate different size of containers to the
>> tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2 =
>> 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
>>
>> Yuzhang
>>
>
>
>

Re: Container size configuration

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Yuzhang,

Moving this question to the Hadoop user list.

Are you using MapReduce or writing your own YARN application?  In
MapReduce, all maps must request the same amount of memory and all reduces
must request the same amount of memory.  It would be trivial to do this in
your own YARN application.

-Sandy


On Thu, Jun 13, 2013 at 8:41 AM, Yuzhang Han <yu...@gmail.com> wrote:

> Hi,
>
> I am wondering if I can allocate different size of containers to the tasks
> in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2 = 1024MB,
> Task3 = 2048MB. How can I achieve this? Many thanks.
>
> Yuzhang
>

Re: Container size configuration

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Glad to hear we have not introduced any regression.

Thanks Bobby.


On Wed, Jun 19, 2013 at 8:31 AM, Robert Evans <ev...@yahoo-inc.com> wrote:

> Sorry I am a bit behind on some of the changes that have happened to the
> scheduler.  And I guess I am also behind on what has happened to the MR
> AM.  I just looked at the MR AM code and it goes only off of priority when
> assigning containers.  It also does a sanity check that the memory
> allocated is large enough to meet the needs of the given task.
>
> So I am happy to say I was wrong on this one :)
>
> Sorry I wasted your time, and thanks for the help.
>
> --Bobby
>
> On 6/18/13 12:59 PM, "Alejandro Abdelnur" <tu...@cloudera.com> wrote:
>
> >Bobby,
> >
> >With MAPREDUCE-5310 we removed normalization of resource request on the
> >MRAM side. This was done because the normalization is an implementation
> >detail of the RM scheduler.
> >
> >IMO, if this is a problem for the MRAM as you suggest, then we should fix
> >the MRAM logic.
> >
> >Note this may happen only if the MR job specifies memory requirements for
> >its tasks that do not match their normalized values.
> >
> >Thanks.
> >
> >
> >
> >On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <ev...@yahoo-inc.com>
> >wrote:
> >
> >> Even returning an over sized container can be very confusing for an
> >> application.  The MR AM will not handle it correctly.  If it sees a
> >> container returned that does not match exactly the priority and size it
> >> expects, I believe that container is thrown away.  We had deadlocks in
> >> the past where it somehow used a reducer container for a mapper and then
> >> never updated the reducer count to request a new one.  It is best for now
> >> to not mix the two, and we need to lock down/fix the semantics of what
> >> happens in those situations for a scheduler.
> >>
> >> --Bobby
> >>
> >> On 6/18/13 12:13 AM, "Bikas Saha" <bi...@hortonworks.com> wrote:
> >>
> >> >I think the API allows different size requests at the same priority.
> >> >The implementation of the scheduler drops the size information and uses
> >> >the last value set. We should probably at least change it to use the
> >> >largest value used so that users don't get containers that are too small
> >> >for them.
> >> >YARN-847 tracks this.
> >> >
> >> >Bikas
> >> >
> >> >-----Original Message-----
> >> >From: Robert Evans [mailto:evans@yahoo-inc.com]
> >> >Sent: Friday, June 14, 2013 7:09 AM
> >> >To: yarn-dev@hadoop.apache.org
> >> >Subject: Re: Container size configuration
> >> >
> >> >Is this specifically for YARN?  If so, yes you can do this, MR does
> >> >this for Maps vs Reduces.  The API right now requires that the different
> >> >sized containers have a different priority.
> >> >
> >> >http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> >> >
> >> >Shows how to make a resource request. It also shows how to make an
> >> >AllocateRequest.  If you put multiple ResourceRequests into the
> >> >AllocateRequest it will allocate both of them.  But remember that the
> >> >priority needs to be different, and the priority determines the order
> >> >in which the containers will be allocated to your application.
> >> >
> >> >--Bobby
> >> >
> >> >On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:
> >> >
> >> >>Hi,
> >> >>
> >> >>I am wondering if I can allocate different size of containers to the
> >> >>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 =
> >> >>Task2 = 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
> >> >>
> >> >>Yuzhang
> >>
> >>
> >
> >
> >--
> >Alejandro
>
>


-- 
Alejandro

RE: Container size configuration

Posted by Bikas Saha <bi...@hortonworks.com>.
That's correct. The previous behavior of the MR AM was tightly coupled to
the scheduler impl and thus fragile. The RM is supposed to never give a
container less than was requested, because that would be incorrect. It can
always give a container more than was requested, based on its internal
heuristics. Ideally that should not happen, to prevent internal
fragmentation.

Alejandro, after MAPREDUCE-5310 did we check that the MR AM works
correctly after making the M/R memory different from the normalized
values?

Bikas

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
Sent: Tuesday, June 18, 2013 10:59 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: Container size configuration

Bobby,

With MAPREDUCE-5310 we removed normalization of resource request on the
MRAM side. This was done because the normalization is an implementation
detail of the RM scheduler.

IMO, if this is a problem for the MRAM as you suggest, then we should fix
the MRAM logic.

Note this may happen only if the MR job specifies memory requirements for
its tasks that do not match their normalized values.

Thanks.



On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <ev...@yahoo-inc.com>
wrote:

> Even returning an over sized container can be very confusing for an
> application.  The MR AM will not handle it correctly.  If it sees a
> container returned that does not match exactly the priority and size
> it expects, I believe that container is thrown away.  We had deadlocks
> in the past where it somehow used a reducer container for a mapper and
> then never updated the reducer count to request a new one.  It is best
> for now to not mix the two, and we need to lock down/fix the semantics
> of what happens in those situations for a scheduler.
>
> --Bobby
>
> On 6/18/13 12:13 AM, "Bikas Saha" <bi...@hortonworks.com> wrote:
>
> >I think the API allows different size requests at the same priority.
> >The implementation of the scheduler drops the size information and
> >uses the last value set. We should probably at least change it to use
> >the largest value used so that users don't get containers that are too
> >small for them.
> >YARN-847 tracks this.
> >
> >Bikas
> >
> >-----Original Message-----
> >From: Robert Evans [mailto:evans@yahoo-inc.com]
> >Sent: Friday, June 14, 2013 7:09 AM
> >To: yarn-dev@hadoop.apache.org
> >Subject: Re: Container size configuration
> >
> >Is this specifically for YARN?  If so, yes you can do this, MR does
> >this for Maps vs Reduces.  The API right now requires that the
> >different sized containers have a different priority.
> >
> >http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> >
> >Shows how to make a resource request. It also shows how to make an
> >AllocateRequest.  If you put multiple ResourceRequests into the
> >AllocateRequest it will allocate both of them.  But remember that
> >the priority needs to be different, and the priority determines
> >the order in which the containers will be allocated to your
> >application.
> >
> >--Bobby
> >
> >On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:
> >
> >>Hi,
> >>
> >>I am wondering if I can allocate different size of containers to the
> >>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 =
> >>Task2 = 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
> >>
> >>Yuzhang
>
>


--
Alejandro

Re: Container size configuration

Posted by Arun C Murthy <ac...@hortonworks.com>.
Bobby - MR AM also checks the priority of the allocated container… so we should be safe.

On Jun 19, 2013, at 8:31 AM, Robert Evans <ev...@yahoo-inc.com> wrote:

> Sorry I am a bit behind on some of the changes that have happened to the
> scheduler.  And I guess I am also behind on what has happened to the MR
> AM.  I just looked at the MR AM code and it goes only off of priority when
> assigning containers.  It also does a sanity check that the memory
> allocated is large enough to meet the needs of the given task.
> 
> So I am happy to say I was wrong on this one :)
> 
> Sorry I wasted your time, and thanks for the help.
> 
> --Bobby
> 
> On 6/18/13 12:59 PM, "Alejandro Abdelnur" <tu...@cloudera.com> wrote:
> 
>> Bobby,
>> 
>> With MAPREDUCE-5310 we removed normalization of resource request on the
>> MRAM side. This was done because the normalization is an implementation
>> detail of the RM scheduler.
>> 
>> IMO, if this is a problem for the MRAM as you suggest, then we should fix
>> the MRAM logic.
>> 
>> Note this may happen only if the MR job specifies memory requirements for
>> its tasks that do not match their normalized values.
>> 
>> Thanks.
>> 
>> 
>> 
>> On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <ev...@yahoo-inc.com>
>> wrote:
>> 
>>> Even returning an over sized container can be very confusing for an
>>> application.  The MR AM will not handle it correctly.  If it sees a
>>> container returned that does not match exactly the priority and size it
>>> expects, I believe that container is thrown away.  We had deadlocks in
>>> the past where it somehow used a reducer container for a mapper and then
>>> never updated the reducer count to request a new one.  It is best for now
>>> to not mix the two, and we need to lock down/fix the semantics of what
>>> happens in those situations for a scheduler.
>>> 
>>> --Bobby
>>> 
>>> On 6/18/13 12:13 AM, "Bikas Saha" <bi...@hortonworks.com> wrote:
>>> 
>>>> I think the API allows different size requests at the same priority.
>>>> The implementation of the scheduler drops the size information and uses
>>>> the last value set. We should probably at least change it to use the
>>>> largest value used so that users don't get containers that are too small
>>>> for them.
>>>> YARN-847 tracks this.
>>>> 
>>>> Bikas
>>>> 
>>>> -----Original Message-----
>>>> From: Robert Evans [mailto:evans@yahoo-inc.com]
>>>> Sent: Friday, June 14, 2013 7:09 AM
>>>> To: yarn-dev@hadoop.apache.org
>>>> Subject: Re: Container size configuration
>>>> 
>>>> Is this specifically for YARN?  If so, yes you can do this, MR does
>>>> this for Maps vs Reduces.  The API right now requires that the different
>>>> sized containers have a different priority.
>>>> 
>>>> http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>>>> 
>>>> Shows how to make a resource request. It also shows how to make an
>>>> AllocateRequest.  If you put multiple ResourceRequests into the
>>>> AllocateRequest it will allocate both of them.  But remember that the
>>>> priority needs to be different, and the priority determines the order
>>>> in which the containers will be allocated to your application.
>>>> 
>>>> --Bobby
>>>> 
>>>> On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am wondering if I can allocate different size of containers to the
>>>>> tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 =
>>>>> Task2 = 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
>>>>> 
>>>>> Yuzhang
>>> 
>>> 
>> 
>> 
>> -- 
>> Alejandro
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Container size configuration

Posted by Robert Evans <ev...@yahoo-inc.com>.
Sorry I am a bit behind on some of the changes that have happened to the
scheduler.  And I guess I am also behind on what has happened to the MR
AM.  I just looked at the MR AM code and it goes only off of priority when
assigning containers.  It also does a sanity check that the memory
allocated is large enough to meet the needs of the given task.
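
The behavior described above (match an allocated container to a task type by priority alone, then check that it has enough memory) can be sketched in plain Java. This is only an illustration under stated assumptions: the class name, the priority constants, and the assign method are hypothetical and are not taken from the MR AM source.

```java
import java.util.Optional;

/** Hypothetical sketch: pick a task type from the container priority only. */
final class PriorityAssignSketch {
    // Priority values chosen for this sketch only (assumption).
    static final int REDUCE_PRIORITY = 10;
    static final int MAP_PRIORITY = 20;

    /**
     * Returns the task type the container would be used for, or empty if the
     * priority is unknown or the container is too small for that task type.
     * An oversized container is accepted; only "too small" is rejected.
     */
    static Optional<String> assign(int containerPriority, int containerMemoryMb,
                                   int mapMemoryMb, int reduceMemoryMb) {
        final String taskType;
        final int neededMb;
        if (containerPriority == MAP_PRIORITY) {
            taskType = "map";
            neededMb = mapMemoryMb;
        } else if (containerPriority == REDUCE_PRIORITY) {
            taskType = "reduce";
            neededMb = reduceMemoryMb;
        } else {
            return Optional.empty();
        }
        // Sanity check: the allocated memory must cover the task's needs.
        if (containerMemoryMb >= neededMb) {
            return Optional.of(taskType);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A 2048 MB container at reduce priority, with reduces needing 1536 MB.
        System.out.println(assign(REDUCE_PRIORITY, 2048, 1024, 1536));
        // A 512 MB container at map priority, with maps needing 1024 MB.
        System.out.println(assign(MAP_PRIORITY, 512, 1024, 1536));
    }
}
```

In this sketch the container size is never compared for equality, which is why a container larger than requested does not cause a mismatch.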

So I am happy to say I was wrong on this one :)

Sorry I wasted your time, and thanks for the help.

--Bobby

On 6/18/13 12:59 PM, "Alejandro Abdelnur" <tu...@cloudera.com> wrote:

>Bobby,
>
>With MAPREDUCE-5310 we removed normalization of resource request on the
>MRAM side. This was done because the normalization is an implementation
>detail of the RM scheduler.
>
>IMO, if this is a problem for the MRAM as you suggest, then we should fix
>the MRAM logic.
>
>Note this may happen only if the MR job specifies memory requirements for
>its tasks that do not match their normalized values.
>
>Thanks.
>
>
>
>On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <ev...@yahoo-inc.com>
>wrote:
>
>> Even returning an over sized container can be very confusing for an
>> application.  The MR AM will not handle it correctly.  If it sees a
>> container returned that does not match exactly the priority and size it
>> expects, I believe that container is thrown away.  We had deadlocks in
>> the past where it somehow used a reducer container for a mapper and then
>> never updated the reducer count to request a new one.  It is best for now
>> to not mix the two, and we need to lock down/fix the semantics of what
>> happens in those situations for a scheduler.
>>
>> --Bobby
>>
>> On 6/18/13 12:13 AM, "Bikas Saha" <bi...@hortonworks.com> wrote:
>>
>> >I think the API allows different size requests at the same priority.
>> >The implementation of the scheduler drops the size information and uses
>> >the last value set. We should probably at least change it to use the
>> >largest value used so that users don't get containers that are too small
>> >for them.
>> >YARN-847 tracks this.
>> >
>> >Bikas
>> >
>> >-----Original Message-----
>> >From: Robert Evans [mailto:evans@yahoo-inc.com]
>> >Sent: Friday, June 14, 2013 7:09 AM
>> >To: yarn-dev@hadoop.apache.org
>> >Subject: Re: Container size configuration
>> >
>> >Is this specifically for YARN?  If so, yes you can do this, MR does
>> >this for Maps vs Reduces.  The API right now requires that the different
>> >sized containers have a different priority.
>> >
>> >http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>> >
>> >Shows how to make a resource request. It also shows how to make an
>> >AllocateRequest.  If you put multiple ResourceRequests into the
>> >AllocateRequest it will allocate both of them.  But remember that the
>> >priority needs to be different, and the priority determines the order
>> >in which the containers will be allocated to your application.
>> >
>> >--Bobby
>> >
>> >On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:
>> >
>> >>Hi,
>> >>
>> >>I am wondering if I can allocate different size of containers to the
>> >>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 =
>> >>Task2 = 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
>> >>
>> >>Yuzhang
>>
>>
>
>
>-- 
>Alejandro


Re: Container size configuration

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Bobby,

With MAPREDUCE-5310 we removed normalization of resource request on the
MRAM side. This was done because the normalization is an implementation
detail of the RM scheduler.

IMO, if this is a problem for the MRAM as you suggest, then we should fix
the MRAM logic.

Note this may happen only if the MR job specifies memory requirements for
its tasks that do not match their normalized values.
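
To make "normalized value" concrete, here is a minimal sketch of one possible normalization rule: round the requested memory up to the next multiple of a minimum allocation and cap it at a maximum. The rule, the method name, and the example limits (1024 MB and 8192 MB) are assumptions for illustration; the actual rule is an implementation detail of whichever RM scheduler is in use.

```java
/** Hypothetical sketch of scheduler-side memory normalization. */
final class NormalizeSketch {

    /**
     * Rounds requestedMb up to the next multiple of minAllocationMb,
     * never returning less than minAllocationMb or more than maxAllocationMb.
     */
    static int normalize(int requestedMb, int minAllocationMb, int maxAllocationMb) {
        int atLeastMin = Math.max(requestedMb, minAllocationMb);
        // Integer ceiling division, then scale back to a multiple of the minimum.
        int roundedUp = ((atLeastMin + minAllocationMb - 1) / minAllocationMb) * minAllocationMb;
        return Math.min(roundedUp, maxAllocationMb);
    }

    public static void main(String[] args) {
        // A 1536 MB task does not match its normalized value under these limits.
        System.out.println("1536 -> " + normalize(1536, 1024, 8192));
        // A 1024 MB task already matches.
        System.out.println("1024 -> " + normalize(1024, 1024, 8192));
    }
}
```

Under these assumed limits, a job that asks for 1536 MB per task would receive a larger container than it requested, which is the case the note above is about.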

Thanks.



On Tue, Jun 18, 2013 at 10:45 AM, Robert Evans <ev...@yahoo-inc.com> wrote:

> Even returning an over sized container can be very confusing for an
> application.  The MR AM will not handle it correctly.  If it sees a
> container returned that does not match exactly the priority and size it
> expects, I believe that container is thrown away.  We had deadlocks in the
> past where it somehow used a reducer container for a mapper and then never
> updated the reducer count to request a new one.  It is best for now to not
> mix the two, and we need to lock down/fix the semantics of what happens in
> those situations for a scheduler.
>
> --Bobby
>
> On 6/18/13 12:13 AM, "Bikas Saha" <bi...@hortonworks.com> wrote:
>
> >I think the API allows different size requests at the same priority. The
> >implementation of the scheduler drops the size information and uses the
> >last value set. We should probably at least change it to use the largest
> >value used so that users don't get containers that are too small for them.
> >YARN-847 tracks this.
> >
> >Bikas
> >
> >-----Original Message-----
> >From: Robert Evans [mailto:evans@yahoo-inc.com]
> >Sent: Friday, June 14, 2013 7:09 AM
> >To: yarn-dev@hadoop.apache.org
> >Subject: Re: Container size configuration
> >
> >Is this specifically for YARN?  If so, yes you can do this, MR does this
> >for Maps vs Reduces.  The API right now requires that the different sized
> >containers have a different priority.
> >
> >
> >http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> >
> >Shows how to make a resource request. It also shows how to make an
> >AllocateRequest.  If you put multiple ResourceRequests into the
> >AllocateRequest it will allocate both of them.  But remember that the
> >priority needs to be different, and the priority determines the order in
> >which the containers will be allocated to your application.
> >
> >--Bobby
> >
> >On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:
> >
> >>Hi,
> >>
> >>I am wondering if I can allocate different size of containers to the
> >>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2
> >>= 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
> >>
> >>Yuzhang
>
>


-- 
Alejandro

Re: Container size configuration

Posted by Robert Evans <ev...@yahoo-inc.com>.
Even returning an over sized container can be very confusing for an
application.  The MR AM will not handle it correctly.  If it sees a
container returned that does not match exactly the priority and size it
expects, I believe that container is thrown away.  We had deadlocks in the
past where it somehow used a reducer container for a mapper and then never
updated the reducer count to request a new one.  It is best for now to not
mix the two, and we need to lock down/fix the semantics of what happens in
those situations for a scheduler.

--Bobby

On 6/18/13 12:13 AM, "Bikas Saha" <bi...@hortonworks.com> wrote:

>I think the API allows different size requests at the same priority. The
>implementation of the scheduler drops the size information and uses the
>last value set. We should probably at least change it to use the largest
>value used so that users don't get containers that are too small for them.
>YARN-847 tracks this.
>
>Bikas
>
>-----Original Message-----
>From: Robert Evans [mailto:evans@yahoo-inc.com]
>Sent: Friday, June 14, 2013 7:09 AM
>To: yarn-dev@hadoop.apache.org
>Subject: Re: Container size configuration
>
>Is this specifically for YARN?  If so, yes you can do this, MR does this
>for Maps vs Reduces.  The API right now requires that the different sized
>containers have a different priority.
>
>http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>
>Shows how to make a resource request. It also shows how to make an
>AllocateRequest.  If you put multiple ResourceRequests into the
>AllocateRequest it will allocate both of them.  But remember that the
>priority needs to be different, and the priority determines the order in
>which the containers will be allocated to your application.
>
>--Bobby
>
>On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:
>
>>Hi,
>>
>>I am wondering if I can allocate different size of containers to the
>>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2
>>= 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
>>
>>Yuzhang


RE: Container size configuration

Posted by Bikas Saha <bi...@hortonworks.com>.
I think the API allows different size requests at the same priority. The
implementation of the scheduler drops the size information and uses the
last value set. We should probably at least change it to use the largest
value used so that users don't get containers that are too small for them.
YARN-847 tracks this.
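
The difference between "uses the last value set" and "use the largest value used" can be shown with a small plain-Java model. This is a sketch under assumptions, not scheduler code: the class and method names are hypothetical, and each request is modeled as a {priority, memoryMb} pair.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of tracking one container size per priority. */
final class PerPrioritySizeSketch {

    /** A later request at the same priority overwrites the earlier size. */
    static Map<Integer, Integer> keepLast(int[][] requests) {
        Map<Integer, Integer> sizeByPriority = new HashMap<>();
        for (int[] r : requests) {
            sizeByPriority.put(r[0], r[1]);
        }
        return sizeByPriority;
    }

    /** The largest size seen at a priority wins, regardless of order. */
    static Map<Integer, Integer> keepLargest(int[][] requests) {
        Map<Integer, Integer> sizeByPriority = new HashMap<>();
        for (int[] r : requests) {
            sizeByPriority.merge(r[0], r[1], Integer::max);
        }
        return sizeByPriority;
    }

    public static void main(String[] args) {
        // Two requests at priority 1: 2048 MB first, then 1024 MB.
        int[][] requests = {{1, 2048}, {1, 1024}};
        System.out.println("last value:    " + keepLast(requests).get(1) + " MB");
        System.out.println("largest value: " + keepLargest(requests).get(1) + " MB");
    }
}
```

With the "last value" rule, the task that asked for 2048 MB would be sized at 1024 MB in this model, which is the "too small" case the message warns about.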

Bikas

-----Original Message-----
From: Robert Evans [mailto:evans@yahoo-inc.com]
Sent: Friday, June 14, 2013 7:09 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: Container size configuration

Is this specifically for YARN?  If so, yes you can do this, MR does this
for Maps vs Reduces.  The API right now requires that the different sized
containers have a different priority.

http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

Shows how to make a resource request. It also shows how to make an
AllocateRequest.  If you put multiple ResourceRequests into the
AllocateRequest it will allocate both of them.  But remember that the
priority needs to be different, and the priority determines the order in
which the containers will be allocated to your application.

--Bobby

On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:

>Hi,
>
>I am wondering if I can allocate different size of containers to the
>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2
>= 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
>
>Yuzhang

Re: Container size configuration

Posted by Robert Evans <ev...@yahoo-inc.com>.
Is this specifically for YARN?  If so, yes you can do this, MR does this
for Maps vs Reduces.  The API right now requires that the different sized
containers have a different priority.

http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

Shows how to make a resource request. It also shows how to make an
AllocateRequest.  If you put multiple ResourceRequests into the
AllocateRequest it will allocate both of them.  But remember that the
priority needs to be different, and the priority determines the order in
which the containers will be allocated to your application.
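
As a rough illustration of "one request per container size, each at a different priority", here is a plain-Java sketch with no Hadoop dependency. SketchRequest is a hypothetical stand-in, not the real YARN ResourceRequest class, and the priority numbers are just distinct labels assigned in first-seen order; how a real scheduler orders priorities is not modeled here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Groups task sizes into one request per size, each at its own priority. */
final class DistinctPrioritySketch {

    /** Hypothetical stand-in for a resource request; NOT the real YARN class. */
    static final class SketchRequest {
        final int priority;
        final int memoryMb;
        final int numContainers;

        SketchRequest(int priority, int memoryMb, int numContainers) {
            this.priority = priority;
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    static List<SketchRequest> buildRequests(int[] taskMemoryMb) {
        // Count tasks per container size, keeping first-seen order.
        Map<Integer, Integer> countBySize = new LinkedHashMap<>();
        for (int mb : taskMemoryMb) {
            countBySize.merge(mb, 1, Integer::sum);
        }
        // Give every distinct size its own priority value (0, 1, 2, ...).
        List<SketchRequest> requests = new ArrayList<>();
        int nextPriority = 0;
        for (Map.Entry<Integer, Integer> e : countBySize.entrySet()) {
            requests.add(new SketchRequest(nextPriority++, e.getKey(), e.getValue()));
        }
        return requests;
    }

    public static void main(String[] args) {
        // Task1 = Task2 = 1024MB, Task3 = 2048MB, as in the original question.
        for (SketchRequest r : buildRequests(new int[] {1024, 1024, 2048})) {
            System.out.println("priority=" + r.priority
                    + " memoryMb=" + r.memoryMb
                    + " containers=" + r.numContainers);
        }
    }
}
```

In a real application the two resulting requests would be translated into actual ResourceRequests and sent together in one AllocateRequest, as the linked guide describes.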

--Bobby

On 6/13/13 10:41 AM, "Yuzhang Han" <yu...@gmail.com> wrote:

>Hi,
>
>I am wondering if I can allocate different size of containers to the
>tasks in a job. For example: Job = <Task1, Task2, Task3>, Task1 = Task2
>= 1024MB, Task3 = 2048MB. How can I achieve this? Many thanks.
>
>Yuzhang