Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2012/01/25 12:24:19 UTC

Understanding fair schedulers

Understanding Fair Schedulers better.

Can we create multiple pools in the Fair Scheduler? I guess yes. Please
correct me if I am wrong.

Suppose I have 2 pools in my fair-scheduler.xml:

1. Hadoop-users: Min map: 10, Max map: 50, Min reduce: 10, Max reduce: 50
2. Admin-users: Min map: 20, Max map: 80, Min reduce: 20, Max reduce: 80

I have 5 users who will be using these pools. How do I assign specific
pools to specific users?

Suppose I want user1 and user2 to use the "Hadoop-users" pool and user3,
user4 and user5 to use the "Admin-users" pool.

The documentation at
http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
shows an allocations file like this:

<?xml version="1.0"?>
<allocations>
  <pool name="sample_pool">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
    <maxMaps>25</maxMaps>
    <maxReduces>25</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="sample_user">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>

I tried creating more pools and that works, but how do I assign users to
specific pools?

Thanks,
Praveenesh

Re: Understanding fair schedulers

Posted by Srinivas Surasani <va...@gmail.com>.

Praveenesh,

You can try setting "mapred.fairscheduler.pool" to your pool name when
running the job. By default, mapred.fairscheduler.poolnameproperty is set to
user.name (each job run by a user is allocated to the pool named after that
user), and you can also change this property to group.name.
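
A minimal sketch of the per-job route, assuming the job is launched through
a driver that accepts Hadoop's generic -conf option. The file name
pool-override.xml is hypothetical, and the pool name is taken from the
original question:

<?xml version="1.0"?>
<!-- pool-override.xml (hypothetical file name): settings for a single job -->
<configuration>
  <property>
    <!-- Places the job in the named pool, whoever submits it -->
    <name>mapred.fairscheduler.pool</name>
    <value>Hadoop-users</value>
  </property>
</configuration>

When this property is set on a job, it should take precedence over the pool
derived from mapred.fairscheduler.poolnameproperty, so it is an explicit
per-job override rather than a per-user mapping.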

Srinivas --

Also, you can set


Re: Understanding fair schedulers

Posted by praveenesh kumar <pr...@gmail.com>.

okie got it.. same pool name.. as group name...


Re: Understanding fair schedulers

Posted by Harsh J <ha...@cloudera.com>.

Not exactly. With poolnameproperty set to group.name, the group name is
used as the pool name. So you only need to use <pool name="ABC"> to
configure the pool for a group "ABC". Does that make sense?
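
As a rough sketch of what that could look like for the two pools from the
original question, assuming poolnameproperty is set to group.name and that
Unix groups named Hadoop-users and Admin-users exist (the group names are
hypothetical; the limits are the ones quoted earlier in the thread):

<?xml version="1.0"?>
<allocations>
  <!-- Each pool name has to match the group name of the submitting user -->
  <pool name="Hadoop-users">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>50</maxMaps>
    <maxReduces>50</maxReduces>
  </pool>
  <pool name="Admin-users">
    <minMaps>20</minMaps>
    <minReduces>20</minReduces>
    <maxMaps>80</maxMaps>
    <maxReduces>80</maxReduces>
  </pool>
</allocations>

Under that assumption, user1 and user2 would land in the first pool simply
by being members of the Hadoop-users group; no separate group element
appears in the allocations file.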




-- 
Harsh J
Customer Ops. Engineer, Cloudera

Re: Understanding fair schedulers

Posted by praveenesh kumar <pr...@gmail.com>.

Then in that case, will I be using a group name tag in the allocations file,
like this, inside each pool?

<group name="ABC">
  <maxRunningJobs>6</maxRunningJobs>
</group>

Thanks,
Praveenesh


Re: Understanding fair schedulers

Posted by Harsh J <ha...@cloudera.com>.

A solution would be to place your users into groups and use group.name
as the poolnameproperty. Would this work for you instead?
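
A minimal sketch of that setting, assuming it goes into the JobTracker's
mapred-site.xml alongside the existing fair scheduler configuration (the
rest of the file is omitted here):

<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Jobconf property whose value is used as the pool name.
         The default is user.name, which gives one pool per user. -->
    <name>mapred.fairscheduler.poolnameproperty</name>
    <value>group.name</value>
  </property>
</configuration>

With this in place, jobs that do not set mapred.fairscheduler.pool
explicitly should be assigned to a pool named after the submitting user's
group, so end users would not need to set anything in their Pig scripts.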




-- 
Harsh J
Customer Ops. Engineer, Cloudera

Re: Understanding fair schedulers

Posted by praveenesh kumar <pr...@gmail.com>.

Also, with the above method, my problem is that I end up with one pool
per user (that's obviously not a good way of configuring schedulers).
How can I allocate multiple users to one pool in the XML properties, so
that I don't have to pass any options inside my code?

Thanks,
Praveenesh

On Wed, Jan 25, 2012 at 7:55 PM, praveenesh kumar <pr...@gmail.com> wrote:

> I am looking for a solution where we can do it permanently without
> specifying these things inside jobs.
> I want to keep these things hidden from the end user.
> The end user would just write Pig scripts, and all the jobs submitted by a
> particular user would be submitted to their respective pool automatically.
>
> What I am doing right now is something like this:
>
>  <allocations>
>   <pool name="ABC">
>     <minMaps>10</minMaps>
>     <minReduces>10</minReduces>
>     <maxMaps>192</maxMaps>
>     <maxReduces>96</maxReduces>
>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>   </pool>
>   <user name="ABC">
>     <maxRunningJobs>6</maxRunningJobs>
>   </user>
>
>   <pool name="XYZ">
>     <minMaps>10</minMaps>
>     <minReduces>10</minReduces>
>     <maxMaps>192</maxMaps>
>     <maxReduces>96</maxReduces>
>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>   </pool>
>   <user name="XYZ">
>     <maxRunningJobs>6</maxRunningJobs>
>   </user>
>
>   <!-- Global defaults: declared once for the whole file -->
>   <userMaxJobsDefault>3</userMaxJobsDefault>
>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> </allocations>
>
> By doing this, I am able to see a different pool per user, without
> mentioning anything inside the jobs.
> Jobs are going to their respective pools automatically.
>
> But what I wanted to know is: is this the right way to do it?
>
> Thanks,
> Praveenesh
>
>
>
> On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Set the property in Pig with the 'set' command or other ways:
>> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
>> http://pig.apache.org/docs/r0.9.1/start.html#properties
>>
>> As Srinivas covered earlier, pool allocation can be done per-user if
>> you set the scheduler poolnameproperty to "user.name". Per group if
>> you set the property to "group.name".
>>
>> Then you can provide per-poolname config overrides via the "pool"
>> element config described in
>>
>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
>>
>> On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <pr...@gmail.com>
>> wrote:
>> > I am running Pig jobs. How can I specify which pool they should run in?
>> > Also, do you mean that pool allocation is done per job, not per user?
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>> Customer Ops. Engineer, Cloudera
>>
>
>

Re: Understanding fair schedulers

Posted by praveenesh kumar <pr...@gmail.com>.

I am looking for a solution where this can be configured permanently, without
specifying these things inside the jobs.
I want to keep these settings hidden from the end user.
The end user would just write Pig scripts, and all jobs submitted by a
particular user would be submitted to that user's pool automatically.

What I am doing right now is something like this:

<allocations>
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>

  <pool name="XYZ">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="XYZ">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>

</allocations>

With this, I can see a separate pool for each user without
setting anything inside the jobs.
Jobs automatically go to their respective pools.

But what I want to know is: is this the right way to do it?
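
For comparison, here is a minimal sketch of the same allocation file with the
file-wide elements listed only once. It assumes the default pool naming by
user.name, so that pools ABC and XYZ are used by the users of the same name.
userMaxJobsDefault and fairSharePreemptionTimeout are top-level settings of
the file rather than per-pool ones, so a single copy of each should be enough.

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Pool picked up by jobs from user ABC (assumes pools are named by user.name) -->
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <!-- Pool picked up by jobs from user XYZ -->
  <pool name="XYZ">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <!-- Per-user limits on concurrently running jobs -->
  <user name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <user name="XYZ">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <!-- File-wide defaults: declared once, not once per pool -->
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```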

Thanks,
Praveenesh


On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <ha...@cloudera.com> wrote:

> Set the property in Pig with the 'set' command or other ways:
> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
> http://pig.apache.org/docs/r0.9.1/start.html#properties
>
> As Srinivas covered earlier, pool allocation can be done per-user if
> you set the scheduler poolnameproperty to "user.name". Per group if
> you set the property to "group.name".
>
> Then you can provide per-poolname config overrides via the "pool"
> element config described in
>
> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
>

Re: Understanding fair schedulers

Posted by Harsh J <ha...@cloudera.com>.

Set the property in Pig with the 'set' command or other ways:
http://pig.apache.org/docs/r0.9.1/cmds.html#set or
http://pig.apache.org/docs/r0.9.1/start.html#properties

As Srinivas covered earlier, pool allocation can be done per user if
you set the scheduler's mapred.fairscheduler.poolnameproperty to
"user.name", or per group if you set it to "group.name".

Then you can provide per-poolname config overrides via the "pool"
element config described in
http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
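
A minimal sketch of the scheduler-side setting, assuming it is placed in
mapred-site.xml on the JobTracker (the file location and the chosen value are
illustrative, not taken from any cluster in this thread):

```xml
<!-- mapred-site.xml on the JobTracker (assumed location) -->
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <!-- user.name: one pool per submitting user.
       group.name: one pool per Unix group instead. -->
  <value>user.name</value>
</property>
```

Since user.name is the documented default, this entry only makes the choice
explicit; switching the value to group.name is what would change behaviour.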

On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <pr...@gmail.com> wrote:
> I am running Pig jobs; how can I specify which pool they should run in?
> Also, do you mean that pool allocation is done per job, not per user?



-- 
Harsh J
Customer Ops. Engineer, Cloudera

Re: Understanding fair schedulers

Posted by praveenesh kumar <pr...@gmail.com>.

I am running Pig jobs; how can I specify which pool they should run in?
Also, do you mean that pool allocation is done per job, not per user?


On Wed, Jan 25, 2012 at 6:14 PM, Srinivas Surasani <va...@gmail.com> wrote:

> Praveenesh,
>
> You can try setting "mapred.fairscheduler.pool" to your pool name when
> running the job. By default, mapred.fairscheduler.poolnameproperty is set to
> user.name (each job run by a user is allocated to a pool named after that
> user), and you can also change this property to group.name.
>
> Srinivas --
>
> Also, you can set

Re: Understanding fair schedulers

Posted by Srinivas Surasani <va...@gmail.com>.

Praveenesh,

You can try setting "mapred.fairscheduler.pool" to your pool name when
running the job. By default, mapred.fairscheduler.poolnameproperty is set to
user.name (each job run by a user is allocated to a pool named after that
user), and you can also change this property to group.name.
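
As a rough sketch, the per-job setting could look like this as a Hadoop
configuration property. The pool name "Hadoop-users" is taken from the
question below; how the property reaches the job (for example a
configuration file supplied at submit time) depends on the client and is an
assumption here.

```xml
<!-- Job-level configuration (delivery mechanism is an assumption) -->
<property>
  <name>mapred.fairscheduler.pool</name>
  <!-- Must match a pool name known to the scheduler, e.g. one from fair-scheduler.xml -->
  <value>Hadoop-users</value>
</property>
```

Per the Fair Scheduler documentation, when this property is set explicitly
on a job, the poolnameproperty lookup is not used for that job.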

Srinivas --

Also, you can set

On Wed, Jan 25, 2012 at 6:24 AM, praveenesh kumar <pr...@gmail.com> wrote:

> Understanding Fair Schedulers better.
>
> Can we create multiple pools in the Fair Scheduler? I guess yes. Please
> correct me.
>
> Suppose I have 2 pools in my fair-scheduler.xml
>
> 1. Hadoop-users : Min map : 10, Max map : 50, Min Reduce : 10, Max Reduce :
> 50
> 2. Admin-users: Min map : 20, Max map : 80, Min Reduce : 20, Max Reduce :
> 80
>
> I have 5 users, who will be using these pools. How will I allocate specific
> pools to specific users ?
>
> Suppose I want user1,user2 to use "Hadoop-users" pool and user3,user4,user5
> to use "Admin users"
>
> In http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
> they have mentioned allocations something like this.
>
> <?xml version="1.0"?>
> <allocations>
>  <pool name="sample_pool">
>    <minMaps>5</minMaps>
>    <minReduces>5</minReduces>
>    <maxMaps>25</maxMaps>
>    <maxReduces>25</maxReduces>
>    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>  </pool>
>  <user name="sample_user">
>    <maxRunningJobs>6</maxRunningJobs>
>  </user>
>  <userMaxJobsDefault>3</userMaxJobsDefault>
>  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> </allocations>
>
> I tried creating more pools and it works, but how do I assign users to
> specific pools?
>
> Thanks,
> Praveenesh
>