Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2012/01/25 12:24:19 UTC
Understanding fair schedulers
Understanding Fair Schedulers better.
Can we create multiple pools in the Fair Scheduler? I guess yes; please
correct me if I am wrong.
Suppose I have 2 pools in my fair-scheduler.xml:
1. Hadoop-users: Min map: 10, Max map: 50, Min reduce: 10, Max reduce: 50
2. Admin-users: Min map: 20, Max map: 80, Min reduce: 20, Max reduce: 80
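For illustration, the two pools described above might be written in the allocation file roughly as follows. This is only a sketch: the pool names and slot limits are taken from the question, the element names follow the documentation sample quoted in this message, and nothing here assigns users to pools yet.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: hypothetical sketch of the two pools from the question -->
<allocations>
  <pool name="Hadoop-users">
    <minMaps>10</minMaps>
    <maxMaps>50</maxMaps>
    <minReduces>10</minReduces>
    <maxReduces>50</maxReduces>
  </pool>
  <pool name="Admin-users">
    <minMaps>20</minMaps>
    <maxMaps>80</maxMaps>
    <minReduces>20</minReduces>
    <maxReduces>80</maxReduces>
  </pool>
</allocations>
```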
I have 5 users who will be using these pools. How do I assign specific
pools to specific users?
Suppose I want user1 and user2 to use the "Hadoop-users" pool, and user3,
user4, and user5 to use the "Admin-users" pool.
The documentation at http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
shows an allocations file something like this:
<?xml version="1.0"?>
<allocations>
  <pool name="sample_pool">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
    <maxMaps>25</maxMaps>
    <maxReduces>25</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="sample_user">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
I tried creating more pools and that works, but how do I assign users to
specific pools?
Thanks,
Praveenesh
Re: Understanding fair schedulers
Posted by Srinivas Surasani <va...@gmail.com>.
Praveenesh,
You can try setting "mapred.fairscheduler.pool" to your pool name when
running the job. By default, mapred.fairscheduler.poolnameproperty is set to
user.name (each job run by a user is allocated to the pool named after that
user), and you can also change this property to group.name.
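As a rough sketch of the per-job option, the property could be expressed as a job configuration entry like the one below. The pool name "Hadoop-users" is borrowed from the original question purely as a placeholder; only the property name comes from this reply.

```xml
<!-- Job configuration entry; the pool name is a placeholder from the question -->
<property>
  <name>mapred.fairscheduler.pool</name>
  <value>Hadoop-users</value>
</property>
```

As far as the fair scheduler documentation describes it, a pool set explicitly on a job this way is used instead of the pool derived from mapred.fairscheduler.poolnameproperty.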
Srinivas --
Also, you can set
Re: Understanding fair schedulers
Posted by praveenesh kumar <pr...@gmail.com>.
Okay, got it: the pool name is the same as the group name.
Re: Understanding fair schedulers
Posted by Harsh J <ha...@cloudera.com>.
Not exactly. See, the poolnameproperty being group.name will map the
group name as a pool name. So you need to only use <pool name="ABC">
for configuring a group "ABC". Does that make sense?
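A minimal sketch of what this reply describes, assuming a group named ABC and mapred.fairscheduler.poolnameproperty set to group.name: the limits go on a pool whose name matches the group, and no separate group element appears in the file. The numeric values are illustrative only, borrowed from other messages in this thread.

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Pool name matches the group name "ABC" (assumed group) -->
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <!-- Pool-level cap on concurrently running jobs; illustrative value -->
    <maxRunningJobs>6</maxRunningJobs>
  </pool>
</allocations>
```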
--
Harsh J
Customer Ops. Engineer, Cloudera
Re: Understanding fair schedulers
Posted by praveenesh kumar <pr...@gmail.com>.
Then in that case, would I use a group name tag in the allocations file, like
this, inside each pool?
<group name="ABC">
<maxRunningJobs>6</maxRunningJobs>
</group>
Thanks,
Praveenesh
Re: Understanding fair schedulers
Posted by Harsh J <ha...@cloudera.com>.
A solution would be to place your users into groups and use the
group.name identifier as the poolnameproperty. Would this work for
you instead?
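A sketch of that setting, assuming it is placed in the JobTracker's mapred-site.xml (the file location is an assumption here; the property name and value come from this thread):

```xml
<!-- mapred-site.xml on the JobTracker (assumed location) -->
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>group.name</value>
</property>
```

With this in place, jobs from users in the same group would be expected to land in the pool named after that group, so several users can share one pool without setting anything per job.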
--
Harsh J
Customer Ops. Engineer, Cloudera
Re: Understanding fair schedulers
Posted by praveenesh kumar <pr...@gmail.com>.
Also, with the above-mentioned method, my problem is that I end up with one
pool per user (that's obviously not a good way of configuring schedulers).
How can I assign multiple users to one pool in the XML properties, so
that I don't have to set any options inside my code?
Thanks,
Praveenesh
On Wed, Jan 25, 2012 at 7:55 PM, praveenesh kumar <pr...@gmail.com>wrote:
> I am looking for a solution where we can do this permanently, without
> specifying these things inside jobs.
> I want to keep these things hidden from the end user.
> End users would just write Pig scripts, and all the jobs submitted by a
> particular user would be submitted to their respective pool automatically.
>
> What I am doing right now is something like this:
>
> <allocations>
> <pool name="ABC">
> <minMaps>10</minMaps>
> <minReduces>10</minReduces>
> <maxMaps>192</maxMaps>
> <maxReduces>96</maxReduces>
> <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> </pool>
> <user name="ABC">
>
> <maxRunningJobs>6</maxRunningJobs>
> </user>
> <userMaxJobsDefault>3</userMaxJobsDefault>
> <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>
> <pool name="XYZ">
> <minMaps>10</minMaps>
> <minReduces>10</minReduces>
> <maxMaps>192</maxMaps>
> <maxReduces>96</maxReduces>
> <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> </pool>
> <user name="XYZ">
>
> <maxRunningJobs>6</maxRunningJobs>
> </user>
> <userMaxJobsDefault>3</userMaxJobsDefault>
> <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>
> </allocations>
>
> By doing this, I am able to see different pools per user, without
> mentioning anything inside the jobs.
> Jobs automatically go to their respective pools.
>
> But what I wanted to know is: is this the right way to do it?
>
> Thanks,
> Praveenesh
>
>
>
> On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Set the property in Pig with the 'set' command or other ways:
>> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
>> http://pig.apache.org/docs/r0.9.1/start.html#properties
>>
>> As Srinivas covered earlier, pool allocation can be done per-user if
>> you set the scheduler poolnameproperty to "user.name". Per group if
>> you set the property to "group.name".
>>
>> Then you can provide per-poolname config overrides via the "pool"
>> element config described in
>>
>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
>>
>> On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <pr...@gmail.com>
>> wrote:
>> > I am running Pig jobs; how can I specify which pool they should run in?
>> > Also, do you mean that pool allocation is done per job, not per user?
Re: Understanding fair schedulers
Posted by praveenesh kumar <pr...@gmail.com>.
I am looking for a solution where we can do this permanently, without
specifying these things inside the jobs.
I want to keep these details hidden from the end user.
The end user would just write Pig scripts, and all jobs submitted by a
particular user would be submitted to that user's pool automatically.
What I am doing right now is something like this:
<allocations>
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
  <pool name="XYZ">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="XYZ">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
With this, I can see a separate pool per user without setting anything
inside the jobs.
Jobs go to their respective pools automatically.
What I want to know is: is this the right way to do it?
Thanks,
Praveenesh
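A minimal sketch of how the allocation file in this message could be tidied, keeping the placeholder pool and user names ABC and XYZ used above. It assumes that userMaxJobsDefault and fairSharePreemptionTimeout are top-level settings under allocations, so stating each once should be enough, and that mapred.fairscheduler.poolnameproperty is left at its default of user.name so that each user's jobs land in the pool with the same name.

```xml
<?xml version="1.0"?>
<allocations>
  <!-- One pool per user; each pool name must match the submitting user name -->
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <pool name="XYZ">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>

  <!-- Per-user running job limits -->
  <user name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <user name="XYZ">
    <maxRunningJobs>6</maxRunningJobs>
  </user>

  <!-- Global defaults, stated once for the whole file -->
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```

The values are unchanged from the message; only the repeated global elements are collapsed.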
Re: Understanding fair schedulers
Posted by Harsh J <ha...@cloudera.com>.
Set the property in Pig with the 'set' command or other ways:
http://pig.apache.org/docs/r0.9.1/cmds.html#set or
http://pig.apache.org/docs/r0.9.1/start.html#properties
As Srinivas covered earlier, pool allocation can be done per user if
you set the scheduler's poolnameproperty
(mapred.fairscheduler.poolnameproperty) to "user.name", or per group if
you set it to "group.name".
Then you can provide per-poolname config overrides via the "pool"
element config described in
http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
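As a rough sketch of the per-group approach described here, the allocation file could define one pool per group, using the limits from the original question. This assumes that mapred.fairscheduler.poolnameproperty is set to group.name, that group.name is actually populated for submitted jobs on your Hadoop version, and that the Unix groups hadoop-users and admin-users exist (both group names are hypothetical).

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Pool for members of the hypothetical Unix group "hadoop-users"
       (for example user1 and user2) -->
  <pool name="hadoop-users">
    <minMaps>10</minMaps>
    <maxMaps>50</maxMaps>
    <minReduces>10</minReduces>
    <maxReduces>50</maxReduces>
  </pool>

  <!-- Pool for members of the hypothetical Unix group "admin-users"
       (for example user3, user4 and user5) -->
  <pool name="admin-users">
    <minMaps>20</minMaps>
    <maxMaps>80</maxMaps>
    <minReduces>20</minReduces>
    <maxReduces>80</maxReduces>
  </pool>
</allocations>
```

With this layout, end users would not need to set anything in their Pig scripts, since the pool is derived from the group rather than from a per-job property.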
--
Harsh J
Customer Ops. Engineer, Cloudera
Re: Understanding fair schedulers
Posted by praveenesh kumar <pr...@gmail.com>.
I am running Pig jobs; how can I specify which pool they should run in?
Also, do you mean that pool allocation is done per job, not per user?
Re: Understanding fair schedulers
Posted by Srinivas Surasani <va...@gmail.com>.
Praveenesh,
You can try setting "mapred.fairscheduler.pool" to your pool name when
running the job. By default, mapred.fairscheduler.poolnameproperty is set to
user.name (each job run by a user is allocated to the pool named after that
user), and you can also change this property to group.name.
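For illustration, a sketch of how the poolnameproperty switch might look in the JobTracker's configuration. The file name mapred-site.xml and the choice of group.name as the value are assumptions for this example; the property name is the one discussed above.

```xml
<?xml version="1.0"?>
<!-- Sketch of a mapred-site.xml fragment for the JobTracker (assumed location) -->
<configuration>
  <property>
    <!-- Job property whose value is used as the pool name -->
    <name>mapred.fairscheduler.poolnameproperty</name>
    <!-- Default is user.name (one pool per user); group.name gives one pool per group -->
    <value>group.name</value>
  </property>
</configuration>
```

Setting mapred.fairscheduler.pool on an individual job remains the per-job alternative mentioned above.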
Srinivas --
Also, you can set