Posted to mapreduce-user@hadoop.apache.org by ricky l <ri...@gmail.com> on 2014/01/09 17:48:08 UTC
expressing job anti-affinity in Yarn.
Hi all,
Is it possible to express job anti-affinity in YARN-based Hadoop? I
have a job that is very IO-intensive, and I want to spread the tasks across
all available machines. With the default YARN RM scheduler, it seems many
tasks are scheduled on one machine while others are idle.
thanks.
RE: expressing job anti-affinity in Yarn.
Posted by German Florez-Larrahondo <ge...@samsung.com>.
Ted
You could try the FairScheduler as well. See the comment I made a few
hours ago on the same subject:
From: German Florez-Larrahondo [mailto:german.fl@samsung.com]
Sent: Thursday, January 09, 2014 8:23 AM
To: user@hadoop.apache.org
Subject: RE: Distributing the code to multiple nodes
Ashish
Could this be related to the scheduler you are using and its settings?
In lab environments, when running a single type of job, I often use
FairScheduler (the YARN default in 2.2.0 is CapacityScheduler) and it does a
good job of distributing the load.
You could give that a try
(https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html)
I think just changing yarn-site.xml as follows could test this theory
(note that how jobs are scheduled depends on resources such as memory on
the nodes, so you would need to set up yarn-site.xml accordingly).
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
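To double-check which scheduler a given yarn-site.xml actually selects, a small Python sketch like the one below might help. It assumes the usual <configuration>/<property> layout of Hadoop config files; the function name scheduler_class and the SAMPLE string are made up for illustration and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Property name taken from the snippet above.
SCHEDULER_KEY = "yarn.resourcemanager.scheduler.class"


def scheduler_class(yarn_site_xml):
    """Return the scheduler class set in the given XML text, or None if unset."""
    root = ET.fromstring(yarn_site_xml)
    # iter() walks the whole tree, so nesting depth does not matter.
    for prop in root.iter("property"):
        name = (prop.findtext("name", default="") or "").strip()
        if name == SCHEDULER_KEY:
            return (prop.findtext("value", default="") or "").strip()
    return None


# Hypothetical file content, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
</configuration>"""

print(scheduler_class(SAMPLE))
```

For a real cluster you would read the text from wherever your yarn-site.xml lives instead of using SAMPLE. A result of None only means the file does not set the property, in which case the release default applies.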
Regards
./g
From: Ted Yu [mailto:yuzhihong@gmail.com]
Sent: Thursday, January 09, 2014 11:00 AM
To: common-user@hadoop.apache.org
Subject: Re: expressing job anti-affinity in Yarn.
See:
YARN-1042 add ability to specify affinity/anti-affinity in container
requests
On Thu, Jan 9, 2014 at 8:48 AM, ricky l <ri...@gmail.com> wrote:
Hi all,
Is it possible to express job anti-affinity in YARN-based Hadoop? I
have a job that is very IO-intensive, and I want to spread the tasks across
all available machines. With the default YARN RM scheduler, it seems many
tasks are scheduled on one machine while others are idle.
thanks.
Re: expressing job anti-affinity in Yarn.
Posted by ricky l <ri...@gmail.com>.
Thanks. I checked the issue and it still seems to be "unresolved". Can I
assume the feature is not supported yet?
On Thu, Jan 9, 2014 at 12:00 PM, Ted Yu <yu...@gmail.com> wrote:
> See:
> YARN-1042 add ability to specify affinity/anti-affinity in container
> requests
>
>
> On Thu, Jan 9, 2014 at 8:48 AM, ricky l <ri...@gmail.com> wrote:
>
>> Hi all,
>>
>> Is it possible to express job anti-affinity in YARN-based Hadoop?
>> I have a job that is very IO-intensive, and I want to spread the tasks
>> across all available machines. With the default YARN RM scheduler, it seems
>> many tasks are scheduled on one machine while others are idle.
>>
>> thanks.
>>
>
>
Re: expressing job anti-affinity in Yarn.
Posted by Ted Yu <yu...@gmail.com>.
See:
YARN-1042 add ability to specify affinity/anti-affinity in container
requests
On Thu, Jan 9, 2014 at 8:48 AM, ricky l <ri...@gmail.com> wrote:
> Hi all,
>
> Is it possible to express job anti-affinity in YARN-based Hadoop?
> I have a job that is very IO-intensive, and I want to spread the tasks
> across all available machines. With the default YARN RM scheduler, it seems
> many tasks are scheduled on one machine while others are idle.
>
> thanks.
>