Posted to mapreduce-user@hadoop.apache.org by ricky l <ri...@gmail.com> on 2014/01/09 17:48:08 UTC

expressing job anti-affinity in Yarn.

Hi all,

Is it possible to express job anti-affinity in YARN-based Hadoop? I have a
job that is very IO-intensive, and I want to spread its tasks across all
available machines. With the default YARN RM scheduler, it seems many tasks
are scheduled on one machine while other machines sit idle.

thanks.

RE: expressing job anti-affinity in Yarn.

Posted by German Florez-Larrahondo <ge...@samsung.com>.
Ted

You could try the FairScheduler as well. See a comment I made a few
hours ago on the same subject:

From: German Florez-Larrahondo [mailto:german.fl@samsung.com] 
Sent: Thursday, January 09, 2014 8:23 AM
To: user@hadoop.apache.org
Subject: RE: Distributing the code to multiple nodes

Ashish

Could this be related to the scheduler you are using and its settings?

In lab environments, when running a single type of job, I often use
FairScheduler (the YARN default in 2.2.0 is CapacityScheduler), and it does a
good job of distributing the load.

You could give that a try
(https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html)

I think just changing yarn-site.xml as follows could demonstrate this
theory (note that how the jobs are scheduled depends on resources such as
memory on the nodes, and you would need to set up yarn-site.xml accordingly).

<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
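
To make the note about node memory concrete, here is a minimal sketch of the
related yarn-site.xml entries. The property names are standard YARN settings,
but the values shown are illustrative assumptions only, not recommendations;
they would need to be sized to the RAM actually available on each node.

```xml
<!-- Hypothetical values: adjust to the memory really available per node. -->

<!-- Total memory a NodeManager may hand out to containers on its node. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>

<!-- Smallest container size the ResourceManager will allocate. -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
```

Roughly speaking, the less memory a node advertises relative to the container
size, the fewer containers fit on it, so tasks are more likely to land on
other nodes.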

 

Regards

./g

From: Ted Yu [mailto:yuzhihong@gmail.com] 
Sent: Thursday, January 09, 2014 11:00 AM
To: common-user@hadoop.apache.org
Subject: Re: expressing job anti-affinity in Yarn.

See:
YARN-1042 add ability to specify affinity/anti-affinity in container
requests

On Thu, Jan 9, 2014 at 8:48 AM, ricky l <ri...@gmail.com> wrote:

> Hi all,
>
> Is it possible to express job anti-affinity in YARN-based Hadoop? I have a
> job that is very IO-intensive, and I want to spread its tasks across all
> available machines. With the default YARN RM scheduler, it seems many tasks
> are scheduled on one machine while other machines sit idle.
>
> thanks.


Re: expressing job anti-affinity in Yarn.

Posted by ricky l <ri...@gmail.com>.
Thanks. I checked the issue, and it still seems to be "unresolved". Can I
assume the feature is not supported yet?


On Thu, Jan 9, 2014 at 12:00 PM, Ted Yu <yu...@gmail.com> wrote:

> See:
> YARN-1042 add ability to specify affinity/anti-affinity in container
> requests
>
>
> On Thu, Jan 9, 2014 at 8:48 AM, ricky l <ri...@gmail.com> wrote:
>
>> Hi all,
>>
>> Is it possible to express job anti-affinity in YARN-based Hadoop? I have a
>> job that is very IO-intensive, and I want to spread its tasks across all
>> available machines. With the default YARN RM scheduler, it seems many tasks
>> are scheduled on one machine while other machines sit idle.
>>
>> thanks.
>>
>
>

Re: expressing job anti-affinity in Yarn.

Posted by Ted Yu <yu...@gmail.com>.
See:
YARN-1042 add ability to specify affinity/anti-affinity in container
requests


On Thu, Jan 9, 2014 at 8:48 AM, ricky l <ri...@gmail.com> wrote:

> Hi all,
>
> Is it possible to express job anti-affinity in YARN-based Hadoop? I have a
> job that is very IO-intensive, and I want to spread its tasks across all
> available machines. With the default YARN RM scheduler, it seems many tasks
> are scheduled on one machine while other machines sit idle.
>
> thanks.
>
