Posted to common-user@hadoop.apache.org by Dhanasekaran Anbalagan <bu...@gmail.com> on 2013/02/21 14:16:18 UTC

Hadoop efficient resource isolation

Hi Guys,

Is it possible to isolate job submissions on a Hadoop cluster? We currently
run a 48-machine cluster, and we have observed that Hadoop does not provide
efficient resource isolation. In our case we run separate "tech" and
"research" pools. A tech job with a memory leak ended up occupying the whole
cluster; by the time we figured out that the tech job was the cause, it had
taken down the whole Hadoop cluster and 10 DataNodes were dead.

Is there a way, at job submission time or through resource allocation, to
ensure that when something goes wrong in a particular job it affects only
that pool and not other jobs? Is there any way to achieve this?

Please guide me, guys.

My idea is that when a tech user submits a job, it should run on only 24 of
the machines (in my case), and the other machines should be reserved for
research users.

That would prevent a memory leak from spreading.


-Dhanasekaran.
Did I learn something today? If not, I wasted it.

Re: Hadoop efficient resource isolation

Posted by Michel Segel <mi...@hotmail.com>.
Not sure what the question is... Have you looked at either the Fair Scheduler or, better yet, the Capacity Scheduler?
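
(As a rough sketch of the Fair Scheduler route on Hadoop 1.x / MR1, assuming the stock property names; the file path, slot counts, and job cap below are placeholders, and the pool names simply mirror the "tech" and "research" pools described above:)

  <!-- mapred-site.xml: switch the JobTracker to the Fair Scheduler -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <name>mapred.fairscheduler.allocation.file</name>
    <value>/etc/hadoop/conf/fair-scheduler.xml</value>
  </property>

  <!-- fair-scheduler.xml: cap the slots and concurrent jobs each pool may use -->
  <allocations>
    <pool name="tech">
      <maxMaps>120</maxMaps>
      <maxReduces>60</maxReduces>
      <maxRunningJobs>10</maxRunningJobs>
    </pool>
    <pool name="research">
      <maxMaps>120</maxMaps>
      <maxReduces>60</maxReduces>
    </pool>
  </allocations>

Capping a pool keeps a runaway job from grabbing every slot, but it does not bound the memory of an individual task; see the ulimit and memory settings discussed later in this thread for that.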


Sent from a remote device. Please excuse any typos...

Mike Segel

On Feb 21, 2013, at 5:16 AM, Dhanasekaran Anbalagan <bu...@gmail.com> wrote:

> Hi Guys,
> 
> Is it possible to isolate job submissions on a Hadoop cluster? We currently run a 48-machine cluster, and we have observed that Hadoop does not provide efficient resource isolation. In our case we run separate "tech" and "research" pools. A tech job with a memory leak ended up occupying the whole cluster; by the time we figured out that the tech job was the cause, it had taken down the whole Hadoop cluster and 10 DataNodes were dead.
> 
> Is there a way, at job submission time or through resource allocation, to ensure that when something goes wrong in a particular job it affects only that pool and not other jobs? Is there any way to achieve this?
> 
> Please guide me, guys.
> 
> My idea is that when a tech user submits a job, it should run on only 24 of the machines (in my case), and the other machines should be reserved for research users.
> 
> That would prevent a memory leak from spreading.
>  
> 
> -Dhanasekaran.
> Did I learn something today? If not, I wasted it.

Re: Hadoop efficient resource isolation

Posted by Arun C Murthy <ac...@hortonworks.com>.
The CapacityScheduler has features that allow a user to specify the amount of virtual memory per map/reduce task, and the TaskTracker monitors all tasks and their process trees to ensure fork bombs don't kill the node.
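
(Roughly, on Hadoop 1.x this is driven by the memory properties in mapred-site.xml, along the lines of the sketch below; the values are placeholders, and exact property names have shifted between Hadoop versions:)

  <!-- mapred-site.xml (Hadoop 1.x / MR1); values are illustrative only -->
  <property>
    <name>mapred.cluster.map.memory.mb</name>
    <value>2048</value>   <!-- virtual memory assumed per map slot -->
  </property>
  <property>
    <name>mapred.cluster.max.map.memory.mb</name>
    <value>4096</value>   <!-- upper bound a single map task may request -->
  </property>
  <property>
    <name>mapred.job.map.memory.mb</name>
    <value>2048</value>   <!-- default per-map-task request for a job -->
  </property>
  <!-- the reduce-side counterparts (mapred.cluster.reduce.memory.mb, etc.)
       work the same way; the TaskTracker kills any task whose process tree
       exceeds its limit -->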

On Feb 25, 2013, at 8:27 PM, Marcin Mejran wrote:

> That won't stop a bad job (say a fork bomb or a massive memory leak in a streaming script) from taking out a node, which is what I believe Dhanasekaran was asking about. He wants to physically isolate certain jobs to certain "non-critical" nodes. I don't believe this is possible, and data would be spread to those nodes, assuming they're DataNodes, which would still cause cluster-wide issues (and if the data is isolated, why not have two separate clusters?).
> 
> I've read references in the docs about some type of memory-based constraints in Hadoop, but I don't know the details. Does anyone know how they work?
> 
> Also, I believe there are tools in Linux that can kill processes in case of memory issues and otherwise restrict what a certain user can do. These seem like a more flexible solution, although they won't cover all potential issues.
> 
> -Marcin
> 
> On Feb 25, 2013, at 7:20 PM, "Arun C Murthy" <ac...@hortonworks.com> wrote:
> 
>> CapacityScheduler is what you want...
>> 
>> On Feb 21, 2013, at 5:16 AM, Dhanasekaran Anbalagan wrote:
>> 
>>> Hi Guys,
>>> 
>>> Is it possible to isolate job submissions on a Hadoop cluster? We currently run a 48-machine cluster, and we have observed that Hadoop does not provide efficient resource isolation. In our case we run separate "tech" and "research" pools. A tech job with a memory leak ended up occupying the whole cluster; by the time we figured out that the tech job was the cause, it had taken down the whole Hadoop cluster and 10 DataNodes were dead.
>>> 
>>> Is there a way, at job submission time or through resource allocation, to ensure that when something goes wrong in a particular job it affects only that pool and not other jobs? Is there any way to achieve this?
>>> 
>>> Please guide me, guys.
>>> 
>>> My idea is that when a tech user submits a job, it should run on only 24 of the machines (in my case), and the other machines should be reserved for research users.
>>> 
>>> That would prevent a memory leak from spreading.
>>>  
>>> 
>>> -Dhanasekaran.
>>> Did I learn something today? If not, I wasted it.
>> 
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>> 
>> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Hadoop efficient resource isolation

Posted by Marcin Mejran <ma...@hooklogic.com>.
That won't stop a bad job (say a fork bomb or a massive memory leak in a streaming script) from taking out a node, which is what I believe Dhanasekaran was asking about. He wants to physically isolate certain jobs to certain "non-critical" nodes. I don't believe this is possible, and data would be spread to those nodes, assuming they're DataNodes, which would still cause cluster-wide issues (and if the data is isolated, why not have two separate clusters?).

I've read references in the docs about some type of memory-based constraints in Hadoop, but I don't know the details. Does anyone know how they work?

Also, I believe there are tools in Linux that can kill processes in case of memory issues and otherwise restrict what a certain user can do. These seem like a more flexible solution, although they won't cover all potential issues.

-Marcin

On Feb 25, 2013, at 7:20 PM, "Arun C Murthy" <ac...@hortonworks.com> wrote:

CapacityScheduler is what you want...

On Feb 21, 2013, at 5:16 AM, Dhanasekaran Anbalagan wrote:

Hi Guys,

Is it possible to isolate job submissions on a Hadoop cluster? We currently run a 48-machine cluster, and we have observed that Hadoop does not provide efficient resource isolation. In our case we run separate "tech" and "research" pools. A tech job with a memory leak ended up occupying the whole cluster; by the time we figured out that the tech job was the cause, it had taken down the whole Hadoop cluster and 10 DataNodes were dead.

Is there a way, at job submission time or through resource allocation, to ensure that when something goes wrong in a particular job it affects only that pool and not other jobs? Is there any way to achieve this?

Please guide me, guys.

My idea is that when a tech user submits a job, it should run on only 24 of the machines (in my case), and the other machines should be reserved for research users.

That would prevent a memory leak from spreading.


-Dhanasekaran.
Did I learn something today? If not, I wasted it.

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Hadoop efficient resource isolation

Posted by Arun C Murthy <ac...@hortonworks.com>.
CapacityScheduler is what you want...
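
(A minimal sketch of what that might look like on MR1, using the two pools mentioned earlier as queue names and purely illustrative capacities:)

  <!-- mapred-site.xml: use the CapacityTaskScheduler and declare the queues -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
  </property>
  <property>
    <name>mapred.queue.names</name>
    <value>tech,research</value>
  </property>

  <!-- capacity-scheduler.xml: give each queue half of the cluster's slots -->
  <property>
    <name>mapred.capacity-scheduler.queue.tech.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.research.capacity</name>
    <value>50</value>
  </property>

Jobs then pick a queue at submission time via mapred.job.queue.name.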

On Feb 21, 2013, at 5:16 AM, Dhanasekaran Anbalagan wrote:

> Hi Guys,
> 
> Is it possible to isolate job submissions on a Hadoop cluster? We currently run a 48-machine cluster, and we have observed that Hadoop does not provide efficient resource isolation. In our case we run separate "tech" and "research" pools. A tech job with a memory leak ended up occupying the whole cluster; by the time we figured out that the tech job was the cause, it had taken down the whole Hadoop cluster and 10 DataNodes were dead.
> 
> Is there a way, at job submission time or through resource allocation, to ensure that when something goes wrong in a particular job it affects only that pool and not other jobs? Is there any way to achieve this?
> 
> Please guide me, guys.
> 
> My idea is that when a tech user submits a job, it should run on only 24 of the machines (in my case), and the other machines should be reserved for research users.
> 
> That would prevent a memory leak from spreading.
>  
> 
> -Dhanasekaran.
> Did I learn something today? If not, I wasted it.

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Hadoop efficient resource isolation

Posted by Jeffrey Buell <jb...@vmware.com>.
This is one reason to consider virtualizing Hadoop clusters. The idea is to create multiple virtual clusters on a single physical cluster and apply various kinds of resource controls (CPU, memory, I/O) on the virtual machines that make up each virtual cluster. Then if any application or VM within a virtual cluster crashes, hangs, or tries to hog resources, the other virtual clusters will be unaffected. Multi-tenancy is also enabled since the isolation between virtual clusters is secure. 

Jeff 

----- Original Message -----

From: "Hemanth Yamijala" <yh...@thoughtworks.com> 
To: user@hadoop.apache.org 
Sent: Thursday, February 21, 2013 8:51:04 AM 
Subject: Re: Hadoop efficient resource isolation 

Supporting a multiuser scenario like this is always hard under Hadoop. There are a few configuration knobs that offer some administrative control and protection. 

Specifically for the problem you describe, you could probably set mapreduce.{map|reduce}.child.ulimit on the TaskTrackers, so that any job that exceeds these limits will be killed. Of course, a side effect of this would be that jobs would be bound by some limits even if they legitimately require more memory.


But you could try starting with this. 


Thanks 
Hemanth 

On Thursday, February 21, 2013, Dhanasekaran Anbalagan wrote: 



Hi Guys, 


Is it possible to isolate job submissions on a Hadoop cluster? We currently run a 48-machine cluster, and we have observed that Hadoop does not provide efficient resource isolation. In our case we run separate "tech" and "research" pools. A tech job with a memory leak ended up occupying the whole cluster; by the time we figured out that the tech job was the cause, it had taken down the whole Hadoop cluster and 10 DataNodes were dead.

Is there a way, at job submission time or through resource allocation, to ensure that when something goes wrong in a particular job it affects only that pool and not other jobs? Is there any way to achieve this?

Please guide me, guys.

My idea is that when a tech user submits a job, it should run on only 24 of the machines (in my case), and the other machines should be reserved for research users.

That would prevent a memory leak from spreading.




-Dhanasekaran. 

Did I learn something today? If not, I wasted it. 



Re: Hadoop efficient resource isolation

Posted by Hemanth Yamijala <yh...@thoughtworks.com>.
Supporting a multiuser scenario like this is always hard under Hadoop.
There are a few configuration knobs that offer some administrative control
and protection.

Specifically for the problem you describe, you could probably set
mapreduce.{map|reduce}.child.ulimit on the TaskTrackers, so that any job
that exceeds these limits will be killed. Of course, a side effect of
this would be that jobs would be bound by some limits even if they
legitimately require more memory.

But you could try starting with this.
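
(A sketch of that suggestion for Hadoop 1.x, assuming the mapred.* spelling of the property names; the limit is the maximum virtual memory of a task's child process in kilobytes, and the values are only examples:)

  <!-- mapred-site.xml on the TaskTrackers; tasks exceeding the limit are killed.
       On older releases a single mapred.child.ulimit covers both task types. -->
  <property>
    <name>mapred.map.child.ulimit</name>
    <value>2097152</value>   <!-- roughly 2 GB per map task -->
  </property>
  <property>
    <name>mapred.reduce.child.ulimit</name>
    <value>4194304</value>   <!-- roughly 4 GB per reduce task -->
  </property>

Whatever limit you choose has to stay comfortably above the heap size set in mapred.{map|reduce}.child.java.opts, or well-behaved tasks will be killed as well.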

Thanks
Hemanth

On Thursday, February 21, 2013, Dhanasekaran Anbalagan wrote:

> Hi Guys,
>
> Is it possible to isolate job submissions on a Hadoop cluster? We currently
> run a 48-machine cluster, and we have observed that Hadoop does not provide
> efficient resource isolation. In our case we run separate "tech" and
> "research" pools. A tech job with a memory leak ended up occupying the whole
> cluster; by the time we figured out that the tech job was the cause, it had
> taken down the whole Hadoop cluster and 10 DataNodes were dead.
>
> Is there a way, at job submission time or through resource allocation, to
> ensure that when something goes wrong in a particular job it affects only
> that pool and not other jobs? Is there any way to achieve this?
>
> Please guide me, guys.
>
> My idea is that when a tech user submits a job, it should run on only 24 of
> the machines (in my case), and the other machines should be reserved for
> research users.
>
> That would prevent a memory leak from spreading.
>
>
> -Dhanasekaran.
> Did I learn something today? If not, I wasted it.
>
