Posted to common-user@hadoop.apache.org by Chris Anderson <jc...@grabb.it> on 2008/06/26 03:59:53 UTC
process limits for streaming jar
Hi there,
I'm running some streaming jobs on ec2 (ruby parsing scripts) and in
my most recent test I managed to spike the load on my large instances
to 25 or so. As a result, I lost communication with one instance. I
think I took down sshd. Whoops.
My question is, has anyone got strategies for managing resources used
by the processes spawned by streaming jar? Ideally I'd like to run my
ruby scripts under nice.
I can hack something together with wrappers, but I'm thinking there
might be a configuration option to handle this within Streaming jar.
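For concreteness, the wrapper hack I have in mind would look something like this - nice_wrap.sh and map.rb are made-up names, and the ulimit value is just an example:

```shell
# Create a small wrapper (hypothetical name) that runs its arguments
# under the lowest CPU priority and a per-process virtual-memory cap.
cat > nice_wrap.sh <<'EOF'
#!/bin/sh
ulimit -v 1048576      # cap virtual memory at ~1 GB (the value is in KB)
exec nice -n 19 "$@"   # lowest priority; exec avoids a lingering shell
EOF
chmod +x nice_wrap.sh

# Streaming would then be pointed at the wrapper, e.g.:
#   -mapper "./nice_wrap.sh ./map.rb"
# Quick local check that stdin/stdout pass through untouched:
echo hello | ./nice_wrap.sh cat   # prints: hello
```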
Thanks for any suggestions!
--
Chris Anderson
http://jchris.mfdz.com
Re: process limits for streaming jar
Posted by Rick Cox <ri...@gmail.com>.
On Fri, Jun 27, 2008 at 08:57, Chris Anderson <jc...@grabb.it> wrote:
> The problem is that when there are a large number of map tasks to
> complete, Hadoop doesn't seem to obey the map.tasks.maximum. Instead,
> it is spawning 8 map tasks per tasktracker (even when I change the
> mapred.tasktracker.map.tasks.maximum in hadoop-site.xml to 2, on the
> master). The cluster was booted with the setting at 8. Do I need to
> change hadoop-site.xml on all the slaves, and restart the task
> trackers, in order to make the limit apply? That seems unlikely - I'd
> really like to manage this parameter on a per-job level.
>
Yes, mapred.tasktracker.map.tasks.maximum is configured per
tasktracker on startup. It can't be configured per job because it's
not a job-scope parameter (if there are multiple concurrent jobs, they
have to share the task limit).
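Concretely, the tracker-side setting lives in each slave's hadoop-site.xml, along these lines (a sketch - paths vary by install, and the tracker only reads it at startup):

```xml
<!-- conf/hadoop-site.xml on each slave; read once at tasktracker startup -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
```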
rick
Re: process limits for streaming jar
Posted by Chris Anderson <jc...@grabb.it>.
Having experimented some more, I've found that the simple solution is
to limit the resource usage by limiting the # of map tasks and the
memory they are allowed to consume.
I'm specifying the constraints on the command line like this:
-jobconf mapred.tasktracker.map.tasks.maximum=2 mapred.child.ulimit=1048576
The configuration parameters seem to take effect; in the job.xml
available from the web console, I see these lines:
mapred.child.ulimit 1048576
mapred.tasktracker.map.tasks.maximum 2
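(One thing I'm not sure about: the archive may have collapsed two flags
onto one line above - as I understand it, -jobconf normally takes a
single key=value per flag, so the split form would be the following.
The jar path, input/output, and script names are placeholders:)

```shell
hadoop jar hadoop-streaming.jar \
  -jobconf mapred.tasktracker.map.tasks.maximum=2 \
  -jobconf mapred.child.ulimit=1048576 \
  -input input/ -output output/ \
  -mapper ./map.rb
```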
The problem is that when there are a large number of map tasks to
complete, Hadoop doesn't seem to obey the map.tasks.maximum. Instead,
it is spawning 8 map tasks per tasktracker (even when I change the
mapred.tasktracker.map.tasks.maximum in hadoop-site.xml to 2, on the
master). The cluster was booted with the setting at 8. Do I need to
change hadoop-site.xml on all the slaves, and restart the task
trackers, in order to make the limit apply? That seems unlikely - I'd
really like to manage this parameter on a per-job level.
Thanks for any input!
Chris
--
Chris Anderson
http://jchris.mfdz.com
Re: process limits for streaming jar
Posted by Vinod KV <vi...@yahoo-inc.com>.
Allen Wittenauer wrote:
> This is essentially what we're doing via torque (and therefore hod).
>
When we move away from HOD (and thus Torque) to the Hadoop resource
manager (HADOOP-3421) and scheduler (HADOOP-3412) interfaces, we will
need to move this resource-management functionality into Hadoop.
> As I said in HADOOP-3280, I'm still convinced that having Hadoop set
> these types of restrictions directly is the wrong approach. It is almost
> always going to be better to use the OS controls that are specific to the
> installation. Enabling OS-specific features means that, at most, Hadoop
> should likely be calling a script rather than doing ulimits or having
> the equivalent of #ifdef code everywhere. [After all, what if I want
> to use Solaris-specific features like projects or privileges?]
>
> But I suspect most of the tunables that people will care about can
> likely be managed at the OS level before hadoop is even involved.
Please see HADOOP-3581(Prevent memory intensive user tasks from taking
down nodes). This issue is aimed at a general solution for putting
aggregate memory limits on the tasks and any subprocesses that tasks
might launch.
Your comments don't seem to be in line with what I proposed on the
issue's JIRA. Having Hadoop just call a script, rather than doing
ulimits itself, does look nice, but with ulimits alone Hadoop cannot
have complete control over what the tasks do. For example, as stated
on the JIRA, runaway tasks that fork themselves repeatedly could wreak
havoc and disturb the normal functioning of not only the Hadoop
daemons, but might even bring down the nodes themselves.
The current solution, HADOOP-3280, doesn't prevent this: it only
limits the memory usable by a single process, not its subprocesses.
Nor will ulimits via limits.conf suffice - as far as I can tell, that
just limits vmem per-process, per-user.
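A toy illustration of the per-process point, in plain sh (nothing
Hadoop-specific):

```shell
# A per-process vmem ulimit is inherited by each forked child
# individually; nothing bounds the aggregate across the process tree.
ulimit -v 1048576        # ~1 GB per process (the value is in KB)
a=$(sh -c 'ulimit -v')   # first child reports its own cap
b=$(sh -c 'ulimit -v')   # second child gets the same cap again
echo "$a $b"             # two children, each free to use up to 1 GB
```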
Not to imitate Torque too much, but even Torque resorts to taking
control of the process tree itself rather than depending directly on
OS-specific tools - this is done with specific code for each platform.
We need a consensus on all of this, and I might be missing something
too. Can you please comment on HADOOP-3581?
Thanks,
-Vinod
Re: process limits for streaming jar
Posted by Allen Wittenauer <aw...@yahoo-inc.com>.
On 6/26/08 10:14 AM, "Joydeep Sen Sarma" <js...@facebook.com> wrote:
> However - in our environment - we spawn all streaming tasks through a
> wrapper program (with the wrapper being defaulted across all users). We
> can control resource uses from this wrapper (and also do some level of
> job control).
This is essentially what we're doing via torque (and therefore hod).
> This does require some minor code changes, and we would be happy to
> contribute them - but the approach doesn't quite fit with the one in
> HADOOP-2765, which passes each of these resource limits as a Hadoop
> config param.
As I said in HADOOP-3280, I'm still convinced that having Hadoop set
these types of restrictions directly is the wrong approach. It is almost
always going to be better to use the OS controls that are specific to the
installation. Enabling OS-specific features means that, at most, Hadoop
should likely be calling a script rather than doing ulimits or having
the equivalent of #ifdef code everywhere. [After all, what if I want
to use Solaris-specific features like projects or privileges?]
But I suspect most of the tunables that people will care about can
likely be managed at the OS level before hadoop is even involved.
RE: process limits for streaming jar
Posted by Joydeep Sen Sarma <js...@facebook.com>.
Memory limits were handled in this jira:
https://issues.apache.org/jira/browse/HADOOP-2765
However - in our environment - we spawn all streaming tasks through a
wrapper program (with the wrapper being defaulted across all users). We
can control resource uses from this wrapper (and also do some level of
job control).
This does require some minor code changes, and we would be happy to
contribute them - but the approach doesn't quite fit with the one in
HADOOP-2765, which passes each of these resource limits as a Hadoop
config param.
-----Original Message-----
From: jchris@gmail.com [mailto:jchris@gmail.com] On Behalf Of Chris
Anderson
Sent: Wednesday, June 25, 2008 7:00 PM
To: core-user@hadoop.apache.org
Subject: process limits for streaming jar
Hi there,
I'm running some streaming jobs on ec2 (ruby parsing scripts) and in
my most recent test I managed to spike the load on my large instances
to 25 or so. As a result, I lost communication with one instance. I
think I took down sshd. Whoops.
My question is, has anyone got strategies for managing resources used
by the processes spawned by streaming jar? Ideally I'd like to run my
ruby scripts under nice.
I can hack something together with wrappers, but I'm thinking there
might be a configuration option to handle this within Streaming jar.
Thanks for any suggestions!
--
Chris Anderson
http://jchris.mfdz.com