Posted to common-dev@hadoop.apache.org by Doug Cutting <cu...@apache.org> on 2007/09/06 20:08:25 UTC

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Apache Wiki wrote:
> + Sort performance on 1400 nodes and 2000 nodes is pretty good too - sorting 14TB of data on a 1400-node cluster takes 2.2 hours; sorting 20TB on a 2000-node cluster takes 2.5 hours. The updates to the above configuration being: 
> +   * `mapred.job.tracker.handler.count = 60`
> +   * `mapred.reduce.parallel.copies = 50`
> +   * `tasktracker.http.threads = 50`

This is a pretty good indication of stuff that we might better specify 
as proportional to cluster size.  For example, we might replace the 
first with something like mapred.jobtracker.tasks.per.handler=30.  To 
determine the number of handlers we'd compute the number of task slots 
(#nodes * mapred.tasktracker.tasks.maximum) and divide that by 
tasks.per.handler.  Then folks wouldn't need to alter these settings as 
their cluster grows.
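
A rough sketch of the arithmetic (the property name, the floor value and 
the class are made up for illustration; nothing like this exists in the 
code today):

// Hypothetical sketch only: derive the JobTracker handler count from
// cluster size instead of reading mapred.job.tracker.handler.count
// directly.
public class HandlerCountHeuristic {

  public static int handlerCount(int numNodes, int tasksPerNode, int tasksPerHandler) {
    int taskSlots = numNodes * tasksPerNode;     // #nodes * mapred.tasktracker.tasks.maximum
    int handlers = taskSlots / tasksPerHandler;  // tasks.per.handler, e.g. 30
    return Math.max(handlers, 10);               // never go below a small-cluster floor
  }

  public static void main(String[] args) {
    // e.g. 2000 nodes, 2 task slots each, 30 tasks per handler
    System.out.println(handlerCount(2000, 2, 30));
  }
}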

It's best if folks don't have to change defaults for good performance. 
Not only does that simplify configuration, but it means we can more 
easily change implementations.  For example, if we switch to async RPC 
responses, then the handler count may change significantly, and we'll 
probably change the default, and it would be nice if most folks were not 
overriding the default.

Thoughts?  Should we file an issue?

Doug

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Doug Cutting wrote:

> The urgent thing, since we expect the best settings for large clusters 
> to change, is to make it so that folks don't need to adjust these 
> manually, even if the automation is an ill-understood heuristic.  I 
> think we can easily get some workable heuristics into 0.15, but we might 
> not be able to implement async responses or figure out how to adjust 
> it automatically in Server.java or whatever in that timeframe. 

+1.

Raghu.
> Perhaps 
> we should just change the defaults to be big enough for 2000 nodes, but 
> that seems like too big of a hammer.
> 
> Doug


Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Michele Catasta <mi...@deri.org>.
Hi,

> 1GB seems like a reasonable default today.  We should err on the low
> side, since that will give a better initial experience.  We could even
> try to configure some of this automatically, based on /proc/meminfo.
> (That works on Linux and Cygwin.  Does it work on OSX?)

OSX does not have procfs.

Parsing `sysctl -an hw.memsize` should be the quickest way.
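
Something along these lines should cover both the procfs and the sysctl 
cases (a rough sketch only, not wired into any existing config code; 
`sysctl -n hw.memsize` prints the bare value in bytes):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStreamReader;

// Rough sketch of cross-platform physical-memory detection: read
// /proc/meminfo on Linux/Cygwin, fall back to sysctl on OS X.
// Returns bytes, or -1 if neither probe works.
public class MemoryProbe {

  public static long physicalMemoryBytes() {
    // Linux / Cygwin: "MemTotal:  2048000 kB"
    try {
      BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"));
      try {
        String line;
        while ((line = r.readLine()) != null) {
          if (line.startsWith("MemTotal:")) {
            return Long.parseLong(line.split("\\s+")[1]) * 1024L;
          }
        }
      } finally {
        r.close();
      }
    } catch (Exception e) {
      // no procfs, e.g. OS X -- fall through to sysctl
    }

    // OS X: hw.memsize is reported directly in bytes
    try {
      Process p = Runtime.getRuntime().exec(new String[] {"sysctl", "-n", "hw.memsize"});
      BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
      try {
        return Long.parseLong(r.readLine().trim());
      } finally {
        r.close();
      }
    } catch (Exception e) {
      return -1;
    }
  }
}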

Regards,
    -Michele Catasta

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Doug Cutting <cu...@apache.org>.
Eric Baldeschwieler wrote:
> I think we should also add an available RAM variable and then do a 
> reasonable job of deriving a bunch of the other variables in these 
> settings from that (we may need one for task trackers, one for namenodes 
> and so on).

I filed an issue for this:

https://issues.apache.org/jira/browse/HADOOP-1867

I also filed an issue for simplifying cluster configuration:

https://issues.apache.org/jira/browse/HADOOP-1850

> What RAM size should we assume is a reasonable default?
> 2GB? 1GB?

1GB seems like a reasonable default today.  We should err on the low 
side, since that will give a better initial experience.  We could even 
try to configure some of this automatically, based on /proc/meminfo. 
(That works on Linux and Cygwin.  Does it work on OSX?)

Doug

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Jim Kellerman <ji...@powerset.com>.
On Fri, 2007-09-07 at 23:14 -0700, Eric Baldeschwieler wrote:
> I think we should also add an available RAM variable and then do a  
> reasonable job of deriving a bunch of the other variables in these  
> settings from that (we may need one for task trackers, one for  
> namenodes and so on).

+1

> A lot of the memory related default settings make no sense on the  
> boxes we use.
> 
> What RAM size should we assume is a reasonable default?
> 2GB? 1GB?

If you are using EC2, I think all you get is 1GB.

Our current machines are 8 core with 16GB, but we are 'Xen-ifying' them
so each instance will have 1 core with 2GB. The exception will be the
name node, especially as our cluster grows, but I am not sure how that
will be configured (maybe 4 cores and 8GB?).

> We are currently standardizing on 8.
> 
> On Sep 7, 2007, at 7:41 AM, Enis Soztutar wrote:
> 
> > Hadoop has been used in quite varying cluster sizes (in the range
> > 1-2000), so I am strongly in favor of as much automatic configuration as
> > possible.
> >
> > Doug Cutting wrote:
> > > Raghu Angadi wrote:
> > >> Right now Namenode does not know about the cluster size before
> > >> starting IPC server.
> > >
> > > Sounds like perhaps we should make the handler count, queue size,  
> > etc.
> > > dynamically adjustable, e.g., by adding Server methods for
> > > setHandlerCount(), setQueueSize(), etc.  There's been talk of trying
> > > to automatically adjust these within Server.java, based on load, and
> > > that would be better yet, but short of that, we might adjust them
> > > heuristically based on cluster size.
> > >
> > > The urgent thing, since we expect the best settings for large  
> > clusters
> > > to change, is to make it so that folks don't need to adjust these
> > > manually, even if the automation is an ill-understood heuristic.  I
> > > think we can easily get some workable heuristics into 0.15, but we
> > might not be able to implement async responses or figure out how
> > > to adjust it automatically in Server.java or whatever in that
> > > timeframe.  Perhaps we should just change the defaults to be big
> > > enough for 2000 nodes, but that seems like too big of a hammer.
> > >
> > > Doug
> > >
> >
> 
-- 
Jim Kellerman, Senior Engineer; Powerset
jim@powerset.com

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
I think we should also add an available RAM variable and then do a  
reasonable job of deriving a bunch of the other variables in these  
settings from that (we may need one for task trackers, one for  
namenodes and so on).
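
For example (the tasktracker.available.ram.mb name below is invented 
just for illustration), the derivation could be roughly:

import org.apache.hadoop.conf.Configuration;

// Illustration only: derive memory-related settings from one
// "available RAM" knob. The tasktracker.available.ram.mb property
// does not exist today.
public class MemoryDefaults {

  public static void applyDerivedDefaults(Configuration conf) {
    long ramMb = conf.getLong("tasktracker.available.ram.mb", 1024);  // assume 1GB if unset
    int taskSlots = conf.getInt("mapred.tasktracker.tasks.maximum", 2);

    // Leave some headroom for the TaskTracker daemon itself, then split
    // the rest evenly across the child task JVMs.
    long perTaskMb = Math.max(200, (ramMb - 256) / taskSlots);

    // A real version would only apply these when the user hasn't set
    // them explicitly; for the sketch we just overwrite.
    conf.set("mapred.child.java.opts", "-Xmx" + perTaskMb + "m");
    conf.set("io.sort.mb", Long.toString(perTaskMb / 2));
  }
}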

A lot of the memory related default settings make no sense on the  
boxes we use.

What RAM size should we assume is a reasonable default?
2GB? 1GB?

We are currently standardizing on 8.

On Sep 7, 2007, at 7:41 AM, Enis Soztutar wrote:

> Hadoop has been used in quite varying cluster sizes (in the range
> 1-2000), so I am strongly in favor of as much automatic configuration as
> possible.
>
> Doug Cutting wrote:
> > Raghu Angadi wrote:
> >> Right now Namenode does not know about the cluster size before
> >> starting IPC server.
> >
> > Sounds like perhaps we should make the handler count, queue size,  
> etc.
> > dynamically adjustable, e.g., by adding Server methods for
> > setHandlerCount(), setQueueSize(), etc.  There's been talk of trying
> > to automatically adjust these within Server.java, based on load, and
> > that would be better yet, but short of that, we might adjust them
> > heuristically based on cluster size.
> >
> > The urgent thing, since we expect the best settings for large  
> clusters
> > to change, is to make it so that folks don't need to adjust these
> > manually, even if the automation is an ill-understood heuristic.  I
> > think we can easily get some workable heuristics into 0.15, but we
> > might not be able to implement async responses or figure out how
> > to adjust it automatically in Server.java or whatever in that
> > timeframe.  Perhaps we should just change the defaults to be big
> > enough for 2000 nodes, but that seems like too big of a hammer.
> >
> > Doug
> >
>


Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Enis Soztutar <en...@gmail.com>.
Hadoop has been used in quite varying cluster sizes (in the range 
1-2000), so I am strongly in favor of as much automatic configuration as 
possible.

Doug Cutting wrote:
> Raghu Angadi wrote:
>> Right now Namenode does not know about the cluster size before 
>> starting IPC server.
>
> Sounds like perhaps we should make the handler count, queue size, etc. 
> dynamically adjustable, e.g., by adding Server methods for 
> setHandlerCount(), setQueueSize(), etc.  There's been talk of trying 
> to automatically adjust these within Server.java, based on load, and 
> that would be better yet, but short of that, we might adjust them 
> heuristically based on cluster size.
>
> The urgent thing, since we expect the best settings for large clusters 
> to change, is to make it so that folks don't need to adjust these 
> manually, even if the automation is an ill-understood heuristic.  I 
> think we can easily get some workable heuristics into 0.15, but we 
> might not be able to implement async responses or figure out how 
> to adjust it automatically in Server.java or whatever in that 
> timeframe.  Perhaps we should just change the defaults to be big 
> enough for 2000 nodes, but that seems like too big of a hammer.
>
> Doug
>

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Doug Cutting <cu...@apache.org>.
Raghu Angadi wrote:
> Right now Namenode does not know about the cluster size before starting 
> IPC server.

Sounds like perhaps we should make the handler count, queue size, etc. 
dynamically adjustable, e.g., by adding Server methods for 
setHandlerCount(), setQueueSize(), etc.  There's been talk of trying to 
automatically adjust these within Server.java, based on load, and that 
would be better yet, but short of that, we might adjust them 
heuristically based on cluster size.
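
As a toy illustration only (this is not the real ipc.Server, just the 
shape of the idea), a setHandlerCount() could grow or shrink the pool of 
threads draining the call queue:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy illustration (not org.apache.hadoop.ipc.Server): a call queue
// drained by handler threads whose count can be changed at runtime.
public class AdjustableServer {

  private final BlockingQueue<Runnable> callQueue = new LinkedBlockingQueue<Runnable>();
  private final List<Thread> handlers = new ArrayList<Thread>();

  /** Queue an incoming call for some handler to process. */
  public void queueCall(Runnable call) {
    callQueue.add(call);
  }

  /** Grow or shrink the handler pool to the requested size. */
  public synchronized void setHandlerCount(int count) {
    while (handlers.size() < count) {            // grow: start new handler threads
      Thread h = new Thread(new Runnable() {
        public void run() {
          try {
            while (!Thread.currentThread().isInterrupted()) {
              callQueue.take().run();            // block for the next queued call
            }
          } catch (InterruptedException e) {
            // asked to stop; fall out of the loop
          }
        }
      }, "handler-" + handlers.size());
      h.start();
      handlers.add(h);
    }
    while (handlers.size() > count) {            // shrink: interrupt surplus handlers
      handlers.remove(handlers.size() - 1).interrupt();
    }
  }
}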

The urgent thing, since we expect the best settings for large clusters 
to change, is to make it so that folks don't need to adjust these 
manually, even if the automation is an ill-understood heuristic.  I 
think we can easily get some workable heuristics into 0.15, but we might 
not be able to implement async responses or figure out how to adjust 
it automatically in Server.java or whatever in that timeframe.  Perhaps 
we should just change the defaults to be big enough for 2000 nodes, but 
that seems like too big of a hammer.

Doug

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Doug Cutting wrote:
> Raghu Angadi wrote:
>> I don't think there is an explanation of why increasing the handlers 
>> proportionally helps (it does help, but it might be a big-hammer 
>> approach). I think IPC queue length and queue management also matter a 
>> lot. I will open a Jira with a couple of thoughts/explanations/improvements 
>> regarding queue management in our IPC.
> 
> Regardless, we should try to make such proportional parameters 
> automatically adjust to cluster size. 

Agreed. I was implying that we should have at least some 
back-of-the-envelope explanations for these formulas. Yes, such 
proportional config would be very useful.

Right now Namenode does not know about the cluster size before starting 
IPC server.

Raghu.

> So instead of having parameters 
> whose values must be proportional to cluster size, we should have 
> parameters that are constants of proportionality (e.g., multiplied by 
> the cluster size to determine the runtime value).  We have a list here 
> of three such parameters and we should thus work to fix each of these to 
> be proportional to cluster size, no?  If today it's handlers and 
> tomorrow it's queues we'll be better served if folks don't have to edit 
> their config files when we make such changes.
> 
> Doug


Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Doug Cutting <cu...@apache.org>.
Raghu Angadi wrote:
> I don't think there is an explanation of why increasing the handlers 
> proportionally helps (it does help, but it might be a big-hammer 
> approach). I think IPC queue length and queue management also matter a 
> lot. I will open a Jira with a couple of thoughts/explanations/improvements 
> regarding queue management in our IPC.

Regardless, we should try to make such proportional parameters 
automatically adjust to cluster size.  So instead of having parameters 
whose values must be proportional to cluster size, we should have 
parameters that are constants of proportionality (e.g., multiplied by 
the cluster size to determine the runtime value).  We have a list here 
of three such parameters and we should thus work to fix each of these to 
be proportional to cluster size, no?  If today it's handlers and 
tomorrow it's queues we'll be better served if folks don't have to edit 
their config files when we make such changes.

Doug

Re: [Lucene-hadoop Wiki] Update of "FAQ" by DevarajDas

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Doug Cutting wrote:
> Apache Wiki wrote:
>> + Sort performance on 1400 nodes and 2000 nodes is pretty good too - 
>> sorting 14TB of data on a 1400-node cluster takes 2.2 hours; sorting 
>> 20TB on a 2000-node cluster takes 2.5 hours. The updates to the above 
>> configuration being: 
>> +   * `mapred.job.tracker.handler.count = 60`
>> +   * `mapred.reduce.parallel.copies = 50`
>> +   * `tasktracker.http.threads = 50`
> 
> This is a pretty good indication of stuff that we might better specify 
> as proportional to cluster size.  For example, we might replace the 
> first with something like mapred.jobtracker.tasks.per.handler=30.  To 
> determine the number of handlers we'd compute the number of task slots 
> (#nodes * mapred.tasktracker.tasks.maximum) and divide that by 
> tasks.per.handler.  Then folks wouldn't need to alter these settings as 
> their cluster grows.
> 
> It's best if folks don't have to change defaults for good performance. 
> Not only does that simplify configuration, but it means we can more 
> easily change implementations.  For example, if we switch to async RPC 
> responses, then the handler count may change significantly, and we'll 
> probably change the default, and it would be nice if most folks were not 
> overriding the default.
> 
> Thoughts?  Should we file an issue?

I don't think there is an explanation of why increasing the handlers 
proportionally helps (it does help, but it might be a big-hammer 
approach). I think IPC queue length and queue management also matter a 
lot. I will open a Jira with a couple of thoughts/explanations/improvements 
regarding queue management in our IPC.

Raghu.