Posted to common-user@hadoop.apache.org by Todd Lipcon <to...@cloudera.com> on 2010/04/09 17:34:31 UTC

Re: Network problems Hadoop 0.20.2 and Terasort on Debian 2.6.32 kernel

On Fri, Apr 9, 2010 at 8:18 AM, stephen mulcahy <st...@deri.org> wrote:

> Allen Wittenauer wrote:
>
>> On Apr 8, 2010, at 9:37 AM, stephen mulcahy wrote:
>>
>>> When I run this on the Debian 2.6.32 kernel - over the course of the run,
>>> 1 or 2 datanodes of the cluster enter a state whereby they are no longer
>>> responsive to network traffic.
>>>
>>
>> How much free memory do you have?
>>
>
> Lots, a few GB
>
>
>
>> How many tasks per node do you have?
>>
>
> I left this at the default.
>
>
>
>> What are the service times, etc, on your IO system?
>>
>
> Can you clarify this query?
>
>
>
>>> Has anyone run into similar problems with their environments? I noticed
>>> that when the nodes become unresponsive, it often happens when the
>>> TeraSort is at
>>>
>>
>> I've always seen Linux nodes go unresponsive when they get memory starved
>> to the point that the OOM killer can't function because it can't allocate
>> enough mem.
>>
>
> Sure, but I can log in to the unresponsive nodes via the console - it's just
> the network that has become unresponsive. To be clear here, I don't suspect
> Hadoop is the root cause of the problem - I suspect either a kernel bug or
> some other operating system level bug. I was wondering if others had run
> into similar problems.
>

Most likely a kernel bug. In previous versions of Debian there was a buggy
forcedeth driver, for example, that caused it to drop off the network under
high load. Who knows what new bugs are in 2.6.32, which is brand spanking new.
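
For anyone who wants to check which driver their own nodes are using, here is
a minimal sketch, assuming a Linux node that exposes the usual sysfs layout
under /sys/class/net. The function name nic_drivers and its sysfs_net
parameter are made up for illustration; nothing here comes from Hadoop or
from the netdev discussion.

```python
import os


def nic_drivers(sysfs_net="/sys/class/net"):
    """Map each network interface to the kernel driver bound to it.

    Assumption: the standard sysfs layout, where
    <sysfs_net>/<iface>/device/driver is a symlink whose target ends in
    the driver name. Entries without that link (for example "lo") are
    reported with a driver of None.
    """
    drivers = {}
    for iface in sorted(os.listdir(sysfs_net)):
        link = os.path.join(sysfs_net, iface, "device", "driver")
        if os.path.islink(link):
            # The last path component of the link target is the driver name.
            drivers[iface] = os.path.basename(os.readlink(link))
        else:
            drivers[iface] = None
    return drivers
```

Calling nic_drivers() with no arguments on a datanode should return a dict of
interface name to driver name; an interface whose value is "forcedeth" would
be using the driver mentioned above.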


>
> I was also wondering in general what kernel versions and distros people are
> using, especially for larger production clusters.
>
>
The overwhelming majority of production clusters run on RHEL 5.3 or RHEL 5.4
in my experience (I'm lumping CentOS 5.3/5.4 in with RHEL here). I know one
or two production clusters running Debian Lenny, but none running something
as new as what you're talking about. Hadoop doesn't exercise the new
features in very recent kernels, so there's no sense accepting instability -
just go with something old that works!

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Network problems Hadoop 0.20.2 and Terasort on Debian 2.6.32 kernel

Posted by stephen mulcahy <st...@deri.org>.

Steve Loughran wrote:
> Tom White is planning to split off a Hadoop 0.21 branch from SVN_TRUNK 
> at the end of the month, so if you still want to do some cluster 
> testing, he'd be grateful to have that tested on Debian too.

If I have a testing window available I'd be happy to.

> #of HDDs/server will be a factor too, and no, I don't know how to 
> predict it.

We have 4 SATA HDDs/server - haven't done much tuning to the config though.

-stephen

-- 
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com

Re: Network problems Hadoop 0.20.2 and Terasort on Debian 2.6.32 kernel

Posted by Steve Loughran <st...@apache.org>.

Todd Lipcon wrote:
> On Tue, Apr 13, 2010 at 4:13 AM, stephen mulcahy
> <st...@deri.org> wrote:

>> Sure, but I figured I'd go with a distro now that can be largely left
>> untouched for the next 2-3 years and Debian lenny felt that bit old for
>> that. I know RHEL/CentOS would fit that requirement also, will see. I'm also
>> interested in using DRBD in some of our nodes for redundancy, again, running
>> with a newer distro should reduce the pain of configuring that.
>>
>> Finally, I figured burning in our cluster was a good opportunity to give
>> back to the community and do some testing on their behalf.
>>
> 
> Very admirable of you :) It is good to have some people running new kernels
> to suss these issues out before the rest of us check out modern technology
> ;-)

Tom White is planning to split off a Hadoop 0.21 branch from SVN_TRUNK 
at the end of the month, so if you still want to do some cluster 
testing, he'd be grateful to have that tested on Debian too.

> 
> 
>> With regard to our TeraSort benchmark time of ~23 minutes - is that in the
>> right ballpark for a cluster of 45 data nodes and a nn and 2nn?

#of HDDs/server will be a factor too, and no, I don't know how to 
predict it.

Re: Network problems Hadoop 0.20.2 and Terasort on Debian 2.6.32 kernel

Posted by stephen mulcahy <st...@deri.org>.

Todd Lipcon wrote:
>> Yes, it looks like it is a kernel bug alright (see thread on kernel netdev
>> at http://marc.info/?t=127094288900001&r=1&w=2 if interested). To be fair,
>> I don't think these bugs are confined to Debian - I did some initial testing
>> with Scientific Linux and also ran into problems with forcedeth.
> 
> 
> Interesting, good find. I try to avoid forcedeth now and have heard the same
> from ops people at various large Linux deployments. Not sure why, but it's
> traditionally had a lot of bugs/regressions.

FYI, the netdev guys have proposed a patch and initial testing indicates 
it fixes the problem (and brings the TeraSort down to about 18 minutes, 
so win win :)

I share similar feelings about forcedeth, particularly after this, but 
then I'm also dubious about at least some Broadcom chipsets, and even 
Intel have had their issues 
(https://bugzilla.kernel.org/show_bug.cgi?id=11382), so maybe it's just 
that all NICs suck.

>> Finally, I figured burning in our cluster was a good opportunity to give
>> back to the community and do some testing on their behalf.
> 
> Very admirable of you :) It is good to have some people running new kernels
> to suss these issues out before the rest of us check out modern technology
> ;-)

It also means there aren't problems lurking for us in the future when we 
get forced to newer kernels for support/maintenance issues. I also ran 
into http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=556030 while 
testing a 2.6.30 kernel. That bug may be lurking in older kernels too (and 
seems to have been fixed in 2.6.32), so there are perils both to staying 
back and to going forward.

>> With regard to our TeraSort benchmark time of ~23 minutes - is that in the
>> right ballpark for a cluster of 45 data nodes and a nn and 2nn?
>>
>>
> Yep, sounds about the right ballpark.

Cool, thanks for the feedback. I'm surprised that others didn't comment 
on the TeraSort result - perhaps others use something else for 
smoke-testing/benchmarking their Hadoop clusters? If so, anyone want to 
suggest what they do use? It'd be nice to see a collection of TeraSort 
results somewhere to get an idea of what cluster configs work well and 
for people who want to sanity check a new cluster.

-stephen

-- 
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com

Re: Network problems Hadoop 0.20.2 and Terasort on Debian 2.6.32 kernel

Posted by Todd Lipcon <to...@cloudera.com>.

On Tue, Apr 13, 2010 at 4:13 AM, stephen mulcahy
<st...@deri.org> wrote:

> Todd Lipcon wrote:
>
>> Most likely a kernel bug. In previous versions of Debian there was a buggy
>> forcedeth driver, for example, that caused it to drop off the network
>> under high load. Who knows what new bugs are in 2.6.32, which is brand
>> spanking new.
>>
>
> Yes, it looks like it is a kernel bug alright (see thread on kernel netdev
> at http://marc.info/?t=127094288900001&r=1&w=2 if interested). To be fair,
> I don't think these bugs are confined to Debian - I did some initial testing
> with Scientific Linux and also ran into problems with forcedeth.


Interesting, good find. I try to avoid forcedeth now and have heard the same
from ops people at various large Linux deployments. Not sure why, but it's
traditionally had a lot of bugs/regressions.


> Sure, but I figured I'd go with a distro now that can be largely left
> untouched for the next 2-3 years and Debian lenny felt that bit old for
> that. I know RHEL/CentOS would fit that requirement also, will see. I'm also
> interested in using DRBD in some of our nodes for redundancy, again, running
> with a newer distro should reduce the pain of configuring that.
>
> Finally, I figured burning in our cluster was a good opportunity to give
> back to the community and do some testing on their behalf.
>

Very admirable of you :) It is good to have some people running new kernels
to suss these issues out before the rest of us check out modern technology
;-)


>
> With regard to our TeraSort benchmark time of ~23 minutes - is that in the
> right ballpark for a cluster of 45 data nodes and a nn and 2nn?
>
>
Yep, sounds about the right ballpark.

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Network problems Hadoop 0.20.2 and Terasort on Debian 2.6.32 kernel

Posted by stephen mulcahy <st...@deri.org>.

Todd Lipcon wrote:
> Most likely a kernel bug. In previous versions of Debian there was a buggy
> forcedeth driver, for example, that caused it to drop off the network under
> high load. Who knows what new bugs are in 2.6.32, which is brand spanking new.

Yes, it looks like it is a kernel bug alright (see thread on kernel 
netdev at http://marc.info/?t=127094288900001&r=1&w=2 if interested). To 
be fair, I don't think these bugs are confined to Debian - I did some 
initial testing with Scientific Linux and also ran into problems with 
forcedeth.

> The overwhelming majority of production clusters run on RHEL 5.3 or RHEL 5.4
> in my experience (I'm lumping CentOS 5.3/5.4 in with RHEL here). I know one
> or two production clusters running Debian Lenny, but none running something
> as new as what you're talking about. 

This is useful info - much appreciated. I guess if we don't manage to 
stabilise the current config we'll look at moving to one of those.

> Hadoop doesn't exercise the new
> features in very recent kernels, so there's no sense accepting instability -
> just go with something old that works!

Sure, but I figured I'd go with a distro now that can be largely left 
untouched for the next 2-3 years and Debian lenny felt that bit old for 
that. I know RHEL/CentOS would fit that requirement also, will see. I'm 
also interested in using DRBD in some of our nodes for redundancy, 
again, running with a newer distro should reduce the pain of configuring 
that.

Finally, I figured burning in our cluster was a good opportunity to give 
back to the community and do some testing on their behalf.

With regard to our TeraSort benchmark time of ~23 minutes - is that in 
the right ballpark for a cluster of 45 data nodes and a nn and 2nn?

Thanks,

-stephen

-- 
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com