Posted to common-user@hadoop.apache.org by Owen O'Malley <oo...@yahoo-inc.com> on 2009/05/15 20:38:32 UTC

Beware sun's jvm version 1.6.0_05-b13 on linux

We have observed that the default jvm on RedHat 5 can cause  
significant data corruption in the map/reduce shuffle for those using  
Hadoop 0.20. In particular, the guilty jvm is:

java version "1.6.0_05"
Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode)
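
One way to check a node for this build is to compare the running JVM's
runtime version string against the build named above. The following is
only a minimal sketch: the class and method names are hypothetical, and
it assumes the JVM exposes the "java.runtime.version" system property
(Sun's HotSpot JVMs do, but it is not guaranteed on every vendor's JVM).

```java
class JvmBuildCheck {
    // Build string taken from the version banner quoted above.
    static final String AFFECTED_BUILD = "1.6.0_05-b13";

    // Returns true only when the given runtime version string is exactly the affected build.
    static boolean isAffected(String runtimeVersion) {
        return runtimeVersion != null && runtimeVersion.trim().equals(AFFECTED_BUILD);
    }

    public static void main(String[] args) {
        // Assumption: this property is set by the JVM; it may be null on some JVMs.
        String runtime = System.getProperty("java.runtime.version");
        if (isAffected(runtime)) {
            System.err.println("WARNING: running on JVM build " + runtime
                    + ", which has been seen to corrupt shuffle data");
        } else {
            System.out.println("JVM build " + runtime + " is not the build named in this warning");
        }
    }
}
```

Running the class with the same java binary that launches the
TaskTracker should report on the JVM that actually serves the shuffle.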

Upgrading to JVM build 1.6.0_13-b03 fixed the problem for us. The
observed behavior is that Jetty serves up random bytes from other
transfers. In particular, some of them were valid transfers to the
wrong reduce. We suspect the relevant Java bug is:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933

We have also filed a bug on Hadoop to add sanity checks on the shuffle  
that will work around the problem:

https://issues.apache.org/jira/browse/HADOOP-5783
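
To illustrate the kind of sanity check meant here, the sketch below has
the fetching side verify that a received segment is the one it asked
for. This is not the HADOOP-5783 patch and not Hadoop's API: the header
fields, class name, and method names are all hypothetical, and the CRC32
comparison is an assumed extra check, chosen only to show the idea.

```java
import java.util.zip.CRC32;

class ShuffleSanityCheck {
    // Hypothetical header a server could send ahead of each map-output segment.
    static final class SegmentHeader {
        final String mapId;
        final int reducePartition;
        final long length;
        final long crc32;

        SegmentHeader(String mapId, int reducePartition, long length, long crc32) {
            this.mapId = mapId;
            this.reducePartition = reducePartition;
            this.length = length;
            this.crc32 = crc32;
        }
    }

    // CRC32 of the whole payload, as an unsigned 32-bit value in a long.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Returns null when the segment looks right, otherwise a description of the first mismatch.
    static String firstProblem(SegmentHeader header, String expectedMapId,
                               int expectedPartition, byte[] payload) {
        if (!header.mapId.equals(expectedMapId)) {
            return "segment is from map " + header.mapId + ", expected " + expectedMapId;
        }
        if (header.reducePartition != expectedPartition) {
            return "segment is for reduce " + header.reducePartition
                    + ", expected " + expectedPartition;
        }
        if (header.length != payload.length) {
            return "received " + payload.length + " bytes, header says " + header.length;
        }
        if (header.crc32 != checksum(payload)) {
            return "checksum mismatch";
        }
        return null;
    }
}
```

The map id and partition checks are aimed at the "valid transfer to the
wrong reduce" case, which a checksum over the payload alone would not
catch; the length and checksum checks are aimed at stray bytes from
other transfers.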

-- Owen

Re: Beware sun's jvm version 1.6.0_05-b13 on linux

Posted by Owen O'Malley <om...@apache.org>.
On May 18, 2009, at 3:42 AM, Steve Loughran wrote:

> Presumably it's one of those hard-to-reproduce race conditions that
> only surfaces under load on a big cluster, so it is hard to replicate
> in a unit test, right?

Yes. It reliably happens on a 100TB or larger sort, but almost never  
happens on a small scale.

-- Owen

Re: Beware sun's jvm version 1.6.0_05-b13 on linux

Posted by Steve Loughran <st...@apache.org>.
Allen Wittenauer wrote:
> 
> 
> On 5/15/09 11:38 AM, "Owen O'Malley" <oo...@yahoo-inc.com> wrote:
> 
>> We have observed that the default jvm on RedHat 5
> 
>     I'm sure some people are scratching their heads at this.
> 
>     The default JVM on at least RHEL5u0/1 is a GCJ-based 1.4, clearly
> incapable of running Hadoop.  We [and, really, this is my doing... ^.^ ]
> replace it with the JVM from the JPackage folks.  So while this isn't the
> default JVM that comes from RHEL, the warning should still be heeded. 
> 

Presumably it's one of those hard-to-reproduce race conditions that only
surfaces under load on a big cluster, so it is hard to replicate in a unit
test, right?


Re: Beware sun's jvm version 1.6.0_05-b13 on linux

Posted by Allen Wittenauer <aw...@yahoo-inc.com>.


On 5/15/09 11:38 AM, "Owen O'Malley" <oo...@yahoo-inc.com> wrote:

> We have observed that the default jvm on RedHat 5

    I'm sure some people are scratching their heads at this.

    The default JVM on at least RHEL5u0/1 is a GCJ-based 1.4, clearly
incapable of running Hadoop.  We [and, really, this is my doing... ^.^ ]
replace it with the JVM from the JPackage folks.  So while this isn't the
default JVM that comes from RHEL, the warning should still be heeded.