Posted to user@hbase.apache.org by Jinsong Hu <ji...@hotmail.com> on 2010/08/13 20:27:24 UTC

Fw: namenode crash


Hi there,
  Does anybody know of a combination of CentOS version and JDK version that runs stably? I am using this CentOS kernel:

Linux  2.6.18-194.8.1.el5.centos.plus #1 SMP Wed Jul 7 11:45:38 EDT 2010
 x86_64 x86_64 x86_64 GNU/Linux

jdk version
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

and I run the NameNode with the following JVM config:
-Xmx1000m  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedOops -XX:+DoEscapeAnalysis -XX:+AggressiveOpts  -Xmx2G

but it crashed silently after 16 hours.

I also tried JDK
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

with the same JVM config, and the NameNode crashed after about 1 week. I searched the internet, and people say 1.6.0_18 is not good.
Can anybody recommend a combination of JDK and OS version that runs stably?


This crash doesn't happen with a small cluster of 4 datanodes, but it does happen with a cluster of 17 datanodes.

Jimmy.



RE: namenode crash

Posted by Jeremy Carroll <je...@networkedinsights.com>.
I am currently running CentOS as well and have no issues. I believe your JVM settings are the culprit. AggressiveOpts should probably not be on at all, and CMSIncrementalMode should be turned off in production as well.

- Cloudera CDH v3 B2
- CentOS 5.5 (Kernel: 2.6.18-194.8.1.el5). 
- Sun JVM 1.6.0_16
- JVM Opts: -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:MaxNewSize=64m -XX:NewSize=64m
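
As a rough illustration of the advice above, here is a minimal Python sketch that scans a JVM option string for the two flags called out in this reply and lists every -Xmx setting it finds. The helper name `review_jvm_opts` and the `RISKY_FLAGS` set are hypothetical names made up for this example; they are not part of Hadoop or the JVM. The option string is the one from the original post, which sets -Xmx twice, so it is worth checking which value your JVM actually applies.

```python
# Hypothetical helper for illustration only (not a Hadoop or JVM tool).
# RISKY_FLAGS holds the two options this reply advises against.
RISKY_FLAGS = {"-XX:+AggressiveOpts", "-XX:+CMSIncrementalMode"}


def review_jvm_opts(opts: str) -> dict:
    """Split an option string and report flagged options and all -Xmx settings."""
    tokens = opts.split()
    risky = [t for t in tokens if t in RISKY_FLAGS]
    heap_settings = [t for t in tokens if t.startswith("-Xmx")]
    return {"risky": risky, "heap_settings": heap_settings}


# The option string from the original post.
posted = (
    "-Xmx1000m  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode "
    "-XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError "
    "-XX:+UseCompressedOops -XX:+DoEscapeAnalysis -XX:+AggressiveOpts  -Xmx2G"
)

report = review_jvm_opts(posted)
print("Flags advised against:", report["risky"])
print("Heap settings found:", report["heap_settings"])
```

More than one entry under the heap settings means the command line carries conflicting maximum heap sizes and should be cleaned up.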




Re: Fw: namenode crash

Posted by Ryan Rawson <ry...@gmail.com>.
I also second the advice about those JVM options: they can and do cause stability issues.


RE: Fw: namenode crash

Posted by Jeremy Carroll <je...@networkedinsights.com>.
I would second upping the NameNode RAM. Most NameNodes have the most RAM of any server in the cluster. Make sure you are not storing lots of small files and ending up with a very high block count. According to the article linked below, about 10 million files need roughly 3 GB of JVM heap on the NameNode.

http://www.cloudera.com/blog/2009/02/the-small-files-problem/
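
To make that rule of thumb concrete, here is a small Python sketch that scales the ratio quoted above (about 10 million files to 3 GB of heap) linearly to other file counts. The function name `estimate_namenode_heap_gb` is hypothetical, and the linear scaling is an assumption; real usage depends on blocks per file, directory count, and Hadoop version, so treat the result as a starting point rather than a measurement.

```python
# Ratio quoted in this post: about 10 million files ~ 3 GB of NameNode heap.
FILES_REFERENCE = 10_000_000
HEAP_GB_REFERENCE = 3.0


def estimate_namenode_heap_gb(file_count: int) -> float:
    """Linear extrapolation from the quoted ratio (an assumption, not a measurement)."""
    return file_count / FILES_REFERENCE * HEAP_GB_REFERENCE


for files in (1_000_000, 10_000_000, 50_000_000):
    heap_gb = estimate_namenode_heap_gb(files)
    print(f"{files:>11,} files -> roughly {heap_gb:.1f} GB heap")
```

By this estimate a 2 GB heap would be exhausted somewhere below 7 million files, which is one reason to look at the file and block counts before blaming the JDK.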


Re: Fw: namenode crash

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Aug 13, 2010 at 3:03 PM, Ryan Rawson <ry...@gmail.com> wrote:
> We don't use centos here at Stumbleupon... your version looks quite
> old!  Our uname looks like:
>
> Linux host 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC
> 2009 x86_64 GNU/Linux
>
> I'd consider using something newer than 2.6.18!

RedHat/CentOS backport kernel patches and attempt to keep the minor
number relatively stable.

Something like 2.6.18-194 is probably closer to 2.6.28 than to 2.6.18.

Do you have any more free memory? Maybe, for fun, raise your heap to -Xmx4G.

Edward

Re: Fw: namenode crash

Posted by Ryan Rawson <ry...@gmail.com>.
We don't use CentOS here at StumbleUpon... your version looks quite
old!  Our uname looks like:

Linux host 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC
2009 x86_64 GNU/Linux

I'd consider using something newer than 2.6.18!


Re: Fw: namenode crash

Posted by Jean-Daniel Cryans <jd...@apache.org>.
u18 should never be used.

You say it's crashing on both u17 and u20? How is it crashing? (it's
kind of a vague word)

Here we use both u14 and u17 on 20-node clusters without any issues.

J-D
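
One way to narrow down "how" it is crashing: when the HotSpot JVM itself aborts on a fatal error, it normally writes a file named hs_err_pid<pid>.log, by default into the working directory of the process. The Python sketch below simply looks for such files in a given directory; the function name `find_hotspot_crash_logs` is hypothetical, and the directory to search is an assumption you would replace with wherever your NameNode was started from.

```python
from pathlib import Path


def find_hotspot_crash_logs(directory: str) -> list:
    """Return hs_err_pid*.log files found directly in `directory`, newest first."""
    logs = Path(directory).glob("hs_err_pid*.log")
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


# Example: search the current directory (replace with the NameNode's working directory).
for log in find_hotspot_crash_logs("."):
    print(log)
```

If no such file turns up, the exit was probably not a JVM-level fatal error, and the NameNode log and the system logs are the next places to look.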


RE: namenode crash

Posted by Jeremy Carroll <je...@networkedinsights.com>.
That JDK is unstable. I would highly recommend 1.6.0_16. I would also use the Sun JVM rather than the OpenJDK that comes with CentOS; there are differences. You can find JDK 6u16 here:

http://java.sun.com/products/archive/j2se/6u16/index.html
________________________________________
From: Richard Lackey [richlackey@roamingcloud.com]
Sent: Friday, August 13, 2010 3:06 PM
To: user@hbase.apache.org
Subject: Re: namenode crash

I am using CentOS 5.5 with suggested updates. JDK 1.6.0_21, JRE 1.6.0_21 all x64



Re: namenode crash

Posted by Richard Lackey <ri...@roamingcloud.com>.
I am using CentOS 5.5 with the suggested updates, plus JDK 1.6.0_21 and JRE 1.6.0_21, all x64.

Rich

Rich Lackey
RoamingCloud, LLC
President, CTO

408-373-7772
richlackey@roamingcloud.com


