You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by tobe <to...@gmail.com> on 2014/09/02 10:30:11 UTC

Disastrous bug when running jinfo and jmap

It's not a bug of HBase but it's related to HBase. When I run jinfo or jmap
to HMaster or HRegionServer, it will "suspend" the Java process. It's 100%
reproduced for the long running processes.

Here're the detailed steps:

1. Pick a Java process which is running over 25 days(It's wired because
this doesn't work for new processes).
2. Run ps to check the state of the process, should be "Sl" which is
expected.
3. Run jinfo or jmap to this process(BTY, jstack doesn't have this issue).
4. Run ps to check the state of the process. This time it changes to "Tl"
which means STOPPED and the process doesn't response any requests.

Here's the output of our process:

[work@hadoop ~]$ ps aux |grep "qktst" |grep "RegionServer"
work     36663  0.1  1.7 24157828 1150820 ?    Sl   Aug06  72:54
/opt/soft/jdk/bin/java -cp
/home/work/app/hbase/qktst-qk/regionserver/:/home/work/app/hbase/qktst-qk/regionserver/package//:/home/work/app/hbase/qktst-qk/regionserver/package//lib/*:/home/work/app/hbase/qktst-qk/regionserver/package//*
-Djava.library.path=:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/Linux-amd64-64
-Xbootclasspath/p:/home/work/app/hbase/qktst-qk/regionserver/package/lib/hadoop-security-2.0.0-mdh1.1.0.jar
-Xmx10240m -Xms10240m -Xmn1024m -XX:MaxDirectMemorySize=1024m
-XX:MaxPermSize=512m
-Xloggc:/home/work/app/hbase/qktst-qk/regionserver/stdout/regionserver_gc_20140806-211157.log
-Xss256k -XX:PermSize=64m -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/home/work/app/hbase/qktst-qk/regionserver/log
-XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6
-XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled
-XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
-XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m
-XX:CMSWaitDuration=2000 -XX:+CMSScavengeBeforeRemark
-XX:+PrintPromotionFailure -XX:ConcGCThreads=16 -XX:ParallelGCThreads=16
-XX:PretenureSizeThreshold=2097088 -XX:+CMSConcurrentMTEnabled
-XX:+ExplicitGCInvokesConcurrent -XX:+SafepointTimeout
-XX:MonitorBound=16384 -XX:-UseBiasedLocking -XX:MaxTenuringThreshold=3
-Dproc_regionserver
-Djava.security.auth.login.config=/home/work/app/hbase/qktst-qk/regionserver/jaas.conf
-Djava.net.preferIPv4Stack=true
-Dhbase.log.dir=/home/work/app/hbase/qktst-qk/regionserver/log
-Dhbase.pid=36663 -Dhbase.cluster=qktst-qk -Dhbase.log.level=debug
-Dhbase.policy.file=hbase-policy.xml
-Dhbase.home.dir=/home/work/app/hbase/qktst-qk/regionserver/package
-Djava.security.krb5.conf=/home/work/app/hbase/qktst-qk/regionserver/krb5.conf
-Dhbase.id.str=work org.apache.hadoop.hbase.regionserver.HRegionServer start
[work@hadoop ~]$ jinfo 36663 > tobe.jinfo
Attaching to process ID 36663, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 20.12-b01
[work@hadoop ~]$ ps aux |grep "qktst" |grep "RegionServer"
work     36663  0.1  1.7 24157828 1151008 ?    Tl   Aug06  72:54
/opt/soft/jdk/bin/java -cp
/home/work/app/hbase/qktst-qk/regionserver/:/home/work/app/hbase/qktst-qk/regionserver/package//:/home/work/app/hbase/qktst-qk/regionserver/package//lib/*:/home/work/app/hbase/qktst-qk/regionserver/package//*
-Djava.library.path=:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/Linux-amd64-64
-Xbootclasspath/p:/home/work/app/hbase/qktst-qk/regionserver/package/lib/hadoop-security-2.0.0-mdh1.1.0.jar
-Xmx10240m -Xms10240m -Xmn1024m -XX:MaxDirectMemorySize=1024m
-XX:MaxPermSize=512m
-Xloggc:/home/work/app/hbase/qktst-qk/regionserver/stdout/regionserver_gc_20140806-211157.log
-Xss256k -XX:PermSize=64m -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/home/work/app/hbase/qktst-qk/regionserver/log
-XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6
-XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled
-XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
-XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m
-XX:CMSWaitDuration=2000 -XX:+CMSScavengeBeforeRemark
-XX:+PrintPromotionFailure -XX:ConcGCThreads=16 -XX:ParallelGCThreads=16
-XX:PretenureSizeThreshold=2097088 -XX:+CMSConcurrentMTEnabled
-XX:+ExplicitGCInvokesConcurrent -XX:+SafepointTimeout
-XX:MonitorBound=16384 -XX:-UseBiasedLocking -XX:MaxTenuringThreshold=3
-Dproc_regionserver
-Djava.security.auth.login.config=/home/work/app/hbase/qktst-qk/regionserver/jaas.conf
-Djava.net.preferIPv4Stack=true
-Dhbase.log.dir=/home/work/app/hbase/qktst-qk/regionserver/log
-Dhbase.pid=36663 -Dhbase.cluster=qktst-qk -Dhbase.log.level=debug
-Dhbase.policy.file=hbase-policy.xml
-Dhbase.home.dir=/home/work/app/hbase/qktst-qk/regionserver/package
-Djava.security.krb5.conf=/home/work/app/hbase/qktst-qk/regionserver/krb5.conf
-Dhbase.id.str=work org.apache.hadoop.hbase.regionserver.HRegionServer start


It may be the bug of JVM but I don't get any response after I reported this
to oracle. So I hope some JVM experts here could help.

$ java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)