Posted to user@hadoop.apache.org by Ian Brooks <i....@sensewhere.com> on 2014/05/14 14:47:32 UTC
Constant high cpu usage on datanode process
Hi,
I have noticed recently that the datanodes on a test HDFS cluster I'm running sit permanently at 100% CPU usage after a few hours of minimal use, and over time climb to 200% CPU even when there is almost nothing happening on HDFS.
Node 1:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15607 hadoop 20 0 1746m 480m 5380 S 102.2 6.1 190:24.45 java
ps -ef |grep 15607
hadoop 15607 1 14 May13 ? 03:12:33 /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java -Dproc_datanode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/usr/local/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/local/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/usr/local/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/usr/local/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop-hadoop-datanode-sw-hmaster-001.log -Dhadoop.home.dir=/usr/local/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
Node 6:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10809 hadoop 20 0 1750m 497m 17m S 200.0 6.3 277:07.34 java
ps -ef |grep 10809
hadoop 10809 1 99 09:56 ? 04:38:04 /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java -Dproc_datanode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/usr/local/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/local/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/usr/local/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/usr/local/hadoop-2.3.0/logs -Dhadoop.log.file=hadoop-hadoop-datanode-sw-hadoop-006.log -Dhadoop.home.dir=/usr/local/hadoop-2.3.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/usr/local/hadoop-2.3.0/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=ERROR,RFAS -Dhadoop.security.logger=ERROR,RFAS -Dcom.sun.management.jmxremote.port=1235 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.datanode.DataNode
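In case it helps anyone reproduce this, here is a sketch of how to narrow the JVM down to the individual spinning thread (the PID is the Node 6 example above; jstack is assumed to be on the PATH from the same JDK):

```shell
# Per-thread CPU view for the DataNode process; note the PID of the hottest thread:
#   top -H -b -n1 -p 10809
# jstack labels threads by their native id in hex (nid=0x...), so convert it first:
tid_hex() { printf '0x%x\n' "$1"; }
tid_hex 10809    # example: native thread id 10809 -> 0x2a39
# Then match it against a thread dump to get that thread's stack:
#   jstack 10809 | grep -A 20 "nid=$(tid_hex 10809)"
```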
I thought initially it might be GC, but monitoring the datanode process via JMX shows almost no GC activity (< 1%).
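For anyone wanting to cross-check the JMX reading from the shell, jstat (shipped with the JDK) samples the same GC counters; a sketch, guarded so it only runs against a live PID (15607 is the Node 1 example):

```shell
pid=15607   # example PID from the Node 1 listing above
# Five one-second samples; if the YGC/FGC counts and the GCT column barely
# move between samples, that confirms GC is not where the CPU is going.
if kill -0 "$pid" 2>/dev/null; then
  jstat -gcutil "$pid" 1000 5
else
  echo "process $pid is not running here"
fi
```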
If I restart the datanode process, CPU usage drops back down to 1-5% for a few hours, then jumps back up to 100% again.
The version of Hadoop I'm using is hadoop-2.3.0, and the main use is HBase, which is also not handling much traffic on the test platform. There are 8 nodes in the cluster, each with 8GB RAM; 4GB is assigned to HBase, and top shows 1.9GB RAM free.
The Hadoop heap size is set to 1GB, with usage varying between 100 and 200MB. VisualVM shows most of the CPU time to be in:
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService() - 33.5%
org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run() - 23.3%
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select() - 16.9%
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.reportReceivedDeletedBlocks() - 12.9%
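Since DataBlockScanner.run() accounts for almost a quarter of the CPU on its own, one thing I'm considering is stretching the block scanner's period to see whether that thread is the busy loop. This is only an experiment to isolate the cause, not a recommended setting; the property name is dfs.datanode.scan.period.hours in hdfs-site.xml, where (as I understand it) 0 falls back to the default period of roughly three weeks:

```xml
<!-- hdfs-site.xml: explicit three-week scan period (believed to be the
     default) while testing whether the scanner thread is what's spinning -->
<property>
  <name>dfs.datanode.scan.period.hours</name>
  <value>504</value>
</property>
```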
Any ideas why this would be happening? Has anyone else seen this behaviour?
-Ian Brooks