Posted to user@hadoop.apache.org by yaoxiaohua <ya...@outlook.com> on 2015/12/17 03:36:20 UTC
NodeManager crashes with exceptions and OOM (urgent)
Hi guys,
Environment: Hadoop 2.3-cdh5.0.2. I have a cluster of about sixty
nodes. The HA NameNodes are nn1 and nn2; dn1-dn58 are the data
nodes (each runs a DataNode and a NodeManager).
The NodeManager on one data node keeps crashing after it has run some
containers, sometimes after a few hours and sometimes after only a few minutes.
Its configuration is the same as on the other data nodes. The kernel parameters
are not, because I have been tuning them while chasing this issue.
I have spent a lot of time investigating this issue but have not found a solution.
This is driving me crazy.
I tuned the following Linux kernel parameters:
vm.overcommit_memory=1
vm.swappiness = 20
# for the page allocation failures reported in dmesg
vm.zone_reclaim_mode = 1
vm.min_free_kbytes = 65536
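Because the kernel parameters on this node now differ from the rest of the cluster, a comparison along the following lines may help show exactly which settings diverge. This is only a minimal sketch: `parse_sysctl` and `diff_settings` are hypothetical helper names (not part of Hadoop or any Linux tool), and the "healthy node" values are placeholders for illustration, not values taken from my cluster.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines (sysctl.conf style) into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments ('#' or ';' start a comment in sysctl.conf).
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(this_node, other_node):
    """Return {key: (this_value, other_value)} for keys whose values differ.

    A key that is missing on one side is reported with None for that side.
    """
    keys = set(this_node) | set(other_node)
    return {
        k: (this_node.get(k), other_node.get(k))
        for k in sorted(keys)
        if this_node.get(k) != other_node.get(k)
    }


# Values tuned on the failing node (copied from this post).
TUNED = """
vm.overcommit_memory=1
vm.swappiness = 20
#for dmesg page allocate failure
vm.zone_reclaim_mode = 1
vm.min_free_kbytes = 65536
"""

# Placeholder for a healthy node: assumed values, for illustration only.
HEALTHY = """
vm.overcommit_memory = 0
vm.swappiness = 20
"""

if __name__ == "__main__":
    changes = diff_settings(parse_sysctl(TUNED), parse_sysctl(HEALTHY))
    for key, (mine, theirs) in changes.items():
        print(f"{key}: this node={mine}, other node={theirs}")
```

In practice the two inputs would be the saved output of `sysctl -a` from this node and from one of the nodes that does not crash, since that command prints the same `key = value` layout.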
I also changed the NodeManager's GC policy from gencon to optthruput.
Process log:
2015-12-16 17:24:39,663 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error [id: 0x34ed4c97, /172.19.206.148:34641 => /172.19.206.142:8080] EXCEPTION: java.lang.ArrayIndexOutOfBoundsException
2015-12-16 17:24:39,663 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[Container Monitor,5,main] threw an Error. Shutting down now...
java.lang.OutOfMemoryError: Java heapspace
    at java.util.HashMap.inflateTable(HashMap.java:328)
    at java.util.HashMap.<init>(HashMap.java:308)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.updateProcessTree(ProcfsBasedProcessTree.java:154)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:390)
2015-12-16 17:24:39,666 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
Before the crash, errors like these always appear:
2015-12-16 17:24:35,336 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 19947 for container-id container_1448915696877_23390_01_000037: 102.6 MB of 2 GB physical memory used; 2.1 GB of 4.2 GB virtual memory used
2015-12-16 17:24:38,379 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Uncaught exception in ContainerMemoryManager while managing memory of container_1448915696877_23390_01_000543
java.lang.IllegalArgumentException: disparate values
    at sun.misc.FDBigInt.quoRemIteration(FloatingDecimal.java:2931)
    at sun.misc.FormattedFloatingDecimal.dtoa(FormattedFloatingDecimal.java:922)
    at sun.misc.FormattedFloatingDecimal.<init>(FormattedFloatingDecimal.java:542)
    at java.util.Formatter$FormatSpecifier.print(Formatter.java:3264)
    at java.util.Formatter$FormatSpecifier.print(Formatter.java:3202)
    at java.util.Formatter$FormatSpecifier.printFloat(Formatter.java:2769)
    at java.util.Formatter$FormatSpecifier.print(Formatter.java:2720)
    at java.util.Formatter.format(Formatter.java:2500)
    at java.util.Formatter.format(Formatter.java:2435)
    at java.lang.String.format(String.java:2148)
    at org.apache.hadoop.util.StringUtils.format(StringUtils.java:123)
    at org.apache.hadoop.util.StringUtils$TraditionalBinaryPrefix.long2String(StringUtils.java:758)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.formatUsageString(ContainersMonitorImpl.java:487)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:399)
2015-12-16 17:24:38,516 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Uncaught exception in ContainerMemoryManager while managing memory of container_1448915696877_23390_01_000374
java.lang.ArrayIndexOutOfBoundsException
    at sun.misc.FormattedFloatingDecimal.dtoa(FormattedFloatingDecimal.java:848)
    at sun.misc.FormattedFloatingDecimal.<init>(FormattedFloatingDecimal.java:542)
    at java.util.Formatter$FormatSpecifier.print(Formatter.java:3264)
    at java.util.Formatter$FormatSpecifier.print(Formatter.java:3202)
    at java.util.Formatter$FormatSpecifier.printFloat(Formatter.java:2769)
    at java.util.Formatter$FormatSpecifier.print(Formatter.java:2720)
    at java.util.Formatter.format(Formatter.java:2500)
Best Regards,
Evan Yao