Posted to user@hadoop.apache.org by Viral Bajaria <vi...@gmail.com> on 2019/09/30 05:04:47 UTC

datanode threads blocked on synchronized method snapshot on MetricsRegistry

Hi,

I am trying to figure out why our datanode performance degrades after a
few days and why all the issues fix themselves on restart.

We do a lot of remote reads of our HDFS blocks, and I am noticing that a
lot of threads are BLOCKED (see the stack trace below).

All threads are entering the synchronized method located at
https://github.com/apache/hadoop/blob/release-2.9.2-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MetricsRegistry.java#L447
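
When many threads queue on one synchronized method, the useful question is
usually which thread currently owns the monitor. The sketch below shows one
way to ask the JVM that through the standard java.lang.management API. It is
only an illustration: the class name BlockedThreadReport and the small demo
are made up for this example, and it reports on the JVM it runs in, so for a
datanode the same ThreadMXBean would have to be queried over JMX or replaced
by a jstack dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
public class BlockedThreadReport {

    /** One line per BLOCKED thread: its name, the monitor it wants, and the owner. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // Entries are null for threads that exited between the two calls.
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    /** Creates one artificially blocked thread and returns the report taken meanwhile. */
    public static List<String> runDemo() {
        Object monitor = new Object();
        Thread waiter = new Thread(() -> { synchronized (monitor) { } }, "demo-waiter");
        List<String> report;
        synchronized (monitor) {
            waiter.start();
            // Spin until the waiter is parked on the monitor held by this thread.
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.yield();
            }
            report = blockedThreads();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return report;
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

Running main prints a line for demo-waiter naming the monitor it wants and
the thread that holds it. On a real process the owner's own stack is the part
worth reading, since it shows what is being done while the lock is held.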

I am wondering why so many threads are trying to build a new snapshot?

I am not that familiar with this code path or its significance. Maybe we
can turn this off via some configuration, see if that makes the problem go
away, and then create a JIRA to solve this issue?

Thanks,
Viral


===BLOCKED THREADS===
"RMI TCP Connection(584)-10.200.76.8" #12963867 daemon prio=5 os_prio=0 tid=0x00007f2b9a58a800 nid=0x1fd3f waiting for monitor entry [0x00007f1ec57d4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.metrics2.lib.MetricsRegistry.snapshot(MetricsRegistry.java)
        - waiting to lock <0x00007f20fc199768> (a org.apache.hadoop.metrics2.lib.MetricsRegistry)
        at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder$1.getMetrics(MetricsSourceBuilder.java:80)
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:183)
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getAttribute(MetricsSourceAdapter.java:107)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
        at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639)
        at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
        at sun.rmi.transport.Transport$1.run(Transport.java:200)
        at sun.rmi.transport.Transport$1.run(Transport.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$13/332664703.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
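
The trace above shows only one of the blocked threads. To see how many
threads wait on the same monitor, and which thread holds it, a full dump can
be summarized by the "- waiting to lock <...>" and "- locked <...>" lines
that jstack prints. The following is a rough sketch of that idea, not a
finished tool: the class name MonitorWaiters is invented for the example, and
it assumes the dump was saved to a file with each frame on a single line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop or the JDK.
public class MonitorWaiters {
    // The quoted thread name that opens each thread's section of the dump.
    private static final Pattern THREAD = Pattern.compile("^\"([^\"]+)\"");
    private static final Pattern WAITING =
            Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED =
            Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");

    /** Groups thread names by the monitor address found on lines matching p. */
    private static Map<String, List<String>> collect(List<String> lines, Pattern p) {
        Map<String, List<String>> byMonitor = new LinkedHashMap<>();
        String thread = "(unknown)";
        for (String line : lines) {
            Matcher header = THREAD.matcher(line);
            if (header.find()) {
                thread = header.group(1);
                continue;
            }
            Matcher m = p.matcher(line);
            if (m.find()) {
                byMonitor.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(thread);
            }
        }
        return byMonitor;
    }

    /** Monitor address -> threads blocked trying to enter it. */
    public static Map<String, List<String>> waiters(List<String> lines) {
        return collect(lines, WAITING);
    }

    /** Monitor address -> threads whose stack shows the monitor as locked. */
    public static Map<String, List<String>> holders(List<String> lines) {
        return collect(lines, LOCKED);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MonitorWaiters <thread-dump-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        Map<String, List<String>> holders = holders(lines);
        waiters(lines).forEach((monitor, threads) ->
                System.out.println(monitor + ": " + threads.size() + " waiting, held by "
                        + holders.getOrDefault(monitor, Collections.emptyList())));
    }
}
```

For the dump in this message, the line for monitor 0x00007f20fc199768 would
be the one of interest. The stack of the thread listed as its holder should
show what is running inside snapshot() while the others wait.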