Posted to dev@ambari.apache.org by "Jonathan Hurley (JIRA)" <ji...@apache.org> on 2015/02/24 06:22:12 UTC
[jira] [Created] (AMBARI-9761) Performance: Cluster Installation Deadlocks When Setting Component States
Jonathan Hurley created AMBARI-9761:
---------------------------------------
Summary: Performance: Cluster Installation Deadlocks When Setting Component States
Key: AMBARI-9761
URL: https://issues.apache.org/jira/browse/AMBARI-9761
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.0.0
Reporter: Jonathan Hurley
Assignee: Jonathan Hurley
Priority: Critical
Fix For: 2.0.0
Attachments: jstack2
While provisioning a cluster with at least 200 hosts, Ambari Server becomes unresponsive. The thread dump shows a deadlock involving three kinds of lock holders:
- Cluster readers
- Cluster writers
- ServiceComponentHost writers
{noformat}
qtp626652285-97 ClusterImpl.convertToResponse() (cluster readLock)
qtp1282624353-47 ServiceComponentHostImpl.setRestartRequired() (sch writeLock)
qtp626652285-97 ServiceComponentHostImpl.getMaintenanceState() (sch readLock BLOCKED by qtp1282624353-47)
qtp1282624353-60 ClusterImpl.recalculateClusterVersionState() (cluster writeLock BLOCKED by qtp626652285-97)
qtp1282624353-47 ServiceComponentHostImpl.isPersisted() (cluster readLock BLOCKED by qtp1282624353-60)
"qtp626652285-97" prio=10 tid=0x00007f2e2803a800 nid=0x5a3f waiting on condition [0x00007f2df17cd000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000079ebb1130> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
at org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl.getMaintenanceState(ServiceComponentHostImpl.java:1437)
at org.apache.ambari.server.controller.MaintenanceStateHelper.getEffectiveState(MaintenanceStateHelper.java:208)
at org.apache.ambari.server.controller.MaintenanceStateHelper.getEffectiveState(MaintenanceStateHelper.java:177)
at org.apache.ambari.server.controller.MaintenanceStateHelper.getEffectiveState(MaintenanceStateHelper.java:191)
at org.apache.ambari.server.state.cluster.ClusterImpl.getClusterHealthReport(ClusterImpl.java:2422)
at org.apache.ambari.server.state.cluster.ClusterImpl.convertToResponse(ClusterImpl.java:1606)
"qtp1282624353-47" prio=10 tid=0x00007f2e08015800 nid=0x59c2 waiting on condition [0x00007f2df37ef000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000079bf800d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
at org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl.isPersisted(ServiceComponentHostImpl.java:1153)
at org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl.saveIfPersisted(ServiceComponentHostImpl.java:1266)
at org.apache.ambari.server.state.svccomphost.ServiceComponentHostImpl.setRestartRequired(ServiceComponentHostImpl.java:1480)
at org.apache.ambari.server.agent.HeartBeatHandler.processCommandReports(HeartBeatHandler.java:546)
at org.apache.ambari.server.agent.HeartBeatHandler.handleHeartBeat(HeartBeatHandler.java:253)
at org.apache.ambari.server.agent.rest.AgentResource.heartbeat(AgentResource.java:123)
"qtp1282624353-60" prio=10 tid=0x00007f2dfc014800 nid=0x59cf waiting on condition [0x00007f2df2ae1000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000079bf800d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
at org.apache.ambari.server.state.cluster.ClusterImpl.recalculateClusterVersionState(ClusterImpl.java:1180)
at org.apache.ambari.server.events.listeners.upgrade.StackVersionListener.onAmbariEvent(StackVersionListener.java:81)
at sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.google.common.eventbus.EventHandler.handleEvent(EventHandler.java:74)
at com.google.common.eventbus.EventBus.dispatch(EventBus.java:314)
at com.google.common.eventbus.EventBus.dispatchQueuedEvents(EventBus.java:296)
at com.google.common.eventbus.EventBus.post(EventBus.java:267)
"qtp1282624353-109" prio=10 tid=0x00007f2df8001000 nid=0x5a52 waiting on condition [0x00007f2df2be2000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000079b474368> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
at org.apache.ambari.server.events.listeners.upgrade.StackVersionListener.onAmbariEvent(StackVersionListener.java:76)
at sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.google.common.eventbus.EventHandler.handleEvent(EventHandler.java:74)
at com.google.common.eventbus.EventBus.dispatch(EventBus.java:314)
at com.google.common.eventbus.EventBus.dispatchQueuedEvents(EventBus.java:296)
at com.google.common.eventbus.EventBus.post(EventBus.java:267)
"qtp626652285-106" prio=10 tid=0x00007f2e28026000 nid=0x5a4a waiting on condition [0x00007f2df3efa000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000079bf800d8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
at org.apache.ambari.server.state.cluster.ClusterImpl.convertToResponse(ClusterImpl.java:1602)
at org.apache.ambari.server.controller.AmbariManagementControllerImpl.getClusters(AmbariManagementControllerImpl.java:861)
at org.apache.ambari.server.controller.AmbariManagementControllerImpl.getClusters(AmbariManagementControllerImpl.java:2563)
at org.apache.ambari.server.controller.internal.ClusterResourceProvider$1.invoke(ClusterResourceProvider.java:182)
at org.apache.ambari.server.controller.internal.ClusterResourceProvider$1.invoke(ClusterResourceProvider.java:179)
at org.apache.ambari.server.controller.internal.AbstractResourceProvider.getResources(AbstractResourceProvider.java:302)
at org.apache.ambari.server.controller.internal.ClusterResourceProvider.getResources(ClusterResourceProvider.java:179)
at org.apache.ambari.server.controller.internal.ClusterControllerImpl$ExtendedResourceProviderWrapper.queryForResources(ClusterControllerImpl.java:945)
at org.apache.ambari.server.controller.internal.ClusterControllerImpl.getResources(ClusterControllerImpl.java:132)
{noformat}
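
The lock cycle in the dump can be reproduced in isolation. The following is a minimal sketch, not Ambari code: the class name LockCycleSketch, the method simulate(), and the thread names are hypothetical, and the two locks only stand in for the cluster and ServiceComponentHost locks. It assumes the non-fair behavior of OpenJDK's ReentrantReadWriteLock (the dump shows NonfairSync), where a new reader waits if a writer is first in the queue. Timed tryLock calls are used so the sketch terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the three threads in the dump; not Ambari code.
final class LockCycleSketch {

    /** Returns {heartbeatGotClusterRead, apiGotSchRead}; {false, false} means the cycle formed. */
    static boolean[] simulate() {
        final ReentrantReadWriteLock cluster = new ReentrantReadWriteLock(); // non-fair by default
        final ReentrantReadWriteLock sch = new ReentrantReadWriteLock();
        final CountDownLatch schWriteHeld = new CountDownLatch(1);
        final CountDownLatch writerQueued = new CountDownLatch(1);
        final CountDownLatch apiDone = new CountDownLatch(1);
        final boolean[] acquired = new boolean[2];

        // Plays qtp1282624353-47: holds the sch write lock, then asks for the cluster read lock.
        Thread heartbeat = new Thread(() -> {
            sch.writeLock().lock();
            try {
                schWriteHeld.countDown();
                writerQueued.await();
                // The timed tryLock honors the wait queue, so it parks behind the queued writer.
                acquired[0] = cluster.readLock().tryLock(200, TimeUnit.MILLISECONDS);
                if (acquired[0]) cluster.readLock().unlock();
                apiDone.await(); // keep the sch write lock until the API thread has tried
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sch.writeLock().unlock();
            }
        });

        // Plays qtp1282624353-60: queues for the cluster write lock behind the active reader.
        Thread writer = new Thread(() -> {
            cluster.writeLock().lock();
            cluster.writeLock().unlock();
        });

        // The calling thread plays qtp626652285-97: it holds the cluster read lock throughout.
        cluster.readLock().lock();
        try {
            heartbeat.start();
            schWriteHeld.await();
            writer.start();
            while (!cluster.hasQueuedThread(writer)) {
                Thread.sleep(1); // wait until the writer is parked in the cluster lock's queue
            }
            writerQueued.countDown();
            // Blocked by the sch write lock held by the heartbeat thread.
            acquired[1] = sch.readLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired[1]) sch.readLock().unlock();
            apiDone.countDown();
            heartbeat.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            writerQueued.countDown(); // no-ops on the normal path; release helpers on failure
            apiDone.countDown();
            cluster.readLock().unlock(); // lets the queued writer finish
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired;
    }

    public static void main(String[] args) {
        boolean[] r = simulate();
        System.out.println("heartbeat got cluster read lock: " + r[0]);
        System.out.println("API thread got sch read lock: " + r[1]);
    }
}
```

Under these assumptions both attempts should time out. With plain lock() calls in place of the timed tryLock calls, as in the stack traces above, neither thread would ever proceed, and the queued writer would wait indefinitely as well.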