Posted to issues@ambari.apache.org by "wangyongshuai (JIRA)" <ji...@apache.org> on 2018/12/26 12:14:00 UTC

[jira] [Created] (AMBARI-25068) ambari-server hostcomponentState sync slow with 10000+ hosts

wangyongshuai created AMBARI-25068:
--------------------------------------

             Summary: ambari-server hostcomponentState sync slow with 10000+ hosts
                 Key: AMBARI-25068
                 URL: https://issues.apache.org/jira/browse/AMBARI-25068
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.6.2
         Environment: ambari 2.6.2

jdk 1.8
            Reporter: wangyongshuai


It takes between 20 and 50 minutes to sync hostcomponentState.

So I changed the HeartbeatMonitor.java code like this:
{code:java}
private void doWork(ThreadPoolService threadPoolService) throws InvalidStateTransitionException, AmbariException {
    List<Host> allHosts = clusters.getHosts();

    final long now = System.currentTimeMillis();
    List<Future> futureList = new ArrayList<>();

    for (final Host hostObj : allHosts) {
        // Submit the work for each host to the thread pool
        Future<?> future = threadPoolService.submit(new Runnable() {
            @Override
            public void run() {
                try {
{code}
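Since the snippet above is cut off, here is a minimal, self-contained sketch of the same fan-out idea: submit one task per host to a thread pool, then wait for every future. It is not the actual HeartbeatMonitor change. The class `HostFanOut`, the method `processAll`, and the use of a plain `ExecutorService` in place of `ThreadPoolService` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper, not part of Ambari.
public class HostFanOut {

    /**
     * Runs perHostWork once per host on the given pool and blocks until
     * every task has finished. Returns the number of tasks that threw.
     */
    public static <H> int processAll(List<H> hosts, ExecutorService pool, Consumer<H> perHostWork) {
        List<Future<?>> futures = new ArrayList<>();
        for (final H host : hosts) {
            futures.add(pool.submit(() -> perHostWork.accept(host)));
        }
        int failures = 0;
        for (Future<?> future : futures) {
            try {
                future.get(); // wait for this host's task to complete
            } catch (ExecutionException e) {
                failures++; // the task itself threw an exception
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for host tasks", e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            AtomicInteger processed = new AtomicInteger();
            int failures = processAll(Arrays.asList("host-1", "host-2", "host-3"), pool,
                    host -> processed.incrementAndGet());
            System.out.println(processed.get() + " hosts processed, " + failures + " failed");
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on each future keeps one pass over the hosts from overlapping with the next, at the cost of the pass being as slow as its slowest task.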
 

After A/B testing, we found that the same sync now takes about 10 minutes. However, the jstack output shows that the method "generateStatusCommands()" triggers a read lock, like this:
{code:java}
"pool-10-thread-29" #83 prio=5 os_prio=0 tid=0x00007fc1d8036000 nid=0x29937 waiting for monitor entry [0x00007fc2186d4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.eclipse.persistence.internal.helper.ConcurrencyManager.acquireReadLockNoWait(ConcurrencyManager.java:248)
    - waiting to lock <0x00007fc626b718c8> (a org.eclipse.persistence.internal.identitymaps.HardCacheWeakIdentityMap$ReferenceCacheKey)
    at org.eclipse.persistence.internal.identitymaps.CacheKey.acquireReadLockNoWait(CacheKey.java:287)
    at org.eclipse.persistence.internal.helper.WriteLockManager.acquireLockAndRelatedLocks(WriteLockManager.java:129)
    at org.eclipse.persistence.internal.helper.WriteLockManager.checkAndLockObject(WriteLockManager.java:450)
    at org.eclipse.persistence.internal.helper.WriteLockManager.traverseRelatedLocks(WriteLockManager.java:213)
    at org.eclipse.persistence.internal.helper.WriteLockManager.acquireLockAndRelatedLocks(WriteLockManager.java:144)
    at org.eclipse.persistence.internal.helper.WriteLockManager.checkAndLockObject(WriteLockManager.java:450)
    at org.eclipse.persistence.internal.helper.WriteLockManager.traverseRelatedLocks(WriteLockManager.java:213)
    at org.eclipse.persistence.internal.helper.WriteLockManager.acquireLockAndRelatedLocks(WriteLockManager.java:144)
    at org.eclipse.persistence.internal.helper.WriteLockManager.acquireLocksForClone(WriteLockManager.java:75)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:997)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:955)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:209)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:137)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3940)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3894)
    at org.eclipse.persistence.mappings.CollectionMapping.buildElementUnitOfWorkClone(CollectionMapping.java:308)
    at org.eclipse.persistence.mappings.CollectionMapping.buildElementClone(CollectionMapping.java:321)
    at org.eclipse.persistence.internal.queries.ContainerPolicy.addNextValueFromIteratorInto(ContainerPolicy.java:215)
    at org.eclipse.persistence.mappings.CollectionMapping.buildCloneForPartObject(CollectionMapping.java:223)
    at org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder.buildCloneFor(UnitOfWorkQueryValueHolder.java:60)
    at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiateImpl(UnitOfWorkValueHolder.java:173)
    at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiate(UnitOfWorkValueHolder.java:234)
    at org.eclipse.persistence.internal.indirection.DatabaseValueHolder.getValue(DatabaseValueHolder.java:89)
    - locked <0x00007fc4a1541fa8> (a org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder)
    at org.eclipse.persistence.indirection.IndirectList.buildDelegate(IndirectList.java:271)
    at org.eclipse.persistence.indirection.IndirectList.getDelegate(IndirectList.java:455)
    - locked <0x00007fc4a1542018> (a org.eclipse.persistence.internal.indirection.jdk8.IndirectList)
    at org.eclipse.persistence.internal.indirection.jdk8.IndirectList.access$000(IndirectList.java:32)
    at org.eclipse.persistence.internal.indirection.jdk8.IndirectList$1.<init>(IndirectList.java:57)
    at org.eclipse.persistence.internal.indirection.jdk8.IndirectList.listIterator(IndirectList.java:56)
    at org.eclipse.persistence.indirection.IndirectList.iterator(IndirectList.java:555)
    at org.apache.ambari.server.state.cluster.ClusterImpl.getDesiredConfigs(ClusterImpl.java:1468)
    at org.apache.ambari.server.state.cluster.ClusterImpl.getDesiredConfigs(ClusterImpl.java:1443)
    at org.apache.ambari.server.state.ConfigHelper.getEffectiveDesiredTags(ConfigHelper.java:178)
    at org.apache.ambari.server.state.ConfigHelper.getEffectiveDesiredTags(ConfigHelper.java:149)
    at org.apache.ambari.server.state.ConfigHelper.getEffectiveDesiredTags(ConfigHelper.java:129)
    at org.apache.ambari.server.agent.HeartbeatMonitor.createStatusCommand(HeartbeatMonitor.java:318)
    at org.apache.ambari.server.agent.HeartbeatMonitor.generateStatusCommands(HeartbeatMonitor.java:279)
    at org.apache.ambari.server.agent.HeartbeatMonitor$1.run(HeartbeatMonitor.java:227)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
Should we remove the read lock call "clusterGlobalLock.readLock().lock();"? Or is one of the entities locked in the loop "for (ClusterConfigEntity configEntity : entities) {"?
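One thing that may help narrow the question: the trace reports the thread as BLOCKED on an object monitor (a ReferenceCacheKey inside EclipseLink), reached while ClusterImpl.getDesiredConfigs iterates an IndirectList. A java.util.concurrent read lock, by contrast, can be held by many threads at once. The sketch below only illustrates that property. It assumes clusterGlobalLock behaves like a standard ReentrantReadWriteLock, which the report does not confirm, and the class `ReadLockSharing` and the method `readersOverlap` are hypothetical names, not Ambari code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of Ambari.
public class ReadLockSharing {

    /**
     * Returns true if `readers` threads were all holding the read lock
     * at the same moment, i.e. readers did not serialize each other.
     */
    public static boolean readersOverlap(int readers) {
        final ReadWriteLock lock = new ReentrantReadWriteLock();
        final CountDownLatch allInside = new CountDownLatch(readers);
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        try {
            for (int i = 0; i < readers; i++) {
                pool.submit(() -> {
                    lock.readLock().lock();
                    try {
                        allInside.countDown();
                        // Keep holding the read lock until every reader is inside.
                        allInside.await(5, TimeUnit.SECONDS);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        lock.readLock().unlock();
                    }
                });
            }
            // Succeeds only if all readers hold the lock concurrently.
            return allInside.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("readers overlap: " + readersOverlap(4));
    }
}
```

If the readers overlap as expected, the shared read lock alone would not explain pool threads queuing behind each other. That would point at the per-object monitor in the trace as the place to look, though the sketch says nothing about whether removing the read lock is safe.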

Thanks!
