You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@brooklyn.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/06/15 11:38:09 UTC

[jira] [Commented] (BROOKLYN-299) Stopping app takes long time: blocked on LocalUsageManager.recordLocationEvent, waiting for ssh connection timeout

    [ https://issues.apache.org/jira/browse/BROOKLYN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331567#comment-15331567 ] 

ASF GitHub Bot commented on BROOKLYN-299:
-----------------------------------------

Github user neykov commented on the issue:

    https://github.com/apache/brooklyn-server/pull/199
  
    I like the fix. Changes look good. Merging.


> Stopping app takes long time: blocked on LocalUsageManager.recordLocationEvent, waiting for ssh connection timeout
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: BROOKLYN-299
>                 URL: https://issues.apache.org/jira/browse/BROOKLYN-299
>             Project: Brooklyn
>          Issue Type: Bug
>            Reporter: Aled Sage
>
> With brooklyn 0.10.0-snapshot master...
> It is sometimes taking a long time to stop locations (and this is also affecting starting them) - a few minutes extra.
> The reason is that it's blocked, waiting on the mutex in {{LocalUsageManager.recordLocationEvent}}, or {{LocalUsageManager.recordApplicationEvent}}!
> While holding the lock, it calls {{location.toMetadataRecord}}. For some location types, this causes an ssh call to infer the machine details. That in itself is a bad idea. However, this is made even worse because we call it when unmanaging a {{JcloudsSshMachineLocation}} - i.e. when the VM has already been terminated. This means the ssh call times out, potentially after a couple of minutes.
> The stacktrace for two such threads are shown below (collected using jstack):
> {noformat}
> "brooklyn-execmanager-hgWZmzF2-7587" daemon prio=5 tid=0x00007f94ee04b800 nid=0x14b53 waiting for monitor entry [0x0000700009d91000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.brooklyn.core.mgmt.internal.LocalUsageManager.recordApplicationEvent(LocalUsageManager.java:247)
>         - waiting to lock <0x00000007c06ca2f0> (a java.lang.Object)
>         at org.apache.brooklyn.core.entity.AbstractApplication.recordApplicationEvent(AbstractApplication.java:263)
>         at org.apache.brooklyn.core.entity.AbstractApplication.setExpectedStateAndRecordLifecycleEvent(AbstractApplication.java:258)
>         at org.apache.brooklyn.core.entity.AbstractApplication.stop(AbstractApplication.java:221)
>         at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
>         at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
>         at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1207)
>         at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1016)
>         at groovy.lang.DelegatingMetaClass.invokeMethod(DelegatingMetaClass.java:149)
>         at groovy.lang.MetaObjectProtocol$invokeMethod.call(Unknown Source)
>         at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
>         at groovy.lang.MetaObjectProtocol$invokeMethod.call(Unknown Source)
>         at org.apache.brooklyn.util.groovy.GroovyJavaMethods.invokeMethodOnMetaClass(GroovyJavaMethods.java:191)
>         at org.apache.brooklyn.core.mgmt.internal.AbstractManagementContext.invokeEffectorMethodLocal(AbstractManagementContext.java:309)
>         at org.apache.brooklyn.core.mgmt.internal.AbstractManagementContext.invokeEffectorMethodSync(AbstractManagementContext.java:333)
>         at org.apache.brooklyn.core.mgmt.internal.EffectorUtils.invokeMethodEffector(EffectorUtils.java:256)
>         at org.apache.brooklyn.core.effector.MethodEffector.call(MethodEffector.java:149)
>         at org.apache.brooklyn.core.entity.trait.Startable$StopEffectorBody.call(Startable.java:68)
>         at org.apache.brooklyn.core.entity.trait.Startable$StopEffectorBody.call(Startable.java:60)
>         at org.apache.brooklyn.core.effector.EffectorTasks$EffectorBodyTaskFactory$1.call(EffectorTasks.java:82)
>         at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
>         at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> "brooklyn-execmanager-hgWZmzF2-7583" daemon prio=5 tid=0x00007f94ef137000 nid=0xf67f waiting on condition [0x0000700009476000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000007efa10000> (a java.util.concurrent.FutureTask)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:187)
>         at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
>         at org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:361)
>         at org.apache.brooklyn.util.core.task.BasicTask.getUnchecked(BasicTask.java:370)
>         at org.apache.brooklyn.core.location.BasicMachineDetails.forSshMachineLocationLive(BasicMachineDetails.java:96)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.inferMachineDetails(SshMachineLocation.java:1070)
>         at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.inferMachineDetails(JcloudsSshMachineLocation.java:566)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.getMachineDetails(SshMachineLocation.java:1057)
>         - locked <0x00000007c1f12560> (a java.lang.Object)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.getOsDetails(SshMachineLocation.java:1038)
>         at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.toMetadataRecord(JcloudsSshMachineLocation.java:590)
>         at org.apache.brooklyn.core.mgmt.internal.LocalUsageManager.recordLocationEvent(LocalUsageManager.java:312)
>         - locked <0x00000007c06ca2f0> (a java.lang.Object)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.recordLocationEvent(LocalLocationManager.java:345)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.access$200(LocalLocationManager.java:53)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager$2.apply(LocalLocationManager.java:318)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager$2.apply(LocalLocationManager.java:306)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.recursively(LocalLocationManager.java:354)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.unmanage(LocalLocationManager.java:306)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.unmanage(LocalLocationManager.java:268)
>         at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.unmanage(LocalLocationManager.java:264)
>         at org.apache.brooklyn.core.location.AbstractLocation.removeChild(AbstractLocation.java:714)
>         at org.apache.brooklyn.location.jclouds.JcloudsLocation.release(JcloudsLocation.java:2523)
>         at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks.stopAnyProvisionedMachines(MachineLifecycleEffectorTasks.java:912)
>         at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopAnyProvisionedMachinesTask.call(MachineLifecycleEffectorTasks.java:777)
>         at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopAnyProvisionedMachinesTask.call(MachineLifecycleEffectorTasks.java:775)
>         at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
>         at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> For that second thread, the task that it is waiting on is shown below (i.e. waiting for a timeout when trying to connect to a VM that is already terminated):
> {noformat}
> "brooklyn-execmanager-hgWZmzF2-7542" daemon prio=5 tid=0x00007f94ef17e800 nid=0x131c7 runnable [0x0000700007925000]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>         - locked <0x00000007ecfbe900> (a java.net.SocksSocketImpl)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:579)
>         at net.schmizz.sshj.SocketClient.connect(SocketClient.java:70)
>         at net.schmizz.sshj.SocketClient.connect(SocketClient.java:77)
>         at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjClientConnection.create(SshjClientConnection.java:188)
>         at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjClientConnection.create(SshjClientConnection.java:41)
>         at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjTool.acquire(SshjTool.java:630)
>         at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjTool.acquire(SshjTool.java:616)
>         at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjTool.connect(SshjTool.java:205)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.connectSsh(SshMachineLocation.java:696)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation$10.get(SshMachineLocation.java:401)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation$10.get(SshMachineLocation.java:399)
>         at org.apache.brooklyn.util.pool.BasicPool.leaseObject(BasicPool.java:134)
>         at org.apache.brooklyn.util.pool.BasicPool.exec(BasicPool.java:143)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.execSsh(SshMachineLocation.java:611)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation$13.execWithTool(SshMachineLocation.java:790)
>         at org.apache.brooklyn.util.core.task.system.internal.ExecWithLoggingHelpers.execWithLogging(ExecWithLoggingHelpers.java:164)
>         at org.apache.brooklyn.util.core.task.system.internal.ExecWithLoggingHelpers.execScript(ExecWithLoggingHelpers.java:80)
>         at org.apache.brooklyn.location.ssh.SshMachineLocation.execScript(SshMachineLocation.java:774)
>         at org.apache.brooklyn.util.core.task.ssh.internal.AbstractSshExecTaskFactory$1.run(AbstractSshExecTaskFactory.java:52)
>         at org.apache.brooklyn.util.core.task.system.ProcessTaskWrapper$ProcessTaskInternalJob.call(ProcessTaskWrapper.java:99)
>         at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)