You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@brooklyn.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/06/15 11:38:09 UTC
[jira] [Commented] (BROOKLYN-299) Stopping app takes long time:
blocked on LocalUsageManager.recordLocationEvent, waiting for ssh
connection timeout
[ https://issues.apache.org/jira/browse/BROOKLYN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331567#comment-15331567 ]
ASF GitHub Bot commented on BROOKLYN-299:
-----------------------------------------
Github user neykov commented on the issue:
https://github.com/apache/brooklyn-server/pull/199
I like the fix. Changes look good. Merging.
> Stopping app takes long time: blocked on LocalUsageManager.recordLocationEvent, waiting for ssh connection timeout
> ------------------------------------------------------------------------------------------------------------------
>
> Key: BROOKLYN-299
> URL: https://issues.apache.org/jira/browse/BROOKLYN-299
> Project: Brooklyn
> Issue Type: Bug
> Reporter: Aled Sage
>
> With brooklyn 0.10.0-snapshot master...
> It is sometimes taking a long time to stop locations (and this is also affecting starting them) - a few minutes extra.
> The reason is that it's blocked, waiting on the mutex in {{LocalUsageManager.recordLocationEvent}}, or {{LocalUsageManager.recordApplicationEvent}}!
> While holding the lock, it calls {{location.toMetadataRecord}}. For some location types, this causes an ssh call to infer the machine details. That in itself is a bad idea. However, this is made even worse because we call it when unmanaging a {{JcloudsSshMachineLocation}} - i.e. when the VM has already been terminated. This means the ssh call times out, potentially after a couple of minutes.
> The stacktrace for two such threads are shown below (collected using jstack):
> {noformat}
> "brooklyn-execmanager-hgWZmzF2-7587" daemon prio=5 tid=0x00007f94ee04b800 nid=0x14b53 waiting for monitor entry [0x0000700009d91000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.brooklyn.core.mgmt.internal.LocalUsageManager.recordApplicationEvent(LocalUsageManager.java:247)
> - waiting to lock <0x00000007c06ca2f0> (a java.lang.Object)
> at org.apache.brooklyn.core.entity.AbstractApplication.recordApplicationEvent(AbstractApplication.java:263)
> at org.apache.brooklyn.core.entity.AbstractApplication.setExpectedStateAndRecordLifecycleEvent(AbstractApplication.java:258)
> at org.apache.brooklyn.core.entity.AbstractApplication.stop(AbstractApplication.java:221)
> at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
> at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
> at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1207)
> at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1016)
> at groovy.lang.DelegatingMetaClass.invokeMethod(DelegatingMetaClass.java:149)
> at groovy.lang.MetaObjectProtocol$invokeMethod.call(Unknown Source)
> at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
> at groovy.lang.MetaObjectProtocol$invokeMethod.call(Unknown Source)
> at org.apache.brooklyn.util.groovy.GroovyJavaMethods.invokeMethodOnMetaClass(GroovyJavaMethods.java:191)
> at org.apache.brooklyn.core.mgmt.internal.AbstractManagementContext.invokeEffectorMethodLocal(AbstractManagementContext.java:309)
> at org.apache.brooklyn.core.mgmt.internal.AbstractManagementContext.invokeEffectorMethodSync(AbstractManagementContext.java:333)
> at org.apache.brooklyn.core.mgmt.internal.EffectorUtils.invokeMethodEffector(EffectorUtils.java:256)
> at org.apache.brooklyn.core.effector.MethodEffector.call(MethodEffector.java:149)
> at org.apache.brooklyn.core.entity.trait.Startable$StopEffectorBody.call(Startable.java:68)
> at org.apache.brooklyn.core.entity.trait.Startable$StopEffectorBody.call(Startable.java:60)
> at org.apache.brooklyn.core.effector.EffectorTasks$EffectorBodyTaskFactory$1.call(EffectorTasks.java:82)
> at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
> at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "brooklyn-execmanager-hgWZmzF2-7583" daemon prio=5 tid=0x00007f94ef137000 nid=0xf67f waiting on condition [0x0000700009476000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000007efa10000> (a java.util.concurrent.FutureTask)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
> at java.util.concurrent.FutureTask.get(FutureTask.java:187)
> at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
> at org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:361)
> at org.apache.brooklyn.util.core.task.BasicTask.getUnchecked(BasicTask.java:370)
> at org.apache.brooklyn.core.location.BasicMachineDetails.forSshMachineLocationLive(BasicMachineDetails.java:96)
> at org.apache.brooklyn.location.ssh.SshMachineLocation.inferMachineDetails(SshMachineLocation.java:1070)
> at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.inferMachineDetails(JcloudsSshMachineLocation.java:566)
> at org.apache.brooklyn.location.ssh.SshMachineLocation.getMachineDetails(SshMachineLocation.java:1057)
> - locked <0x00000007c1f12560> (a java.lang.Object)
> at org.apache.brooklyn.location.ssh.SshMachineLocation.getOsDetails(SshMachineLocation.java:1038)
> at org.apache.brooklyn.location.jclouds.JcloudsSshMachineLocation.toMetadataRecord(JcloudsSshMachineLocation.java:590)
> at org.apache.brooklyn.core.mgmt.internal.LocalUsageManager.recordLocationEvent(LocalUsageManager.java:312)
> - locked <0x00000007c06ca2f0> (a java.lang.Object)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.recordLocationEvent(LocalLocationManager.java:345)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.access$200(LocalLocationManager.java:53)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager$2.apply(LocalLocationManager.java:318)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager$2.apply(LocalLocationManager.java:306)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.recursively(LocalLocationManager.java:354)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.unmanage(LocalLocationManager.java:306)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.unmanage(LocalLocationManager.java:268)
> at org.apache.brooklyn.core.mgmt.internal.LocalLocationManager.unmanage(LocalLocationManager.java:264)
> at org.apache.brooklyn.core.location.AbstractLocation.removeChild(AbstractLocation.java:714)
> at org.apache.brooklyn.location.jclouds.JcloudsLocation.release(JcloudsLocation.java:2523)
> at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks.stopAnyProvisionedMachines(MachineLifecycleEffectorTasks.java:912)
> at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopAnyProvisionedMachinesTask.call(MachineLifecycleEffectorTasks.java:777)
> at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopAnyProvisionedMachinesTask.call(MachineLifecycleEffectorTasks.java:775)
> at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
> at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> For that second thread, the task that it is waiting on is shown below (i.e. waiting for a timeout when trying to connect to a VM that is already terminated):
> {noformat}
> "brooklyn-execmanager-hgWZmzF2-7542" daemon prio=5 tid=0x00007f94ef17e800 nid=0x131c7 runnable [0x0000700007925000]
> java.lang.Thread.State: RUNNABLE
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> - locked <0x00000007ecfbe900> (a java.net.SocksSocketImpl)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:579)
> at net.schmizz.sshj.SocketClient.connect(SocketClient.java:70)
> at net.schmizz.sshj.SocketClient.connect(SocketClient.java:77)
> at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjClientConnection.create(SshjClientConnection.java:188)
> at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjClientConnection.create(SshjClientConnection.java:41)
> at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjTool.acquire(SshjTool.java:630)
> at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjTool.acquire(SshjTool.java:616)
> at org.apache.brooklyn.util.core.internal.ssh.sshj.SshjTool.connect(SshjTool.java:205)
> at org.apache.brooklyn.location.ssh.SshMachineLocation.connectSsh(SshMachineLocation.java:696)
> at org.apache.brooklyn.location.ssh.SshMachineLocation$10.get(SshMachineLocation.java:401)
> at org.apache.brooklyn.location.ssh.SshMachineLocation$10.get(SshMachineLocation.java:399)
> at org.apache.brooklyn.util.pool.BasicPool.leaseObject(BasicPool.java:134)
> at org.apache.brooklyn.util.pool.BasicPool.exec(BasicPool.java:143)
> at org.apache.brooklyn.location.ssh.SshMachineLocation.execSsh(SshMachineLocation.java:611)
> at org.apache.brooklyn.location.ssh.SshMachineLocation$13.execWithTool(SshMachineLocation.java:790)
> at org.apache.brooklyn.util.core.task.system.internal.ExecWithLoggingHelpers.execWithLogging(ExecWithLoggingHelpers.java:164)
> at org.apache.brooklyn.util.core.task.system.internal.ExecWithLoggingHelpers.execScript(ExecWithLoggingHelpers.java:80)
> at org.apache.brooklyn.location.ssh.SshMachineLocation.execScript(SshMachineLocation.java:774)
> at org.apache.brooklyn.util.core.task.ssh.internal.AbstractSshExecTaskFactory$1.run(AbstractSshExecTaskFactory.java:52)
> at org.apache.brooklyn.util.core.task.system.ProcessTaskWrapper$ProcessTaskInternalJob.call(ProcessTaskWrapper.java:99)
> at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)