You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@cloudstack.apache.org by Simon Godard <sg...@cloudops.com> on 2016/04/13 15:18:25 UTC

Very long time to destroy VM

Hi,

I am trying to understand why a destroyVirtualMachine API call would take
around 1 hour to get a successful async job result. From CloudStack log, I
can see that the StopVmCmd occurred right away, but the DestroyVmCmd took 1
hour to complete.

Do you know what could cause such delays?

The only suspicious log entries before the 1 hour delay are:

2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
(API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying vm
VM[User|i-2278-11598-VM]
2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
(API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable to
find lock for the key vm_instance11598 and thread id 1817738544

Then 1 hour later:
2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
(API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
original host id: 111 new host id: null host id before state transition:
null

Thanks!
Simon Godard

Re: Very long time to destroy VM

Posted by ilya <il...@gmail.com>.
Just curious - is ACPId installed on VM?


On 4/13/16 6:18 AM, Simon Godard wrote:
> Hi,
> 
> I am trying to understand why a destroyVirtualMachine API call would take
> around 1 hour to get a successful async job result. From CloudStack log, I
> can see that the StopVmCmd occurred right away, but the DestroyVmCmd took 1
> hour to complete.
> 
> Do you know what could cause such delays?
> 
> The only suspicious log entries before the 1 hour delay are:
> 
> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying vm
> VM[User|i-2278-11598-VM]
> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable to
> find lock for the key vm_instance11598 and thread id 1817738544
> 
> Then 1 hour later:
> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
> original host id: 111 new host id: null host id before state transition:
> null
> 
> Thanks!
> Simon Godard
> 

Re: Very long time to destroy VM

Posted by Rafael Weingärtner <ra...@gmail.com>.
Well, here goes one possible explanation. If I had to bet, I would bet on
this one, and not on some chunk of code that might be synchronized.

When you use the destroy command, first the ACS stops the VM. The stop
process is the one that can be slow. The OS of the VM might have taken time
long time to stop process nicely. If you use the Stop force, or the destroy
command (with expunge=true), I do not remember the whole code; but, ACS use
the “force” flag on the stop command. That tells for the hypervisor simply
to remove the VMM and do not care about the consequences.

Sto, when you use a normal stop/destroy command, you rely on the shutdown
process of the OS and that can be slow sometimes, giving that it will try
to stop all process nicely.

On Wed, Apr 13, 2016 at 11:56 AM, Simon Godard <sg...@cloudops.com> wrote:

> The stop operation seems to be as quick as usual. Again, we don’t have
> slow destroy on all VMs. It occurred twice in a short time frame but we
> didn’t experience it since then. I just want to understand the root cause
> to see if the management server performance was at fault or if it’s a
> concurrency bug in the code.
>
> --
> Simon
>
> > On Apr 13, 2016, at 10:18, Rafael Weingärtner <
> rafaelweingartner@gmail.com> wrote:
> >
> > If you just use the stop option? Is it taking a long time too?
> >
> > On Wed, Apr 13, 2016 at 10:37 AM, Simon Godard <sg...@cloudops.com>
> wrote:
> >
> >> We are using XenServer 6.2.
> >>
> >> Most VM destroy (expunge=true) are fairly quick. Is there anything else
> I
> >> could be looking for? At the time of the slow destroy, there weren’t a
> very
> >> high number of async jobs ongoing. I suspect it could be related to a DB
> >> concurrency issue, looking at this log I just found:
> >>
> >> 2016-04-11 15:24:30,777 ERROR
> >> [org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl]
> >> (Work-Job-Executor-17:ctx-4492ef70 job-1480365/job-1480367) Unexpected
> >> exception:
> >> DB Exception on: com.mysql.jdbc.PreparedStatement@3f603c42: UPDATE
> >> sync_queue SET sync_queue.queue_size=0,
> sync_queue.last_updated='2016-04-11
> >> 19:24:30' WHERE sync_queue.id = 905339
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:851)
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:804)
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:1313)
> >>    at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
> >>    at
> >>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>    at java.lang.reflect.Method.invoke(Method.java:606)
> >>    at
> >>
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> >>    at
> >>
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> >>    at
> >>
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> >>    at
> >>
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> >>    at
> >>
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> >>    at com.sun.proxy.$Proxy164.update(Unknown Source)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl$4.doInTransactionWithoutResult(SyncQueueManagerImpl.java:197)
> >>    at
> >>
> com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> >>    at
> >> com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
> >>    at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
> >>    at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl.purgeItem(SyncQueueManagerImpl.java:182)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:524)
> >>    at
> >>
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> >>    at
> >>
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> >>    at
> >>
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> >>    at
> >>
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> >>    at
> >>
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> >>    at
> >>
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:459)
> >>    at
> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >>    at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>    at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>    at java.lang.Thread.run(Thread.java:745)
> >> Caused by:
> >> com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException:
> Deadlock
> >> found when trying to get lock; try restarting transaction
> >>    at sun.reflect.GeneratedConstructorAccessor825.newInstance(Unknown
> >> Source)
> >>    at
> >>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >>    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> >>    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
> >>    at com.mysql.jdbc.Util.getInstance(Util.java:386)
> >>    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1065)
> >>    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
> >>    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
> >>    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
> >>    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
> >>    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
> >>    at
> >>
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
> >>    at
> >>
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
> >>    at
> >>
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
> >>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:843)
> >>    ... 32 more
> >> --
> >> Simon
> >>
> >>> On Apr 13, 2016, at 09:20, Rafael Weingärtner <
> >> rafaelweingartner@gmail.com> wrote:
> >>>
> >>> What hypervisor are you using?
> >>>
> >>> Every single VM in your environment is presenting this behavior?
> >>>
> >>>
> >>>
> >>> On Wed, Apr 13, 2016 at 10:18 AM, Simon Godard <sg...@cloudops.com>
> >> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I am trying to understand why a destroyVirtualMachine API call would
> >> take
> >>>> around 1 hour to get a successful async job result. From CloudStack
> >> log, I
> >>>> can see that the StopVmCmd occurred right away, but the DestroyVmCmd
> >> took 1
> >>>> hour to complete.
> >>>>
> >>>> Do you know what could cause such delays?
> >>>>
> >>>> The only suspicious log entries before the 1 hour delay are:
> >>>>
> >>>> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
> >>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying
> >> vm
> >>>> VM[User|i-2278-11598-VM]
> >>>> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
> >>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable
> >> to
> >>>> find lock for the key vm_instance11598 and thread id 1817738544
> >>>>
> >>>> Then 1 hour later:
> >>>> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
> >>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
> >>>> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
> >>>> original host id: 111 new host id: null host id before state
> transition:
> >>>> null
> >>>>
> >>>> Thanks!
> >>>> Simon Godard
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Rafael Weingärtner
> >>
> >>
> >
> >
> > --
> > Rafael Weingärtner
>
>


-- 
Rafael Weingärtner

Re: Very long time to destroy VM

Posted by Simon Godard <sg...@cloudops.com>.
The stop operation seems to be as quick as usual. Again, we don’t have slow destroy on all VMs. It occurred twice in a short time frame but we didn’t experience it since then. I just want to understand the root cause to see if the management server performance was at fault or if it’s a concurrency bug in the code.

-- 
Simon

> On Apr 13, 2016, at 10:18, Rafael Weingärtner <ra...@gmail.com> wrote:
> 
> If you just use the stop option? Is it taking a long time too?
> 
> On Wed, Apr 13, 2016 at 10:37 AM, Simon Godard <sg...@cloudops.com> wrote:
> 
>> We are using XenServer 6.2.
>> 
>> Most VM destroy (expunge=true) are fairly quick. Is there anything else I
>> could be looking for? At the time of the slow destroy, there weren’t a very
>> high number of async jobs ongoing. I suspect it could be related to a DB
>> concurrency issue, looking at this log I just found:
>> 
>> 2016-04-11 15:24:30,777 ERROR
>> [org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl]
>> (Work-Job-Executor-17:ctx-4492ef70 job-1480365/job-1480367) Unexpected
>> exception:
>> DB Exception on: com.mysql.jdbc.PreparedStatement@3f603c42: UPDATE
>> sync_queue SET sync_queue.queue_size=0, sync_queue.last_updated='2016-04-11
>> 19:24:30' WHERE sync_queue.id = 905339
>>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:851)
>>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:804)
>>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:1313)
>>    at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
>>    at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>    at java.lang.reflect.Method.invoke(Method.java:606)
>>    at
>> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>>    at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>>    at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>>    at
>> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>>    at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>>    at
>> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>>    at
>> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>>    at
>> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>>    at com.sun.proxy.$Proxy164.update(Unknown Source)
>>    at
>> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl$4.doInTransactionWithoutResult(SyncQueueManagerImpl.java:197)
>>    at
>> com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
>>    at
>> com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
>>    at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
>>    at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
>>    at
>> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl.purgeItem(SyncQueueManagerImpl.java:182)
>>    at
>> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:524)
>>    at
>> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>    at
>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>    at
>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>    at
>> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>    at
>> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>    at
>> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:459)
>>    at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>    at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>    at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>    at java.lang.Thread.run(Thread.java:745)
>> Caused by:
>> com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock
>> found when trying to get lock; try restarting transaction
>>    at sun.reflect.GeneratedConstructorAccessor825.newInstance(Unknown
>> Source)
>>    at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
>>    at com.mysql.jdbc.Util.getInstance(Util.java:386)
>>    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1065)
>>    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
>>    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
>>    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
>>    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
>>    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
>>    at
>> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
>>    at
>> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
>>    at
>> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
>>    at
>> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
>>    at
>> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
>>    at
>> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
>>    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:843)
>>    ... 32 more
>> --
>> Simon
>> 
>>> On Apr 13, 2016, at 09:20, Rafael Weingärtner <
>> rafaelweingartner@gmail.com> wrote:
>>> 
>>> What hypervisor are you using?
>>> 
>>> Every single VM in your environment is presenting this behavior?
>>> 
>>> 
>>> 
>>> On Wed, Apr 13, 2016 at 10:18 AM, Simon Godard <sg...@cloudops.com>
>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I am trying to understand why a destroyVirtualMachine API call would
>> take
>>>> around 1 hour to get a successful async job result. From CloudStack
>> log, I
>>>> can see that the StopVmCmd occurred right away, but the DestroyVmCmd
>> took 1
>>>> hour to complete.
>>>> 
>>>> Do you know what could cause such delays?
>>>> 
>>>> The only suspicious log entries before the 1 hour delay are:
>>>> 
>>>> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
>>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying
>> vm
>>>> VM[User|i-2278-11598-VM]
>>>> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
>>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable
>> to
>>>> find lock for the key vm_instance11598 and thread id 1817738544
>>>> 
>>>> Then 1 hour later:
>>>> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
>>>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
>>>> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
>>>> original host id: 111 new host id: null host id before state transition:
>>>> null
>>>> 
>>>> Thanks!
>>>> Simon Godard
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Rafael Weingärtner
>> 
>> 
> 
> 
> -- 
> Rafael Weingärtner


Re: Very long time to destroy VM

Posted by Rafael Weingärtner <ra...@gmail.com>.
If you just use the stop option? Is it taking a long time too?

On Wed, Apr 13, 2016 at 10:37 AM, Simon Godard <sg...@cloudops.com> wrote:

> We are using XenServer 6.2.
>
> Most VM destroy (expunge=true) are fairly quick. Is there anything else I
> could be looking for? At the time of the slow destroy, there weren’t a very
> high number of async jobs ongoing. I suspect it could be related to a DB
> concurrency issue, looking at this log I just found:
>
> 2016-04-11 15:24:30,777 ERROR
> [org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl]
> (Work-Job-Executor-17:ctx-4492ef70 job-1480365/job-1480367) Unexpected
> exception:
> DB Exception on: com.mysql.jdbc.PreparedStatement@3f603c42: UPDATE
> sync_queue SET sync_queue.queue_size=0, sync_queue.last_updated='2016-04-11
> 19:24:30' WHERE sync_queue.id = 905339
>     at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:851)
>     at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:804)
>     at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:1313)
>     at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>     at
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>     at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>     at
> com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>     at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>     at
> org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>     at
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>     at
> org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>     at com.sun.proxy.$Proxy164.update(Unknown Source)
>     at
> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl$4.doInTransactionWithoutResult(SyncQueueManagerImpl.java:197)
>     at
> com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
>     at
> com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
>     at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
>     at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
>     at
> org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl.purgeItem(SyncQueueManagerImpl.java:182)
>     at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:524)
>     at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>     at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>     at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>     at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>     at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>     at
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:459)
>     at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by:
> com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock
> found when trying to get lock; try restarting transaction
>     at sun.reflect.GeneratedConstructorAccessor825.newInstance(Unknown
> Source)
>     at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
>     at com.mysql.jdbc.Util.getInstance(Util.java:386)
>     at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1065)
>     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
>     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
>     at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
>     at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
>     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
>     at
> com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
>     at
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
>     at
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
>     at
> com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
>     at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
>     at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
>     at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:843)
>     ... 32 more
> --
> Simon
>
> > On Apr 13, 2016, at 09:20, Rafael Weingärtner <
> rafaelweingartner@gmail.com> wrote:
> >
> > What hypervisor are you using?
> >
> > Every single VM in your environment is presenting this behavior?
> >
> >
> >
> > On Wed, Apr 13, 2016 at 10:18 AM, Simon Godard <sg...@cloudops.com>
> wrote:
> >
> >> Hi,
> >>
> >> I am trying to understand why a destroyVirtualMachine API call would
> take
> >> around 1 hour to get a successful async job result. From CloudStack
> log, I
> >> can see that the StopVmCmd occurred right away, but the DestroyVmCmd
> took 1
> >> hour to complete.
> >>
> >> Do you know what could cause such delays?
> >>
> >> The only suspicious log entries before the 1 hour delay are:
> >>
> >> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
> >> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying
> vm
> >> VM[User|i-2278-11598-VM]
> >> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
> >> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable
> to
> >> find lock for the key vm_instance11598 and thread id 1817738544
> >>
> >> Then 1 hour later:
> >> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
> >> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
> >> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
> >> original host id: 111 new host id: null host id before state transition:
> >> null
> >>
> >> Thanks!
> >> Simon Godard
> >>
> >
> >
> >
> > --
> > Rafael Weingärtner
>
>


-- 
Rafael Weingärtner

Re: Very long time to destroy VM

Posted by Simon Godard <sg...@cloudops.com>.
We are using XenServer 6.2.

Most VM destroy (expunge=true) are fairly quick. Is there anything else I could be looking for? At the time of the slow destroy, there weren’t a very high number of async jobs ongoing. I suspect it could be related to a DB concurrency issue, looking at this log I just found:

2016-04-11 15:24:30,777 ERROR [org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl] (Work-Job-Executor-17:ctx-4492ef70 job-1480365/job-1480367) Unexpected exception: 
DB Exception on: com.mysql.jdbc.PreparedStatement@3f603c42: UPDATE sync_queue SET sync_queue.queue_size=0, sync_queue.last_updated='2016-04-11 19:24:30' WHERE sync_queue.id = 905339 
    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:851)
    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:804)
    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:1313)
    at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
    at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    at com.sun.proxy.$Proxy164.update(Unknown Source)
    at org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl$4.doInTransactionWithoutResult(SyncQueueManagerImpl.java:197)
    at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
    at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
    at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
    at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
    at org.apache.cloudstack.framework.jobs.impl.SyncQueueManagerImpl.purgeItem(SyncQueueManagerImpl.java:182)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:524)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:459)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
    at sun.reflect.GeneratedConstructorAccessor825.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.Util.getInstance(Util.java:386)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1065)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
    at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
    at com.cloud.utils.db.GenericDaoBase.update(GenericDaoBase.java:843)
    ... 32 more
-- 
Simon

> On Apr 13, 2016, at 09:20, Rafael Weingärtner <ra...@gmail.com> wrote:
> 
> What hypervisor are you using?
> 
> Every single VM in your environment is presenting this behavior?
> 
> 
> 
> On Wed, Apr 13, 2016 at 10:18 AM, Simon Godard <sg...@cloudops.com> wrote:
> 
>> Hi,
>> 
>> I am trying to understand why a destroyVirtualMachine API call would take
>> around 1 hour to get a successful async job result. From CloudStack log, I
>> can see that the StopVmCmd occurred right away, but the DestroyVmCmd took 1
>> hour to complete.
>> 
>> Do you know what could cause such delays?
>> 
>> The only suspicious log entries before the 1 hour delay are:
>> 
>> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying vm
>> VM[User|i-2278-11598-VM]
>> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable to
>> find lock for the key vm_instance11598 and thread id 1817738544
>> 
>> Then 1 hour later:
>> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
>> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
>> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
>> original host id: 111 new host id: null host id before state transition:
>> null
>> 
>> Thanks!
>> Simon Godard
>> 
> 
> 
> 
> -- 
> Rafael Weingärtner


Re: Very long time to destroy VM

Posted by Rafael Weingärtner <ra...@gmail.com>.
What hypervisor are you using?

Every single VM in your environment is presenting this behavior?



On Wed, Apr 13, 2016 at 10:18 AM, Simon Godard <sg...@cloudops.com> wrote:

> Hi,
>
> I am trying to understand why a destroyVirtualMachine API call would take
> around 1 hour to get a successful async job result. From CloudStack log, I
> can see that the StopVmCmd occurred right away, but the DestroyVmCmd took 1
> hour to complete.
>
> Do you know what could cause such delays?
>
> The only suspicious log entries before the 1 hour delay are:
>
> 2016-04-11 15:23:32,896 DEBUG [com.cloud.vm.VirtualMachineManagerImpl]
> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Destroying vm
> VM[User|i-2278-11598-VM]
> 2016-04-11 15:23:32,920 WARN [com.cloud.utils.db.Merovingian2]
> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) Was unable to
> find lock for the key vm_instance11598 and thread id 1817738544
>
> Then 1 hour later:
> 2016-04-11 16:23:36,839 DEBUG [com.cloud.capacity.CapacityManagerImpl]
> (API-Job-Executor-11:ctx-2a68671a job-1480365 ctx-a0c26dd1) VM state
> transitted from :Stopped to Destroyed with event: DestroyRequestedvm's
> original host id: 111 new host id: null host id before state transition:
> null
>
> Thanks!
> Simon Godard
>



-- 
Rafael Weingärtner