You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@cloudstack.apache.org by "Hiroki Ohashi (JIRA)" <ji...@apache.org> on 2014/09/12 13:13:34 UTC

[jira] [Created] (CLOUDSTACK-7539) [S3] Parallel deployment makes reference count of a cache in nfs secondary staging store negative(-1)

Hiroki Ohashi created CLOUDSTACK-7539:
-----------------------------------------

             Summary: [S3] Parallel deployment makes reference count of a cache in nfs secondary staging store negative(-1)
                 Key: CLOUDSTACK-7539
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7539
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Storage Controller, Template
    Affects Versions: Future, 4.3.0, 4.4.0, 4.5.0
         Environment:     OS:                  CentOS 6.5
    CloudStack Version:  4.3.0
    Hypervisor:          KVM
    Primary Storage:     Local Storage
    Secondary Storage:   S3(RADOS)
    Zone:                Advanced
            Reporter: Hiroki Ohashi
            Priority: Critical


When we create some instances in parallel in an environment shown above, an exception like ([#2]) happens. After that, reference count of a cache in NFS Secondary Staging Store(SSS) becomes negative([#3]).

In this situcation, we can't use a template used for creating the instances because negative reference count will cause another
exception([#4]). Furthermore, template caches in NFS SSS aren't cleaned up.

I think this is related to CLOUDSTACK-6236. I'm using CloudStack 4.3.0 branch. The code was applied a patch of CLOUDSTACK-6236 to and also fixed about volume reference count setter problem by myself. But the reference count problem still happens.

Conditions to trigger
  - A cache of a template isn't on NFS SSS.
  - Choose compute offering that occupies all resources in a host.(ex. Use all cores and almost all memory)
  - Create multiple instances ([#1]).

Other behaviors
  - Management server inserts multiple entries for template cache(ImageCache) to cloud.template_store_ref.
  - SSVM probably downloads same template from S3 and creates multiple caches to NFS SSS concurrently. Sometimes, SSVM retries download  and cache creation.

(1)
{anchor:1}
{noformat}
#! /bin/bash

COMPUTE_OFFERING="b6fc0598-6903-48cc-b894-ead3bb0e39f5"
TEMPLATE="ef1d5a8e-5951-4036-a236-fe2d47224fe4"
ZONE="23156857-4722-4fc4-86d4-a12ca1028197"
NETWORK="4ba56179-f7b9-4560-a00e-80c946856ff8"
KEYPAIR=mykey

NAME1="test-template-error-0003"
NAME2="test-template-error-0004"

cloudmonkey deploy virtualmachine serviceofferingid=${COMPUTE_OFFERING} templateid=${TEMPLATE} zoneid=${ZONE} networkids=${NETWORK} displayname=${NAME1} name=${NAME1} keypair=${KEYPAIR}
sleep 1
cloudmonkey deploy virtualmachine serviceofferingid=${COMPUTE_OFFERING} templateid=${TEMPLATE} zoneid=${ZONE} networkids=${NETWORK} displayname=${NAME2} name=${NAME2} keypair=${KEYPAIR}

{noformat}

(2)
{anchor:2}
{noformat}
2014-09-10 17:22:51,249 DEBUG [c.c.a.t.Request] (AgentManager-Handler-5:null) Seq 171-1789133096: Processing:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/3/330/22e69a40-0916-4d11-a07a-63d38e76887f.qcow2","id":0,"accountId":0,"hvm":false,"name":"22e69a40-0916-4d11-a07a-63d38e76887f.qcow2","size":14340259840,"physicalSize":14340259840}},"result":true,"wait":0}}] }
2014-09-10 17:22:51,249 DEBUG [c.c.a.t.Request] (Job-Executor-162:ctx-4a2eb345 ctx-fbf59f27) Seq 171-1789133096: Received:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, { CopyCmdAnswer } }
2014-09-10 17:22:51,257 DEBUG [o.a.c.s.i.s.TemplateObject] (Job-Executor-162:ctx-4a2eb345 ctx-fbf59f27) failed to update state
com.cloud.utils.fsm.NoTransitionException: Unable to transition to a new state from Allocated via OperationSuccessed
        at com.cloud.utils.fsm.StateMachine2.getNextState(StateMachine2.java:83)
        at com.cloud.utils.fsm.StateMachine2.transitTo(StateMachine2.java:100)
        at org.apache.cloudstack.storage.datastore.ObjectInDataStoreManagerImpl.update(ObjectInDataStoreManagerImpl.java:297)
        at org.apache.cloudstack.storage.image.store.TemplateObject.processEvent(TemplateObject.java:225)
        at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:240)
        at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:267)
        at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyObject(AncientDataMotionStrategy.java:165)
        at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:428)
        at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
        at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createBaseImageAsync(VolumeServiceImpl.java:473)
        at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:604)
        at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1192)
        at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1256)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:963)
        at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761)
        at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:601)
        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:228)
        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:207)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3554)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3161)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3147)
        at sun.reflect.GeneratedMethodAccessor463.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy170.startVirtualMachine(Unknown Source)
        at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:443)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
        at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
        at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)

{noformat}

(3)
{anchor:3}
{noformat}
*************************** 166. row ***************************
         id: 431
 store_role: Image
template_id: 330
    ref_cnt: 0
*************************** 167. row ***************************
         id: 432
 store_role: ImageCache
template_id: 330
    ref_cnt: 0
*************************** 168. row ***************************
         id: 433
 store_role: ImageCache
template_id: 330
    ref_cnt: 0
*************************** 169. row ***************************
         id: 434
 store_role: ImageCache
template_id: 330
    ref_cnt: 0
*************************** 170. row ***************************
         id: 435
 store_role: ImageCache
template_id: 330
    ref_cnt: 1
*************************** 171. row ***************************
         id: 436
 store_role: ImageCache
template_id: 330
    ref_cnt: 0
*************************** 172. row ***************************
         id: 437
 store_role: ImageCache
template_id: 330
    ref_cnt: 18446744073709551615
*************************** 173. row ***************************
         id: 438
 store_role: ImageCache
template_id: 330
    ref_cnt: 0

{noformat}

(4)
{anchor:4}
{noformat}
2014-09-08 17:42:48,144 DEBUG [c.c.a.t.Request] (AgentManager-Handler-6:null) Seq 171-1789133066: Processing:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/3/320/07360420-1206-4054-946d-86567b646f98.qcow2","id":0,"accountId":0,"hvm":false,"name":"07360420-1206-4054-946d-86567b646f98.qcow2","size":14340653056,"physicalSize":14340653056}},"result":true,"wait":0}}] }
2014-09-08 17:42:48,144 DEBUG [c.c.a.t.Request] (Job-Executor-107:ctx-41d757d1 ctx-5fbbe761) Seq 171-1789133066: Received:  { Ans: , MgmtId: 98532524288, via: 171, Ver: v1, Flags: 10, { CopyCmdAnswer } }
2014-09-08 17:42:48,146 DEBUG [o.a.c.s.i.s.TemplateObject] (Job-Executor-107:ctx-41d757d1 ctx-5fbbe761) failed to process event and answer
com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@307febb8: SELECT template_store_ref.id, template_store_ref.store_id, template_store_ref.template_id, template_store_ref.store_role, template_store_ref.created, template_store_ref.last_updated, template_store_ref.download_pct, template_store_ref.size, template_store_ref.physical_size, template_store_ref.download_state, template_store_ref.local_path, template_store_ref.error_str, template_store_ref.job_id, template_store_ref.install_path, template_store_ref.url, template_store_ref.is_copy, template_store_ref.destroyed, template_store_ref.update_count, template_store_ref.updated, template_store_ref.state, template_store_ref.ref_cnt FROM template_store_ref WHERE template_store_ref.store_id = 1  AND template_store_ref.template_id = 320  AND template_store_ref.destroyed = 0  ORDER BY RAND() LIMIT 1
        at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:421)
        at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:356)
        at com.cloud.utils.db.GenericDaoBase.findOneIncludingRemovedBy(GenericDaoBase.java:869)
        at org.apache.cloudstack.storage.image.db.TemplateDataStoreDaoImpl.findByStoreTemplate(TemplateDataStoreDaoImpl.java:320)
        at sun.reflect.GeneratedMethodAccessor125.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:33)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy109.findByStoreTemplate(Unknown Source)
        at org.apache.cloudstack.storage.image.store.TemplateObject.processEvent(TemplateObject.java:202)
        at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:240)
        at org.apache.cloudstack.storage.cache.manager.StorageCacheManagerImpl.createCacheObject(StorageCacheManagerImpl.java:267)
        at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyObject(AncientDataMotionStrategy.java:165)
        at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.copyAsync(AncientDataMotionStrategy.java:428)
        at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:70)
        at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createBaseImageAsync(VolumeServiceImpl.java:473)
        at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:604)
        at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1192)
        at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1256)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:963)
        at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761)
        at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:601)
        at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:228)
        at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:207)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3554)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3161)
        at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3147)
        at sun.reflect.GeneratedMethodAccessor463.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
        at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at com.sun.proxy.$Proxy170.startVirtualMachine(Unknown Source)
        at org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:443)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
        at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
        at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLDataException: '18446744073709551615' in column '21' is outside valid range for the datatype BIGINT.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1026)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:987)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:982)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:927)
        at com.mysql.jdbc.ResultSetImpl.throwRangeException(ResultSetImpl.java:7964)
        at com.mysql.jdbc.ResultSetImpl.parseLongAsDouble(ResultSetImpl.java:7248)
        at com.mysql.jdbc.ResultSetImpl.getLong(ResultSetImpl.java:2946)
        at com.mysql.jdbc.ResultSetImpl.getLong(ResultSetImpl.java:2911)
        at org.apache.commons.dbcp.DelegatingResultSet.getLong(DelegatingResultSet.java:228)
        at org.apache.commons.dbcp.DelegatingResultSet.getLong(DelegatingResultSet.java:228)
        at com.cloud.utils.db.GenericDaoBase.setField(GenericDaoBase.java:541)
        at com.cloud.utils.db.GenericDaoBase.setField(GenericDaoBase.java:1702)
        at com.cloud.utils.db.GenericDaoBase.toEntityBean(GenericDaoBase.java:1582)
        at com.cloud.utils.db.GenericDaoBase.toEntityBean(GenericDaoBase.java:1543)
        at com.cloud.utils.db.GenericDaoBase.searchIncludingRemoved(GenericDaoBase.java:417)
        ... 65 more
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)