Posted to issues@cloudstack.apache.org by "manasaveloori (JIRA)" <ji...@apache.org> on 2014/01/28 12:46:39 UTC

[jira] [Reopened] (CLOUDSTACK-5401) VM migration during host maintenance fails if pool.storage.capacity.disablethreshold is lowered

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

manasaveloori reopened CLOUDSTACK-5401:
---------------------------------------


Steps:

1. Created a cluster with 2 hosts (XenServer 6.0.2).
2. Created 8 VMs on host1 and 4 VMs on host2.
3. Changed pool.storage.capacity.disablethreshold to 0.1.
4. Put host1 into maintenance.

Observation:
A few VMs migrated successfully. The following messages appear in the logs while migrating the others:

2014-01-28 22:24:00,358 DEBUG [o.a.c.s.a.AbstractStoragePoolAllocator] (secstorage-1:ctx-6c023d59) Checking if storage pool is suitable, name: null ,poolId: 1
2014-01-28 22:24:00,365 DEBUG [c.c.s.StorageManagerImpl] (secstorage-1:ctx-6c023d59) Checking pool 1 for storage, totalSize: 5902284816384, usedBytes: 3683417358336, usedPct: 0.6240663527641528, disable threshold: 0.1
2014-01-28 22:24:00,365 DEBUG [c.c.s.StorageManagerImpl] (secstorage-1:ctx-6c023d59) Insufficient space on pool: 1 since its usage percentage: 0.6240663527641528 has crossed the pool.storage.capacity.disablethreshold: 0.1
2014-01-28 22:24:00,365 DEBUG [o.a.c.s.a.ClusterScopeStoragePoolAllocator] (secstorage-1:ctx-6c023d59) ClusterScopeStoragePoolAllocator returning 0 suitable storage pools
2014-01-28 22:24:00,365 DEBUG [o.a.c.s.a.ZoneWideStoragePoolAllocator] (secstorage-1:ctx-6c023d59) ZoneWideStoragePoolAllocator to find storage pool
2014-01-28 22:24:00,369 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (secstorage-1:ctx-6c023d59) No suitable pools found for volume: Vol[18|vm=18|ROOT] under cluster: 1
2014-01-28 22:24:00,369 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (secstorage-1:ctx-6c023d59) No suitable pools found
2014-01-28 22:24:00,369 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (secstorage-1:ctx-6c023d59) No suitable storagePools found under this Cluster: 1
2014-01-28 22:24:00,374 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (secstorage-1:ctx-6c023d59) Could not find suitable Deployment Destination for this VM under any clusters, returning.
2014-01-28 22:24:00,375 DEBUG [c.c.d.FirstFitPlanner] (secstorage-1:ctx-6c023d59) Searching all possible resources under this Zone: 1
2014-01-28 22:24:00,376 DEBUG [c.c.d.FirstFitPlanner] (secstorage-1:ctx-6c023d59) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
2014-01-28 22:24:00,380 DEBUG [c.c.d.FirstFitPlanner] (secstorage-1:ctx-6c023d59) Removing from the clusterId list these clusters from avoid set: [1]
2014-01-28 22:24:00,380 DEBUG [c.c.d.FirstFitPlanner] (secstorage-1:ctx-6c023d59) No clusters found after removing disabled clusters and clusters in avoid list, returning.
2014-01-28 22:24:00,398 DEBUG [c.c.c.CapacityManagerImpl] (secstorage-1:ctx-6c023d59) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state transition: null
2014-01-28 22:24:00,401 WARN  [c.c.s.s.SecondaryStorageManagerImpl] (secstorage-1:ctx-6c023d59) Exception while trying to start secondary storage vm
com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-18-VM]Scope=interface com.cloud.dc.DataCenter; id=1
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:921)
        at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:761)
        at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:745)
        at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:261)
        at com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:693)
        at com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1270)
        at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:123)
        at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
        at com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:111)
        at com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:35)
        at com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:88)
        at com.cloud.vm.SystemVmLoadScanner$1.runInContext(SystemVmLoadScanner.java:79)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2014-01-28 22:24:00,404 INFO  [c.c.s.s.SecondaryStorageManagerImpl] (secstorage-1:ctx-6c023d59) Unable to start secondary storage vm for standby capacity, secStorageVm vm Id : 18, will recycle it and start a new one




Threshold check is still done during pool allocation.

Attaching the MS (management server) logs.
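
For reference, the arithmetic behind the "Insufficient space on pool" messages can be reproduced with a minimal sketch. This is not the StorageManagerImpl code: the class and method names below are hypothetical, and the strict "less than" comparison is an assumption based only on the wording "has crossed" in the log.

```java
// Minimal sketch of the usage check implied by the log lines above.
// PoolThresholdSketch, usedFraction and isPoolEligible are hypothetical
// names for illustration; they are not CloudStack APIs.
class PoolThresholdSketch {

    // Fraction of the pool in use, e.g. 0.62 for a pool that is 62% full.
    public static double usedFraction(long usedBytes, long totalBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        return (double) usedBytes / (double) totalBytes;
    }

    // Assumption: a pool is eligible only while its usage stays below the
    // configured disable threshold.
    public static boolean isPoolEligible(long usedBytes, long totalBytes, double disableThreshold) {
        return usedFraction(usedBytes, totalBytes) < disableThreshold;
    }

    public static void main(String[] args) {
        // Values for pool 1, taken from the log above.
        long totalSize = 5902284816384L;
        long usedBytes = 3683417358336L;

        System.out.println("usedPct = " + usedFraction(usedBytes, totalSize));
        System.out.println("eligible at 0.1:  " + isPoolEligible(usedBytes, totalSize, 0.1));
        System.out.println("eligible at 0.85: " + isPoolEligible(usedBytes, totalSize, 0.85));
    }
}
```

With the threshold at 0.1 the pool is rejected, while at the default of 0.85 the same pool would pass. This is consistent with the allocator returning 0 suitable pools once the setting is lowered.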

> VM migration during host maintenance fails if pool.storage.capacity.disablethreshold is lowered
> -----------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5401
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5401
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.2.1, 4.3.0
>            Reporter: Prachi Damle
>            Assignee: Prachi Damle
>            Priority: Critical
>             Fix For: 4.3.0
>
>
> 1. Create a 2 host XS 6.0.2 cluster (H1 and H2)
> 2. Create 6 or more VMs such that they get created in H1
> 3. Lower pool.storage.capacity.disablethreshold to 0.1 (default is 0.85)
> 4. Put H1 into maintenance. Some or all guest VMs fail to migrate to H2
> 2013-11-25 15:41:12,098 DEBUG [cloud.deploy.FirstFitPlanner] (HA-Worker-3:work-28) The specified cluster is in avoid set, returning.
> 2013-11-25 15:41:12,098 DEBUG [cloud.vm.VirtualMachineManagerImpl] (HA-Worker-3:work-28) Unable to find destination for migrating the vm VM[User|z1V6]
> 2013-11-25 15:41:12,098 WARN [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-28) Insufficient capacity for migrating a VM.
> 2013-11-25 15:41:12,099 DEBUG [cloud.resource.ResourceManagerImpl] (HA-Worker-3:work-28) No next resource state for host 5 while current state is ErrorInMaintenance with event UnableToMigrate
> com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state =ErrorInMaintenance event =UnableToMigrate
> at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1178)
> at com.cloud.resource.ResourceManagerImpl.maintenanceFailed(ResourceManagerImpl.java:2313)
> at com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
> at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:858)
> 2013-11-25 15:41:12,100 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-28) Rescheduling HAWork[28-Migration-9-Running-Migrating] to try again at Mon Nov 25 15:43:14 PST 2013
> 2013-11-25 15:41:12,100 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) Checking suitable pools for volume (Id, Type): (13,ROOT)
> 2013-11-25 15:41:12,100 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) We need to allocate new storagepool for this volume
> 2013-11-25 15:41:12,102 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) Calling StoragePoolAllocators to find suitable pools
> 2013-11-25 15:41:12,103 DEBUG [storage.allocator.LocalStoragePoolAllocator] (HA-Worker-4:work-29) LocalStoragePoolAllocator trying to find storage pool to fit the vm
> 2013-11-25 15:41:12,103 DEBUG [storage.allocator.ClusterScopeStoragePoolAllocator] (HA-Worker-4:work-29) ClusterScopeStoragePoolAllocator looking for storage pool
> 2013-11-25 15:41:12,103 DEBUG [storage.allocator.ClusterScopeStoragePoolAllocator] (HA-Worker-4:work-29) Looking for pools in dc: 1 pod:1 cluster:1
> 2013-11-25 15:41:12,107 DEBUG [storage.allocator.AbstractStoragePoolAllocator] (HA-Worker-4:work-29) Checking if storage pool is suitable, name: null ,poolId: 200
> 2013-11-25 15:41:12,111 DEBUG [cloud.storage.StorageManagerImpl] (HA-Worker-4:work-29) Checking pool 200 for storage, totalSize: 11810778316800, usedBytes: 9755417411584, usedPct: 0.8259758290194649, disable threshold: 0.1
> 2013-11-25 15:41:12,111 DEBUG [cloud.storage.StorageManagerImpl] (HA-Worker-4:work-29) Insufficient space on pool: 200 since its usage percentage: 0.8259758290194649 has crossed the pool.storage.capacity.disablethreshold: 0.1
> 2013-11-25 15:41:12,111 DEBUG [storage.allocator.ClusterScopeStoragePoolAllocator] (HA-Worker-4:work-29) ClusterScopeStoragePoolAllocator returning 0 suitable storage pools
> 2013-11-25 15:41:12,111 DEBUG [storage.allocator.ZoneWideStoragePoolAllocator] (HA-Worker-4:work-29) ZoneWideStoragePoolAllocator to find storage pool
> 2013-11-25 15:41:12,113 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) No suitable pools found for volume: Vol[13|vm=12|ROOT] under cluster: 1
> 2013-11-25 15:41:12,113 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) No suitable pools found
> 2013-11-25 15:41:12,113 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) No suitable storagePools found under this Cluster: 1
> 2013-11-25 15:41:12,117 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-29) Could not find suitable Deployment Destination for this VM under any clusters, returning.
> 2013-11-25 15:32:09,784 DEBUG [cloud.vm.VirtualMachineManagerImpl] (HA-Worker-3:work-27) Unable to find destination for migrating the vm VM[User|z1V5]
> 2013-11-25 15:32:09,784 WARN [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-27) Insufficient capacity for migrating a VM.
> 2013-11-25 15:32:09,784 DEBUG [cloud.deploy.FirstFitPlanner] (HA-Worker-1:work-29) The specified cluster is in avoid set, returning.
> 2013-11-25 15:32:09,784 DEBUG [cloud.vm.VirtualMachineManagerImpl] (HA-Worker-1:work-29) Unable to find destination for migrating the vm VM[SecondaryStorageVm|s-12-VM]
> 2013-11-25 15:32:09,784 WARN [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-1:work-29) Insufficient capacity for migrating a VM.
> 2013-11-25 15:32:09,786 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (HA-Worker-4:work-28) Could not find suitable Deployment Destination for this VM under any clusters, returning.
> 2013-11-25 15:32:09,786 DEBUG [cloud.deploy.FirstFitPlanner] (HA-Worker-4:work-28) Searching resources only under specified Cluster: 1
> 2013-11-25 15:32:09,788 DEBUG [cloud.resource.ResourceManagerImpl] (HA-Worker-3:work-27) No next resource state for host 5 while current state is ErrorInMaintenance with event UnableToMigrate
> com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state =ErrorInMaintenance event =UnableToMigrate
> at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1178)
> at com.cloud.resource.ResourceManagerImpl.maintenanceFailed(ResourceManagerImpl.java:2313)
> at com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
> at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:858)
> 2013-11-25 15:32:09,789 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-27) Rescheduling HAWork[27-Migration-8-Running-Migrating] to try again at Mon Nov 25 15:34:11 PST 2013
> 2013-11-25 15:32:09,790 DEBUG [cloud.deploy.FirstFitPlanner] (HA-Worker-4:work-28) The specified cluster is in avoid set, returning.
> 2013-11-25 15:32:09,790 DEBUG [cloud.vm.VirtualMachineManagerImpl] (HA-Worker-4:work-28) Unable to find destination for migrating the vm VM[User|z1V6]
> 2013-11-25 15:32:09,790 WARN [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-28) Insufficient capacity for migrating a VM.
> 2013-11-25 15:32:09,790 DEBUG [cloud.resource.ResourceManagerImpl] (HA-Worker-1:work-29) No next resource state for host 5 while current state is ErrorInMaintenance with event UnableToMigrate
> com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state =ErrorInMaintenance event =UnableToMigrate
> at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1178)
> at com.cloud.resource.ResourceManagerImpl.maintenanceFailed(ResourceManagerImpl.java:2313)
> at com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
> at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:858)
> 2013-11-25 15:32:09,790 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-1:work-29) Rescheduling HAWork[29-Migration-12-Running-Migrating] to try again at Mon Nov 25 15:34:11 PST 2013
> 2013-11-25 15:32:09,793 DEBUG [cloud.resource.ResourceManagerImpl] (HA-Worker-4:work-28) No next resource state for host 5 while current state is ErrorInMaintenance with event UnableToMigrate
> com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state =ErrorInMaintenance event =UnableToMigrate
> at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1178)
> at com.cloud.resource.ResourceManagerImpl.maintenanceFailed(ResourceManagerImpl.java:2313)
> at com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
> at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:858)
> 2013-11-25 15:32:09,793 INFO [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-4:work-28) Rescheduling HAWork[28-Migration-9-Running-Migrating] to try again at Mon Nov 25 15:34:11 PST 2013
> 2013-11-25 15:32:15,047 DEBUG [cloud.server.StatsCollector] (StatsCollector-2:null) HostStatsCollector is running...
> 2013-11-25 15:32:15,059 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-85:null) Seq 1-2115240044: Executing request
> 2013-11-25 15:32:15,162 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-6:null) SeqA 3-1080: Processing Seq 3-1080: { Cmd , MgmtId: -1, via: 3, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand
> ":{"_proxyVmId":2,"_loadInfo":"
> {\n \"connections\": []\n} 
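
The NoTransitionException in the quoted trace can be illustrated with a small table-driven state machine. This is only a sketch of the general technique, not the ResourceManagerImpl implementation: the class below is hypothetical, the PrepareForMaintenance state and the single transition in the table are assumptions, and only the ErrorInMaintenance state and the UnableToMigrate event are taken from the log. One plausible reading is that the first failed migration moves the host to ErrorInMaintenance, and later workers reporting the same event then find no transition defined.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of a table-driven resource state machine.
class ResourceStateSketch {

    enum State { PrepareForMaintenance, ErrorInMaintenance }

    enum Event { UnableToMigrate }

    // Local stand-in for the exception named in the log; unchecked here
    // only to keep the sketch short.
    static class NoTransitionException extends RuntimeException {
        NoTransitionException(State state, Event event) {
            super("No next state for current state=" + state + " event=" + event);
        }
    }

    private static final Map<State, Map<Event, State>> TABLE = new EnumMap<>(State.class);

    static {
        // Assumed transition: a failed migration during maintenance
        // preparation puts the host into ErrorInMaintenance.
        add(State.PrepareForMaintenance, Event.UnableToMigrate, State.ErrorInMaintenance);
        // Deliberately no entry for (ErrorInMaintenance, UnableToMigrate).
    }

    private static void add(State from, Event event, State to) {
        TABLE.computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class)).put(event, to);
    }

    // Returns the next state, or throws if the table has no matching entry.
    public static State next(State current, Event event) {
        Map<Event, State> row = TABLE.get(current);
        State to = (row == null) ? null : row.get(event);
        if (to == null) {
            throw new NoTransitionException(current, event);
        }
        return to;
    }

    public static void main(String[] args) {
        State s = next(State.PrepareForMaintenance, Event.UnableToMigrate);
        System.out.println("after first failure: " + s);
        try {
            next(s, Event.UnableToMigrate);
        } catch (NoTransitionException e) {
            System.out.println("second failure: " + e.getMessage());
        }
    }
}
```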



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)