You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@cloudstack.apache.org by "Chip Childers (JIRA)" <ji...@apache.org> on 2013/05/22 15:43:20 UTC
[jira] [Reopened] (CLOUDSTACK-2568) ACS41 regression in storage subsystem (seen with local storage and 2 or more hosts)

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chip Childers reopened CLOUDSTACK-2568:
---------------------------------------


reopening due to cherry-pick conflicts.  let's only resolve when the code is in the correct branch.  Thanks Prachi!
                
> ACS41 regression in storage subsystem (seen with local storage and 2 or more hosts)
> -----------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-2568
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-2568
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.1.0
>         Environment: RHES64 as in OEL64. Install from RPM built from latest GIT on OEL64.
> 2 or more KVM hypervisors with local storage in one cluster that has one primary NFS storage pool.
>            Reporter: Ove Ewerlid
>            Assignee: Prachi Damle
>            Priority: Blocker
>             Fix For: 4.1.0
>
>         Attachments: var-log-cloudstack-management.tar.gz
>
>
> ACS402 works with no issues when tested in exactly the setup where ACS41 fails.
> Identical configuration (the same setup program is used for testing both versions).
> In ACS410 startVM fails if and only if the advanceStart: log line picks a poolID that is not valid.
> E.g., the poolID reported in this logline appears random across a large number of tests.
> If a poolID that can not be reached by the host selected for deployment, the startVM fails.
> This is blocking upgrade from 4.0 to 4.1 since there is no  reliable way to start VMs that have been deployed. If a deployed VM fails to start, giving the startVM command multiple times, will eventually make the VM start.
> The more hosts there are, the less likely it is a startVM will succeed. It is less likely that the poolID is correct.
> The below log portion conveys how the VM has a "correct" Deployment Destination reported and the advanceStart reports a poolID that is different and since the selected hypervisor can not reach the poolID the startVM fails.
> The bug never triggers if there is only one KVM with local storage since the poolId can not be wrong, there is just one (and the NFS pool is always valid).
> -------------------
> 2013-05-20 06:49:29,477 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Found a potential host id: 1 name: vm3-net0-s0-14.test.devops and associated storage pools for this VM
> 2013-05-20 06:49:29,478 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Returning Deployment Destination: Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(1)-Cluster(1)-Host\
> (1)-Storage(Volume(10|ROOT-->Pool(200), Volume(11|DATADISK-->Pool(200))]
> 2013-05-20 06:49:29,495 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-34:job-34) VM state transitted from :Stopped to Starting with event: StartRequestedvm's original host id: null new host id: null host id before state trans\
> ition: null
> 2013-05-20 06:49:29,495 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34) Successfully transitioned to start state for VM[User|testvm-a] reservation id = e644d55e-3627-4395-9f89-639e6fc2f261
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34) Trying to deploy VM, vm has dcId: 1 and podId: null
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34) advanceStart: DeploymentPlan is provided, using dcId:1, podId: 1, clusterId: 1, hostId: 1, poolId: 201
> 2013-05-20 06:49:29,502 DEBUG [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-34:job-34) Deploy avoids pods: null, clusters: null, hosts: null
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) DeploymentPlanner allocation algorithm: random
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Trying to allocate a host and storage pools from dc:1, pod:1,cluster:1, requested cpu: 4000, requested ram: 2147483648
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Is ROOT volume READY (pool already allocated)?: Yes
> 2013-05-20 06:49:29,504 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) DeploymentPlan has host_id specified, making no checks on this host, looks like admin test: 1
> 2013-05-20 06:49:29,505 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Looking for suitable pools for this host under zone: 1, pod: 1, cluster: 1
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Checking suitable pools for volume (Id, Type): (10,ROOT)
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Volume has pool(201) already allocated, checking if pool can be reused, poolId: null
> 2013-05-20 06:49:29,506 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) finding pool by id '201'
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Planner need not allocate a pool for this volume since its READY
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Checking suitable pools for volume (Id, Type): (11,DATADISK)
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Volume has pool(201) already allocated, checking if pool can be reused, poolId: null
> 2013-05-20 06:49:29,507 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) finding pool by id '201'
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Planner need not allocate a pool for this volume since its READY
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Trying to find a potenial host and associated storage pools from the suitable host/pool lists for this VM
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Checking if host: 1 can access any suitable storage pool for volume: DATADISK
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Host: 1 cannot access pool: 201
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Could not find a potential host that has associated storage pools from the suitable host/pool lists for this VM
> 2013-05-20 06:49:29,508 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-34:job-34) Cannnot deploy to specified host, returning.
> 2013-05-20 06:49:29,524 DEBUG [cloud.capacity.CapacityManagerImpl] (Job-Executor-34:job-34) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: null new host id: null host id before state tran\
> sition: null
> 2013-05-20 06:49:29,533 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-34:job-34) Unexpected exception while executing org.apache.cloudstack.api.command.user.vm.StartVMCmd
> com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[User|testvm-a]Scope=interface com.cloud.dc.DataCenter; id=1
>         at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:728)
>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:471)
>         at org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:212)
>         at org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
>         at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3865)
>         at com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2573)
>         at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>         at org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:162)
>         at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:437)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira