Posted to issues@cloudstack.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/05/24 10:13:04 UTC

[jira] [Commented] (CLOUDSTACK-9660) NPE while destroying volumes during 1000 VMs deploy and destroy tests

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022654#comment-16022654 ] 

ASF subversion and git services commented on CLOUDSTACK-9660:
-------------------------------------------------------------

Commit 0506fe60862c1d211082281de5c28414d042abf5 in cloudstack's branch refs/heads/master from [~mike-tutkowski]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=0506fe6 ]

Fix for CLOUDSTACK-9660

A root volume can be replaced by a different root volume without the VM it belongs to being expunged.

From dev@:

For example: Let’s say we have a system VM running on NFS primary storage. We then put this primary storage into maintenance mode, which creates the system VM (with the same name) on a different primary storage (we do not create a new row in the cloud.vm_instance table for this VM). While this VM works, the original root disk of the system VM remains on the original primary storage and is not destroyed by the code in StorageManagerImpl.cleanupStorage(boolean) in 4.10, because 4.10 (as shown above) only considers non-root volumes for deletion. In the 4.9 version of the code, the original root disk is cleaned up in StorageManagerImpl.cleanupStorage(boolean). The problem with 4.10 relying on a root disk always being deleted when the VM it belongs to is deleted is that, in a situation like this, the system VM doesn’t get deleted at this point: it gets a new root disk that’s hosted by a different primary storage, so its original root disk is now stranded.
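
The difference between the two behaviours described above can be illustrated with a small, self-contained sketch. It is not CloudStack code: the class, the enums, the pool names, and both selection methods are hypothetical, and it assumes the replaced root disk has been marked for destruction while the VM itself stays alive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; none of these types are CloudStack classes.
public class StrandedRootVolumeSketch {
    enum VolumeType { ROOT, DATADISK }
    enum VolumeState { READY, DESTROY }

    static final class Volume {
        final long id;
        final long vmId;
        final VolumeType type;
        final VolumeState state;
        final String pool;

        Volume(long id, long vmId, VolumeType type, VolumeState state, String pool) {
            this.id = id;
            this.vmId = vmId;
            this.type = type;
            this.state = state;
            this.pool = pool;
        }
    }

    // 4.10-style selection as described in the message: only non-root volumes are candidates.
    static List<Volume> candidatesNonRootOnly(List<Volume> all) {
        List<Volume> out = new ArrayList<>();
        for (Volume v : all) {
            if (v.state == VolumeState.DESTROY && v.type != VolumeType.ROOT) {
                out.add(v);
            }
        }
        return out;
    }

    // 4.9-style selection as described in the message: any volume marked for destruction is a candidate.
    static List<Volume> candidatesAllTypes(List<Volume> all) {
        List<Volume> out = new ArrayList<>();
        for (Volume v : all) {
            if (v.state == VolumeState.DESTROY) {
                out.add(v);
            }
        }
        return out;
    }

    // One system VM (id 7, assumed) whose root disk was replaced after storage maintenance.
    static List<Volume> sampleVolumes() {
        List<Volume> vols = new ArrayList<>();
        vols.add(new Volume(1, 7, VolumeType.ROOT, VolumeState.DESTROY, "nfs-primary-1")); // old root
        vols.add(new Volume(2, 7, VolumeType.ROOT, VolumeState.READY, "nfs-primary-2"));   // new root
        return vols;
    }

    public static void main(String[] args) {
        List<Volume> vols = sampleVolumes();
        // The old root disk is never picked up when root volumes are excluded.
        System.out.println("non-root only: " + candidatesNonRootOnly(vols).size());
        System.out.println("all types: " + candidatesAllTypes(vols).size());
    }
}
```

In this model the non-root-only selection returns nothing, so the old root disk on nfs-primary-1 stays behind, while the all-types selection picks it up.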

> NPE while destroying volumes during 1000 VMs deploy and destroy tests
> ---------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9660
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9660
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.10.0.0
>            Reporter: Koushik Das
>            Assignee: Koushik Das
>             Fix For: 4.10.0.0
>
>
> Steps:
> 1. Install and configure a zone (advanced or basic).
> 2. Set config storage.cleanup.enabled = true and storage.cleanup.interval = 10 seconds
> 3. Deploy 1000 VMs and then destroy them, over multiple iterations.
> NPE seen in MS logs while deleting volume:
> 2015-06-18 16:27:47,797 DEBUG [c.c.v.VirtualMachineManagerImpl] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Cleaning up hypervisor data structures (ex. SRs in XenServer) for managed storage
> 2015-06-18 16:27:47,799 DEBUG [o.a.c.e.o.VolumeOrchestrator] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Cleaning storage for vm: 2894
> 2015-06-18 16:27:47,823 INFO [o.a.c.s.v.VolumeServiceImpl] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Expunge volume with no data store specified
> 2015-06-18 16:27:47,828 DEBUG [c.c.s.StorageManagerImpl] (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Storage pool garbage collector found 0 templates to clean up in storage pool: XenRT-Zone-0-Pod-0-Cluster-0-Primary-Store-0
> 2015-06-18 16:27:47,828 INFO [o.a.c.s.v.VolumeServiceImpl] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Volume 2894 is not referred anywhere, remove it from volumes table
> 2015-06-18 16:27:47,829 DEBUG [c.c.s.StorageManagerImpl] (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Storage pool garbage collector found 0 templates to clean up in storage pool: XenRT-Zone-0-Pod-0-Cluster-1-Primary-Store-0
> 2015-06-18 16:27:47,832 DEBUG [c.c.s.StorageManagerImpl] (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Secondary storage garbage collector found 0 templates to cleanup on template_store_ref for store: nfs://10.81.56.7/xenrtnfs/1092931-dycPsK
> 2015-06-18 16:27:47,833 DEBUG [c.c.s.StorageManagerImpl] (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Secondary storage garbage collector found 0 snapshots to cleanup on snapshot_store_ref for store: nfs://10.81.56.7/xenrtnfs/1092931-dycPsK
> 2015-06-18 16:27:47,834 DEBUG [c.c.s.StorageManagerImpl] (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Secondary storage garbage collector found 0 volumes to cleanup on volume_store_ref for store: nfs://10.81.56.7/xenrtnfs/1092931-dycPsK
> 2015-06-18 16:27:47,842 DEBUG [c.c.v.VirtualMachineManagerImpl] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Expunged VM[User|i-10-2894-VM]
> 2015-06-18 16:27:47,844 WARN [c.c.s.StorageManagerImpl] (StorageManager-Scavenger-1:ctx-5e7b4eda) (logid:bb642325) Unable to destroy volume 0b22f54b-3242-49ef-b16d-1c7801d5c2bd
> java.lang.NullPointerException
> at org.apache.cloudstack.storage.volume.VolumeServiceImpl.expungeVolumeAsync(VolumeServiceImpl.java:276)
> at com.cloud.storage.StorageManagerImpl.cleanupStorage(StorageManagerImpl.java:1121)
> at com.cloud.storage.StorageManagerImpl$StorageGarbageCollector.runInContext(StorageManagerImpl.java:1481)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> 2015-06-18 16:27:47,850 DEBUG [c.c.u.AccountManagerImpl] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Access granted to Acct[ed48b7f2-15a0-11e5-96dd-d275a7df156a-system] to Domain:1/ by AffinityGroupAccessChecker
> 2015-06-18 16:27:47,871 DEBUG [c.c.v.UserVmManagerImpl] (UserVm-Scavenger-1:ctx-5ecc886e) (logid:132e3ff8) Starting cleaning up vm VM[User|i-10-2894-VM] resources...
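
The interleaved log lines above show two threads, UserVm-Scavenger-1 and StorageManager-Scavenger-1, working on volume cleanup within the same few milliseconds. One plausible reading, which the report itself does not confirm, is that one thread removed a volume record just before the other tried to use it. The sketch below only illustrates that general failure mode and a defensive guard against it; the class, the map standing in for a volumes table, and both methods are hypothetical and are not the change made in the commit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a volumes table shared by two cleanup threads.
public class ExpungeRaceSketch {
    // Maps a volume UUID to the name of the storage pool holding it (assumed layout).
    static final Map<String, String> volumes = new ConcurrentHashMap<>();

    // Unguarded: dereferences the lookup result, so a record that another
    // thread has already removed leads to a NullPointerException.
    static int unguardedPoolNameLength(String uuid) {
        String pool = volumes.get(uuid);
        return pool.length();
    }

    // Guarded: a missing record is treated as "already expunged" and skipped.
    // Returns true only if this call actually removed the record.
    static boolean expungeIfPresent(String uuid) {
        return volumes.remove(uuid) != null;
    }

    public static void main(String[] args) {
        volumes.put("vol-1", "pool-a");
        System.out.println(expungeIfPresent("vol-1")); // first cleaner removes the record
        System.out.println(expungeIfPresent("vol-1")); // second cleaner finds nothing and skips
    }
}
```

Because the removal and the presence check happen in a single call on the map, the second caller sees a missing record as a normal outcome rather than an error.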



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)