Posted to dev@cloudstack.apache.org by Indra Pramana <in...@sg.or.id> on 2016/01/25 04:02:13 UTC

VM deployment from new template fails

Dear all,

We are using CloudStack 4.2.0 with KVM hypervisor and Ceph RBD primary
storage.

Since earlier this month, we have encountered a strange problem: we cannot
deploy new VMs from any newly created template. Deploying VMs from
existing templates still works.

This is a new problem that only started earlier this month -- we had no
issues creating new templates and deploying VMs from them until late last
month (December 2015).

These are the error messages from the CloudStack management server. It
seems that the process of copying the template (in .qcow2 format) from
secondary storage to primary storage fails.

===
2016-01-24 20:54:52,044 DEBUG [agent.transport.Request]
(AgentManager-Handler-5:null) Seq 70-1791707721: Processing:  { Ans: ,
MgmtId: 161342671900, via: 70, Ver: v1, Flags: 110,
[{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"
result":false,"details":"com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/a04d624b-e428-3448-aa22-e1e30f3a36e5/816449ef-dab0-4f4e-887f-7c51b2e508f7.qcow2
to 0aa1ac29-cec2-4a61-883c-27bd15f577c7","wait":0}}] }
2016-01-24 20:54:52,044 DEBUG [agent.manager.AgentAttache]
(AgentManager-Handler-5:null) Seq 70-1791707730: Sending now.  is current
sequence.
2016-01-24 20:54:52,044 DEBUG [agent.transport.Request]
(Job-Executor-29:job-39302 = [ 9d226ae4-02b7-4550-9ec5-cafca1fa736e ]) Seq
70-1791707721: Received:  { Ans: , MgmtId: 161342671900, via: 70, Ver: v1,
Flags: 110, { CopyCmdAnswer } }
2016-01-24 20:54:52,065 DEBUG [agent.transport.Request]
(AgentManager-Handler-6:null) Seq 37-2102353958: Processing:  { Ans: ,
MgmtId: 161342671900, via: 37, Ver: v1, Flags: 10,
[{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat
com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:2469)\n\tat
com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1230)\n\tat
com.cloud.agent.Agent.processRequest(Agent.java:525)\n\tat
com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:852)\n\tat
com.cloud.utils.nio.Task.run(Task.java:83)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat
java.lang.Thread.run(Thread.java:679)\n","wait":0}}] }
2016-01-24 20:54:52,066 DEBUG [agent.transport.Request]
(StatsCollector-3:null) Seq 37-2102353958: Received:  { Ans: , MgmtId:
161342671900, via: 37, Ver: v1, Flags: 10, { Answer } }
2016-01-24 20:54:52,123 INFO  [storage.volume.VolumeServiceImpl]
(Job-Executor-29:job-39302 = [ 9d226ae4-02b7-4550-9ec5-cafca1fa736e ])
releasing lock for VMTemplateStoragePool 323
2016-01-24 20:54:52,123 WARN  [utils.db.Merovingian2]
(Job-Executor-29:job-39302 = [ 9d226ae4-02b7-4550-9ec5-cafca1fa736e ]) Was
unable to find lock for the key template_spool_ref323 and thread id
1999363220
2016-01-24 20:54:52,123 DEBUG [cloud.storage.VolumeManagerImpl]
(Job-Executor-29:job-39302 = [ 9d226ae4-02b7-4550-9ec5-cafca1fa736e ])
Unable to create
Vol[7508|vm=4019|ROOT]:com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/a04d624b-e428-3448-aa22-e1e30f3a36e5/816449ef-dab0-4f4e-887f-7c51b2e508f7.qcow2
to 0aa1ac29-cec2-4a61-883c-27bd15f577c7
2016-01-24 20:54:52,124 INFO  [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-29:job-39302 = [ 9d226ae4-02b7-4550-9ec5-cafca1fa736e ])
Unable to contact resource.
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:214]
is unreachable: Unable to create
Vol[7508|vm=4019|ROOT]:com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/a04d624b-e428-3448-aa22-e1e30f3a36e5/816449ef-dab0-4f4e-887f-7c51b2e508f7.qcow2
to 0aa1ac29-cec2-4a61-883c-27bd15f577c7
        at
com.cloud.storage.VolumeManagerImpl.recreateVolume(VolumeManagerImpl.java:2544)
        at
com.cloud.storage.VolumeManagerImpl.prepare(VolumeManagerImpl.java:2592)
        at
com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:889)
        at
com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:578)
        at
org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:227)
        at
org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
        at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3406)
        at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2966)
        at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2952)
        at
com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
        at
org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:420)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
        at
com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)
2016-01-24 20:54:52,152 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-29:job-39302 = [ 9d226ae4-02b7-4550-9ec5-cafca1fa736e ])
Cleaning up resources for the vm VM[User|test-ubuntu14-20160124a] in
Starting state
===

On the hypervisor host, I can't find any useful clues in the agent.log
file, other than some messages stating that it tried to stop the VM (which
doesn't exist yet, since it was still being created).

===
2016-01-24 20:54:52,161 DEBUG [kvm.resource.LibvirtConnection]
(agentRequest-Handler-2:null) can't find connection: KVM, for vm:
i-1939-4019-VM, continue
2016-01-24 20:54:52,162 DEBUG [kvm.resource.LibvirtConnection]
(agentRequest-Handler-2:null) can't find connection: LXC, for vm:
i-1939-4019-VM, continue
2016-01-24 20:54:52,163 DEBUG [kvm.resource.LibvirtConnection]
(agentRequest-Handler-2:null) can't find which hypervisor the vm used ,
then use the default hypervisor
2016-01-24 20:54:52,163 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Failed to get dom xml:
org.libvirt.LibvirtException: Domain not found: no domain with matching
name 'i-1939-4019-VM'
2016-01-24 20:54:52,164 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Failed to get dom xml:
org.libvirt.LibvirtException: Domain not found: no domain with matching
name 'i-1939-4019-VM'
2016-01-24 20:54:52,165 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Failed to get dom xml:
org.libvirt.LibvirtException: Domain not found: no domain with matching
name 'i-1939-4019-VM'
2016-01-24 20:54:52,165 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Executing:
/usr/share/cloudstack-common/scripts/vm/network/security_group.py
destroy_network_rules_for_vm --vmname i-1939-4019-VM
2016-01-24 20:54:54,682 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Execution is successful.
2016-01-24 20:54:54,683 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Try to stop the vm at first
2016-01-24 20:54:54,685 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) VM i-1939-4019-VM doesn't exist, no need to
stop it
===

Connectivity from this particular hypervisor host (as well as our other
hypervisor hosts) to both the primary and secondary storage is fine. I can
mount folders on the secondary storage, and the connection to the RBD
primary storage is also OK, since our currently running VMs are having no
issues.
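
A rough sketch of the kind of quick checks that confirm this from a
hypervisor host (the NFS server address and RBD pool name below are
placeholders):

showmount -e <nfs-server-ip>    # secondary storage exports visible?
ceph -s                         # Ceph cluster reachable and healthy?
rbd ls <rbd-pool>               # primary storage pool reachable?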

Any advice on how I can further troubleshoot and resolve this problem is
greatly appreciated. We are not able to release new templates because of
this problem.

Thank you.

Best regards,
-ip-

Re: VM deployment from new template fails

Posted by Simon Weller <sw...@ena.com>.
Hi Indra,


Can you check whether the template on the secondary storage is actually valid?

You can do this by running: qemu-img check <templatename>.qcow2
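
For example, on the KVM host that attempted the copy (path taken from the "Failed to copy" line in your log; if the NFS secondary mount is no longer present on that host, run it against template/tmpl/2/514/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2 on the secondary storage itself):

qemu-img info  /mnt/45f0a5a0-2ae4-319d-9dae-f11b1c2c585e/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2
qemu-img check /mnt/45f0a5a0-2ae4-319d-9dae-f11b1c2c585e/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2

qemu-img info should report "file format: qcow2" and a sensible virtual size, and qemu-img check should finish without errors.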


If you look at the logs (/var/log/cloudstack/agent/agent.log or agent.out) on the host that the copy is occurring on (the host is often randomly selected, so you might need to tail logs on all the hosts in the cluster and run the job again), do you see anything odd being reported back when the agent attempts to copy the template from secondary storage to primary storage?
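
Something along these lines on each KVM host, while you retry the deployment, may surface the underlying error (log path as above; the grep pattern is just a suggestion):

tail -f /var/log/cloudstack/agent/agent.log | grep -i -E 'copy|qemu|fail|error|exception'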


In MySQL, take a look at the vm_template table and see what format is being reported for your new template.
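
A rough sketch (assumes the default "cloud" database and that 514 is the template id from your log; column names are from memory, so adjust if MySQL complains):

mysql -u cloud -p cloud -e "SELECT id, name, format, hypervisor_type FROM vm_template WHERE id = 514;"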


I've seen conversion issues before when qemu-img isn't provided the correct file formats. Since you're using RBD, the destination format should be raw.
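
As a manual sanity check, you could also try the conversion yourself from one of the KVM hosts -- a rough sketch only, since the RBD pool name, cephx user and target image name below are placeholders (your pool is masked in the log):

qemu-img convert -f qcow2 -O raw \
  /mnt/45f0a5a0-2ae4-319d-9dae-f11b1c2c585e/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2 \
  rbd:<pool>/test-template-copy:id=<cephx-user>:conf=/etc/ceph/ceph.conf

If that fails by hand with a clear qemu-img/librbd error, the problem is likely below CloudStack rather than in CloudStack itself (remember to rbd rm the test image afterwards).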


- Si



Re: VM deployment from new template fails

Posted by Indra Pramana <in...@sg.or.id>.
Hi Simon,

Good day to you, and thank you for your reply. I apologise for the delay in
my response.

I have run the health check on the SSVM and the result seems to be OK.

===
root@s-2985-VM:/var/log/cloud# /usr/local/cloud/systemvm/ssvm-check.sh
================================================
First DNS server is  8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=58 time=2.173 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=58 time=2.126 ms
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.126/2.150/2.173/0.024 ms
Good: Can ping DNS server
================================================
Good: DNS resolves download.cloud.com
================================================
NFS is currently mounted
Mount point is /mnt/SecStorage/cb10c4c8-84a1-3eee-b983-36b6fc8f7057
Good: Can write to mount point
Mount point is /mnt/SecStorage/41cab17f-4dbc-37a5-b8d2-cdc153301319
Good: Can write to mount point
================================================
Management server is *.*.3.3. Checking connectivity.
Good: Can connect to management server port 8250
================================================
Good: Java process is running
================================================
Tests Complete. Look for ERROR or WARNING above.
===

No errors or warnings. I also checked the /var/log/cloud/cloud.out log file
and couldn't find anything. Do you think destroying and recreating the SSVM
would help?

From what I can see, deploying VMs from existing templates is OK; only
deploying VMs from new templates (which were just created and not yet copied
over to the primary storage, I think) fails. So I think the problem occurs
during the copy from secondary storage to primary storage, but I don't know
what causes the copy to fail. This is another attempt I made today:

====
2016-02-18 17:44:06,328 DEBUG [agent.transport.Request]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Seq
34-1602142919: Received:  { Ans: , MgmtId: 161342671900, via: 34, Ver: v1,
Flags: 110, { Answer, Answer } }
2016-02-18 17:44:06,330 DEBUG [cloud.network.NetworkModelImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Service SecurityGroup is not supported in the network id=238
2016-02-18 17:44:06,333 DEBUG [cloud.storage.VolumeManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Checking if we need to prepare 2 volumes for VM[User|test-centos67-20160218]
2016-02-18 17:44:06,348 DEBUG [storage.image.TemplateDataFactoryImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
template 514 is already in store:46, type:Image
2016-02-18 17:44:06,354 DEBUG [storage.datastore.PrimaryDataStoreImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Not
found (templateId:514poolId:214) in template_spool_ref, persisting it
2016-02-18 17:44:06,384 DEBUG [storage.image.TemplateDataFactoryImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
template 514 is already in store:214, type:Primary
2016-02-18 17:44:06,387 DEBUG [storage.volume.VolumeServiceImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Found template ebe8d28c-03bc-49b6-8b21-442ecfff8270 in storage pool 214
with VMTemplateStoragePool id: 327
2016-02-18 17:44:06,398 DEBUG [storage.volume.VolumeServiceImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Acquire lock on VMTemplateStoragePool 327 with timeout 3600 seconds
2016-02-18 17:44:06,409 INFO  [storage.volume.VolumeServiceImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) lock
is acquired for VMTemplateStoragePool 327
2016-02-18 17:44:06,449 DEBUG [storage.motion.AncientDataMotionStrategy]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
copyAsync inspecting src type TEMPLATE copyAsync inspecting dest type
TEMPLATE
2016-02-18 17:44:06,466 DEBUG [agent.transport.Request]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Seq
37-292022280: Sending  { Cmd , MgmtId: 161342671900, via: 37, Ver: v1,
Flags: 100111,
[{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/2/514/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2","uuid":"790ef1f6-8b80-4305-b561-5397d43295ad","id":514,"format":"QCOW2","accountId":2,"hvm":true,"displayText":"CentOS
6.7 (64-bit)
20160218","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://*.*.11.31/mnt/vol1/sec-storage2","_role":"Image"}},"name":"ebe8d28c-03bc-49b6-8b21-442ecfff8270","hypervisorType":"KVM"}},"destTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"uuid":"790ef1f6-8b80-4305-b561-5397d43295ad","id":514,"format":"QCOW2","accountId":2,"hvm":true,"displayText":"CentOS
6.7 (64-bit)
20160218","imageDataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"d433809b-01ea-3947-ba0f-48077244e4d6","id":214,"poolType":"RBD","host":"*.*.com","path":"*-*-01","port":6789}},"name":"ebe8d28c-03bc-49b6-8b21-442ecfff8270","hypervisorType":"KVM"}},"executeInSequence":true,"wait":10800}}]
}
2016-02-18 17:47:24,121 DEBUG [agent.transport.Request]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Seq
37-292022280: Received:  { Ans: , MgmtId: 161342671900, via: 37, Ver: v1,
Flags: 110, { CopyCmdAnswer } }
2016-02-18 17:47:24,176 INFO  [storage.volume.VolumeServiceImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
releasing lock for VMTemplateStoragePool 327
2016-02-18 17:47:24,177 WARN  [utils.db.Merovingian2]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Was
unable to find lock for the key template_spool_ref327 and thread id
1847536824
2016-02-18 17:47:24,177 DEBUG [cloud.storage.VolumeManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Unable to create
Vol[7687|vm=4070|ROOT]:com.cloud.utils.exception.CloudRuntimeException:
Failed to copy
/mnt/45f0a5a0-2ae4-319d-9dae-f11b1c2c585e/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2
to 062cc44b-2339-49a4-9142-b55f5e0dee7d
2016-02-18 17:47:24,177 INFO  [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Unable to contact resource.
2016-02-18 17:47:24,199 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Cleaning up resources for the vm VM[User|test-centos67-20160218] in
Starting state
2016-02-18 17:47:24,202 DEBUG [agent.transport.Request]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Seq
68-1775429003: Sending  { Cmd , MgmtId: 161342671900, via: 68, Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"i-1976-4070-VM","wait":0}}]
}
2016-02-18 17:47:24,860 DEBUG [agent.transport.Request]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ]) Seq
68-1775429003: Received:  { Ans: , MgmtId: 161342671900, via: 68, Ver: v1,
Flags: 110, { StopAnswer } }
2016-02-18 17:47:24,877 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Changing active number of nics for network id=238 on -1
2016-02-18 17:47:24,890 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Successfully released network resources for the vm
VM[User|test-centos67-20160218]
2016-02-18 17:47:24,890 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-36:job-41152 = [ c5e1cd55-ed2b-4935-9442-961c53de4c3f ])
Successfully cleanued up resources for the vm
VM[User|test-centos67-20160218] in Starting state
=====

The two significant error messages I noted were:
- Failed to copy
/mnt/45f0a5a0-2ae4-319d-9dae-f11b1c2c585e/ebe8d28c-03bc-49b6-8b21-442ecfff8270.qcow2
to 062cc44b-2339-49a4-9142-b55f5e0dee7d ; and
- Was unable to find lock for the key template_spool_ref327 and thread id
1847536824

I'm not sure whether the copying failed because of a locking mechanism
issue. Can anyone provide any advice?
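
In case it is relevant, the row CloudStack recorded for this attempt can be
checked with something like the following (assuming the default "cloud"
database; 514 and 214 are the template and pool ids from the log above):

mysql -u cloud -p cloud -e "SELECT * FROM template_spool_ref WHERE template_id = 514 AND pool_id = 214;"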

Looking forward to your reply, thank you.

Cheers.




Re: VM deployment from new template fails

Posted by Simon Weller <sw...@ena.com>.
Have you done any debugging on the Secondary Storage VM yet?
That's probably where I would focus.

Here's a guide:

https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting
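
For example, from the KVM host the SSVM is currently running on, you can get into it and run the built-in check script (the link-local IP is a placeholder -- it is shown in the UI / listSystemVms output; the key path and port are the CloudStack defaults described in the guide above):

ssh -i /root/.ssh/id_rsa.cloud -p 3922 root@<ssvm-link-local-ip>
/usr/local/cloud/systemvm/ssvm-check.sh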

- Si


