Posted to users@cloudstack.apache.org by Indra Pramana <in...@sg.or.id> on 2015/07/01 21:00:51 UTC

Running VMs stopped upon CloudStack agent disconnected and re-connecting to management server

Dear all,

I am using CloudStack version 4.2.0 with the KVM hypervisor. I have noticed
strange behaviour: when an agent gets disconnected from the management
server and I restart the cloudstack-agent service to reconnect it, the
reconnection takes a very long time. Most of the time, most (if not all) of
the running VMs on the host are stopped before the agent finally manages to
reconnect.

I checked the management server logs, and these entries appear to be what
caused the VMs to be stopped:

====
2015-07-02 02:12:13,556 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Executing:
/usr/share/cloudstack-common/scripts/vm/network/security_group.py
destroy_network_rules_for_vm --vmname i-648-2613-VM --vif vnet9
2015-07-02 02:12:13,711 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Execution is successful.
2015-07-02 02:12:13,712 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Try to stop the vm at first
2015-07-02 02:12:15,716 DEBUG [utils.script.Script]
(agentRequest-Handler-3:null) Executing: /bin/bash -c ls
/sys/class/net/breth1-8/brif | grep vnet
2015-07-02 02:12:15,741 DEBUG [utils.script.Script]
(agentRequest-Handler-3:null) Execution is successful.
2015-07-02 02:12:15,742 DEBUG [cloud.agent.Agent]
(agentRequest-Handler-3:null) Processing command:
com.cloud.agent.api.StopCommand
2015-07-02 02:12:16,253 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Executing:
/usr/share/cloudstack-common/scripts/vm/network/security_group.py
destroy_network_rules_for_vm --vmname i-2-1779-VM
2015-07-02 02:12:16,423 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Execution is successful.
2015-07-02 02:12:16,424 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Try to stop the vm at first
2015-07-02 02:12:16,426 DEBUG [utils.script.Script]
(agentRequest-Handler-3:null) Executing: /bin/bash -c ls
/sys/class/net/breth1-8/brif | grep vnet
2015-07-02 02:12:16,456 DEBUG [utils.script.Script]
(agentRequest-Handler-3:null) Execution is successful.
2015-07-02 02:12:16,457 DEBUG [cloud.agent.Agent]
(agentRequest-Handler-3:null) Processing command:
com.cloud.agent.api.StopCommand
====
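
For anyone who wants to check their own logs for the same pattern, here is a
minimal sketch in Python that lists the VM names passed to
destroy_network_rules_for_vm and counts the StopCommand entries. It only
assumes the log text looks like the excerpt above; the function name
summarize_stops is hypothetical and is not part of CloudStack.

```python
import re

# Patterns taken from the log excerpt above. \s+ also matches line breaks,
# so log text that was wrapped by a mail client is still handled.
RULES_DESTROYED = re.compile(r"destroy_network_rules_for_vm\s+--vmname\s+(\S+)")
STOP_COMMAND = re.compile(r"com\.cloud\.agent\.api\.StopCommand")


def summarize_stops(log_text):
    """Return (vm_names, stop_count) for a chunk of log text.

    vm_names holds each VM name seen in a destroy_network_rules_for_vm call,
    in order of first appearance and without duplicates. stop_count is the
    number of StopCommand entries. This helper is illustrative only.
    """
    vm_names = []
    for name in RULES_DESTROYED.findall(log_text):
        if name not in vm_names:
            vm_names.append(name)
    stop_count = len(STOP_COMMAND.findall(log_text))
    return vm_names, stop_count
```

To use it, read the relevant log file into a string and pass it to
summarize_stops; comparing the returned VM names against the VMs that were
running before the restart shows which ones were affected.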

Is there any reason why the network rules need to be destroyed? How can I
prevent VMs from being stopped when the agent reconnects to the management
server? Is anyone else seeing similar behaviour?

Looking forward to your reply, thank you.

Cheers.