You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@cloudstack.apache.org by Taylor <ts...@live.com> on 2017/06/23 03:09:15 UTC

Cloudstack 4.9 fails to start system VMs

Hello.


I am trying to do a basic install but the system vms never start. They appear to be stuck in a starting state.


My logs are as follows:



[root@cloudstack ~]# tail -f /var/log/cloudstack/management/management-server.log | grep -i -E 'exception|unable|fail|invalid|leak|warn|error'
2017-06-22 21:02:44,504 WARN  [c.c.c.ConsoleProxyManagerImpl] (localhost-startStop-1:null) (logid:) Empty console proxy domain, explicitly disabling SSL
2017-06-22 21:02:44,540 WARN  [c.c.s.d.DownloadMonitorImpl] (localhost-startStop-1:null) (logid:) Only realhostip.com ssl cert is supported, ignoring self-signed and other certs
2017-06-22 21:02:46,506 ERROR [c.c.u.PropertiesUtil] (localhost-startStop-1:null) (logid:) Unable to find properties file: commands.properties
2017-06-22 21:02:55,791 INFO  [c.c.h.x.r.XenServerConnectionPool] (localhost-startStop-1:null) (logid:) XenServer Connection Pool Configs: sleep.interval.on.error=10000
2017-06-22 21:02:58,660 WARN  [o.a.c.alerts] (Cluster-Notification-1:ctx-3d0b1cac) (logid:9146a1c1)  alertType:: 14 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: Management server node 10.4.5.1 is up
2017-06-22 21:02:59,381 WARN  [c.c.c.ClusterManagerImpl] (Cluster-Notification-1:ctx-3d0b1cac) (logid:9146a1c1) Notifying management server join event took 735 ms
2017-06-22 21:30:09,592 DEBUG [c.c.c.ClusterManagerImpl] (Cluster-Heartbeat-1:ctx-b14a2bb9) (logid:c9a4e195) Peer scan takes too long to finish. profiler: Done. Duration: 4862ms, profilerQueryActiveList: Done. Duration: 4861ms, profilerSyncClusterInfo: Done. Duration: 0ms, profilerInvalidatedNodeList: Done. Duration: 0ms, profilerRemovedList: Done. Duration: 0ms
2017-06-22 22:02:57,770 WARN  [o.a.c.alerts] (HA-1:ctx-5ba8c378) (logid:82b60ca7)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 22:12:56,217 WARN  [o.a.c.alerts] (HA-1:ctx-58d79fd5) (logid:a3d6c010)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 22:22:56,221 WARN  [o.a.c.alerts] (HA-2:ctx-622bd3c9) (logid:98d2c185)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 22:32:56,217 WARN  [o.a.c.alerts] (HA-1:ctx-0760401f) (logid:7b7a8104)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 22:42:56,217 WARN  [o.a.c.alerts] (HA-1:ctx-9dd2cbd0) (logid:c9c3b276)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 22:52:56,217 WARN  [o.a.c.alerts] (HA-1:ctx-f6b399cb) (logid:73d23c5c)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 22:55:31,317 WARN  [c.c.a.d.ParamGenericValidationWorker] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) Received unknown parameters for command addHost. Unknown parameters : clustertype
2017-06-22 22:55:32,614 WARN  [c.c.h.x.d.XcpServerDiscoverer] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) defaulting to xenserver650 resource for product brand: XenServer with product version: 6.5.0
2017-06-22 22:55:32,642 DEBUG [c.c.n.NetworkModelImpl] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) Failed to retrive the default label for storage traffic:zone: 1 hypervisor: XenServer due to:Unable to find the default physical network with traffic=Storage in the specified zone id
2017-06-22 22:55:34,392 ERROR [c.c.u.s.SshHelper] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) SSH execution of command xe sm-list | grep "resigning of duplicates" has an error status code in return. Result output:
2017-06-22 22:55:35,618 WARN  [c.c.h.x.d.XcpServerDiscoverer] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) defaulting to xenserver650 resource for product brand: XenServer with product version: 6.5.0
2017-06-22 22:57:35,906 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-1:ctx-a794052b) (logid:3e731ac5) VM state was updated but update time is null?! vm id: 1
2017-06-22 22:57:35,908 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-1:ctx-a794052b) (logid:3e731ac5) VM state was updated but update time is null?! vm id: 2
2017-06-22 22:58:35,902 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-3:ctx-fcbf36da) (logid:b36ad1b5) VM state was updated but update time is null?! vm id: 1
2017-06-22 22:58:35,910 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-3:ctx-fcbf36da) (logid:b36ad1b5) VM state was updated but update time is null?! vm id: 2
2017-06-22 22:59:35,893 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-2:ctx-3bead64a) (logid:736e1311) VM state was updated but update time is null?! vm id: 1
2017-06-22 22:59:35,895 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-2:ctx-3bead64a) (logid:736e1311) VM state was updated but update time is null?! vm id: 2
2017-06-22 22:59:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-b420b7b6) (logid:939cb8f8) Task (job-12) has been pending for 146 seconds
2017-06-22 22:59:44,466 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-b420b7b6) (logid:939cb8f8) Task (job-13) has been pending for 141 seconds
2017-06-22 23:00:35,903 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-5:ctx-3a1425b1) (logid:9708b0b2) VM state was updated but update time is null?! vm id: 1
2017-06-22 23:00:35,904 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-5:ctx-3a1425b1) (logid:9708b0b2) VM state was updated but update time is null?! vm id: 2
2017-06-22 23:00:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-0c7110b2) (logid:7b4a2428) Task (job-12) has been pending for 206 seconds
2017-06-22 23:00:44,466 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-0c7110b2) (logid:7b4a2428) Task (job-13) has been pending for 201 seconds
2017-06-22 23:01:35,897 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-6:ctx-888358ef) (logid:f9f96388) VM state was updated but update time is null?! vm id: 1
2017-06-22 23:01:35,899 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-6:ctx-888358ef) (logid:f9f96388) VM state was updated but update time is null?! vm id: 2
2017-06-22 23:01:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-70884ef1) (logid:c982b0a6) Task (job-12) has been pending for 266 seconds
2017-06-22 23:01:44,466 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-70884ef1) (logid:c982b0a6) Task (job-13) has been pending for 261 seconds
2017-06-22 23:02:35,880 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-7:ctx-14e0cf6f) (logid:6d13a9ce) VM state was updated but update time is null?! vm id: 1
2017-06-22 23:02:35,882 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-7:ctx-14e0cf6f) (logid:6d13a9ce) VM state was updated but update time is null?! vm id: 2
2017-06-22 23:02:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-cae439e5) (logid:33827ca9) Task (job-12) has been pending for 326 seconds
2017-06-22 23:02:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-cae439e5) (logid:33827ca9) Task (job-13) has been pending for 321 seconds
2017-06-22 23:02:56,217 WARN  [o.a.c.alerts] (HA-1:ctx-9b22d72c) (logid:67c239e8)  alertType:: 13 // dataCenterId:: 0 // podId:: 0 // clusterId:: null // message:: No usage server process running
2017-06-22 23:03:35,877 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-1:ctx-8f952915) (logid:8c4e8729) VM state was updated but update time is null?! vm id: 1
2017-06-22 23:03:35,879 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-1:ctx-8f952915) (logid:8c4e8729) VM state was updated but update time is null?! vm id: 2
2017-06-22 23:03:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-110d52c7) (logid:78477ef9) Task (job-12) has been pending for 386 seconds
2017-06-22 23:03:44,466 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-110d52c7) (logid:78477ef9) Task (job-13) has been pending for 381 seconds
2017-06-22 23:04:35,896 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-1:ctx-1fbcdc5c) (logid:31b20e4f) VM state was updated but update time is null?! vm id: 1
2017-06-22 23:04:35,899 WARN  [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgentCronJob-1:ctx-1fbcdc5c) (logid:31b20e4f) VM state was updated but update time is null?! vm id: 2
2017-06-22 23:04:44,465 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-20dbf144) (logid:d217f716) Task (job-12) has been pending for 446 seconds
2017-06-22 23:04:44,466 WARN  [o.a.c.f.j.i.AsyncJobMonitor] (Timer-1:ctx-20dbf144) (logid:d217f716) Task (job-13) has been pending for 441 seconds


I think the important log segment is likely here:


2017-06-22 22:55:32,642 DEBUG [c.c.n.NetworkModelImpl] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) Failed to retrive the default label for storage traffic:zone: 1 hypervisor: XenServer due to:Unable to find the default physical network with traffic=Storage in the specified zone id
2017-06-22 22:55:34,392 ERROR [c.c.u.s.SshHelper] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) SSH execution of command xe sm-list | grep "resigning of duplicates" has an error status code in return. Result output:
2017-06-22 22:55:35,618 WARN  [c.c.h.x.d.XcpServerDiscoverer] (catalina-exec-10:ctx-724755da ctx-0ea77950) (logid:d415ff05) defaulting to xenserver650 resource for product brand: XenServer with product version: 6.5.0



I think the issue is due to me not setting the label for the NIC. I was not sure what value to set for this as mentioned here: http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/4.9/hypervisor/xenserver.html#configuring-public-network-with-a-dedicated-nic-for-xenserver-optional


I thought this was optional though.


Can anyone help me with this?


Thanks,


Taylor