Posted to dev@cloudstack.apache.org by Tejas Gadaria <re...@gmail.com> on 2014/08/22 08:20:29 UTC

Guest VMs not able to start on CloudStack HA

Hi,

I am running ACS 4.3 with three XenServer 6.2 SP1 hosts in a single cluster.
Primary and secondary storage are NFS, exported from GlusterFS. I am also
using the CloudStack HA feature for guest VMs.

Underneath, we have two GlusterFS nodes (node-1, node-2) for replication, and
the CTDB service on top of GlusterFS provides a virtual IP for failover and
failback.

We are using the CTDB virtual IP for storage (primary and secondary).

While testing, we observed the following:

1) When Gluster node-1 went down, two hypervisor hosts rebooted, and
XenServer showed the logs below:

Aug  6 18:43:41 swiftproxy-xen2 tapdisk[19318]: ERROR: errno -5 at
vhd_complete:
/var/run/sr-mount/67012b80-c1d9-257f-62d2-0c901168820a/d3a4b80b-006c-4653-a6dd-35f3a66e29f3.vhd:
op: 2, lsec: 1635176, secs: 1, nbytes: 512, blk: 399, blk_offset: 718375
Aug  6 18:43:41 swiftproxy-xen2 tapdisk[19318]: Res=-5, image->type=4
Aug  6 18:43:41 swiftproxy-xen2 kernel: [15281.174879] nfs: server
192.168.161.73 not responding, timed out
Aug  6 18:43:59 swiftproxy-xen2 heartbeat: Problem with
/var/run/sr-mount/67012b80-c1d9-257f-62d2-0c901168820a/hb-dc1d7e1c-b09b-45a5-a10e-5479f50aa16f:
not reachable for 63 seconds, rebooting system!


2) The 3rd XenServer host also rebooted, but after coming up it did not
rejoin the pool because its network bridge had been destroyed, which was
surprising.

3) When both XenServer hosts came back up, only the system VMs were running
on them (as they have HA enabled by default). The guest VMs were also on HA,
and in the logs we can see that CloudStack was trying to start them, but
somehow it was not able to.
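
To correlate the reboots in item 1 with the storage outage across all three hosts, a small script can pull the relevant events out of each host's syslog. This is only a minimal sketch: the two patterns are taken from the log excerpt above and may not match other XenServer versions, the function name find_storage_events is made up for illustration, and it assumes each log entry is on a single line (not wrapped as in this mail).

```python
import re

# Patterns derived only from the log excerpt in this mail (assumption:
# other XenServer releases may word these messages differently).
NFS_TIMEOUT = re.compile(r"nfs: server (\S+) not responding")
HB_REBOOT = re.compile(
    r"heartbeat: Problem with (\S+):\s+not reachable for (\d+) seconds, "
    r"rebooting system"
)


def find_storage_events(lines):
    """Return (kind, detail) tuples for NFS timeouts and heartbeat reboots.

    `lines` is any iterable of unwrapped syslog lines.
    """
    events = []
    for line in lines:
        match = NFS_TIMEOUT.search(line)
        if match:
            # detail is the NFS server address that stopped responding
            events.append(("nfs_timeout", match.group(1)))
            continue
        match = HB_REBOOT.search(line)
        if match:
            # detail is how long the heartbeat file was unreachable
            events.append(("heartbeat_reboot", int(match.group(2))))
    return events
```

Running this over the syslog of each host (for example by passing an open file object) should show whether every reboot was preceded by an NFS timeout against the CTDB virtual IP.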

Any help will be appreciated.

Regards,
Tejas