Posted to users@cloudstack.apache.org by Rohan T <ro...@gmail.com> on 2016/07/12 00:33:19 UTC

Is com.cloud.hypervisor.kvm.resource.KVMHAChecker used by CloudStack?

Hi All,

Having been smashed by the unexpected behaviour of the KVM Heartbeat / HA
process, we've been working through the logic of the process, and I now
believe the intent of the process is summarised by:


=================
The heartbeat process consists of 3 parts:

1. A shell script that's distributed to each of the hypervisors during the
CloudStack installation process:
/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh
2. Two Java classes, built into CloudStack:
com.cloud.hypervisor.kvm.resource.KVMHAMonitor
com.cloud.hypervisor.kvm.resource.KVMHAChecker

Behaviour

Each of the classes periodically calls the kvmheartbeat.sh script with
different arguments. The script is used to confirm the existence of NFS
mounts, remount any that are missing, clean up (i.e. kill) VMs in an
indeterminate state, read and write heartbeats to NFS volumes, and force the
host hypervisor to reboot (as part of a "shoot the node in the head"
approach to restoring sanity to the cluster).

The KVMHAMonitor class writes a timestamp to each of the NFS volumes
(pools) each minute. If this process times out (4 times), it then calls the
script once more to force a spontaneous reboot of the host (via: echo b >
/proc/sysrq-trigger).
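
As a rough illustration of the monitor behaviour described above, here is a
minimal Python sketch. It is not the actual KVMHAMonitor code: the function
name and the two callbacks (write_heartbeat, fence_host) are hypothetical
stand-ins for the calls into kvmheartbeat.sh, and the retry count of 4 is
taken from the description above.

```python
from typing import Callable, List

MAX_ATTEMPTS = 4  # assumption: "times out (4 times)" in the description above


def heartbeat_cycle(write_heartbeat: Callable[[str], bool],
                    fence_host: Callable[[str], None],
                    pools: List[str],
                    max_attempts: int = MAX_ATTEMPTS) -> List[str]:
    """Run one heartbeat cycle and return the pools whose writes never succeeded.

    write_heartbeat(pool) is a hypothetical callback that returns True when
    the timestamp was written; fence_host(pool) is a hypothetical stand-in
    for the script's forced reboot.
    """
    failed = []
    for pool in pools:
        # any() stops at the first successful attempt, so a healthy pool
        # costs exactly one write.
        ok = any(write_heartbeat(pool) for _ in range(max_attempts))
        if not ok:
            failed.append(pool)
            fence_host(pool)
    return failed
```

In this sketch a single pool that fails all of its attempts is enough to
trigger the fencing callback, which matches the reboots I've been seeing.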

The KVMHAChecker is responsible for triggering the script to read the
heartbeat value and compare with the current timestamp. Where ALL NFS
volumes are determined to be "DEAD" (i.e. timestamp is older than 60
seconds),

================
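
To make the checker part of the summary concrete, here is a minimal Python
sketch of the staleness test as I understand it. This is an illustration
only, not the KVMHAChecker source: the function names are hypothetical, and
the 60 second threshold comes from the summary above. It only classifies
pools; it says nothing about what happens once all pools are "DEAD".

```python
from typing import Dict

HEARTBEAT_TIMEOUT_S = 60  # from the summary: older than 60 seconds is "DEAD"


def pool_is_dead(last_heartbeat: int, now: int,
                 timeout_s: int = HEARTBEAT_TIMEOUT_S) -> bool:
    """A pool is dead when its last heartbeat is older than the timeout."""
    return now - last_heartbeat > timeout_s


def all_pools_dead(heartbeats: Dict[str, int], now: int,
                   timeout_s: int = HEARTBEAT_TIMEOUT_S) -> bool:
    """True only when every pool's heartbeat is stale.

    An empty mapping is treated as "no evidence" rather than "all dead"
    (an assumption made for this sketch).
    """
    return bool(heartbeats) and all(
        pool_is_dead(ts, now, timeout_s) for ts in heartbeats.values()
    )
```

For example, with now=170, heartbeats of 100 and 130 give one stale pool and
one live pool, so all_pools_dead() returns False.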

Is my understanding correct?

The problem is, when testing this logic in my test lab (currently 4.4.4,
but there have been no significant updates committed to these files since),
I've been unable to see any evidence of the KVMHAChecker actually
executing! I see plenty of evidence of heartbeat writes (and of hypervisor
reboots triggered when this process times out).


Thanks,
Rohan

Re: Is com.cloud.hypervisor.kvm.resource.KVMHAChecker used by CloudStack?

Posted by Rohan T <ro...@gmail.com>.
Hi Ilya,


Glad to hear that there's work being done on the HA piece for KVM -
unfortunately until this hits 'production ready' we have to struggle along
with what we've got.

I'm well aware of what happens when NFS is disconnected (as this behaviour
was the trigger for us having to dig so deeply into the KVM HA behaviour).

I've attached the patch I've developed to help protect against the worst
excesses of the kvmheartbeat.sh (without totally disabling the reboots).
The essence of the change is that the script now checks for KVM (qemu)
processes using the mountpoint being checked, and only allows a host reboot
to be conducted when VMs are impacted.
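
The idea behind the change can be sketched in Python as below. To be clear,
this is not the attached patch (which modifies the shell script); the
function names are hypothetical, and the sketch assumes the process command
lines have already been collected, for example from /proc/<pid>/cmdline.

```python
import os
from typing import Iterable, List


def qemu_uses_mount(cmdlines: Iterable[List[str]], mountpoint: str) -> bool:
    """True if any qemu process references a path under the mountpoint.

    Simplification: a plain substring match on the arguments, so it only
    approximates "this VM has a disk on this pool".
    """
    prefix = mountpoint.rstrip("/") + "/"
    for argv in cmdlines:
        if not argv or "qemu" not in os.path.basename(argv[0]):
            continue  # not a qemu process
        # Disk paths typically sit inside options such as file=/path,if=...
        if any(prefix in arg for arg in argv[1:]):
            return True
    return False


def should_reboot(heartbeat_failed: bool,
                  cmdlines: Iterable[List[str]],
                  mountpoint: str) -> bool:
    """Allow the reboot only when the heartbeat failed AND VMs are impacted."""
    return heartbeat_failed and qemu_uses_mount(cmdlines, mountpoint)
```

With this guard, a host that loses an NFS pool with no running VMs on it is
left alone instead of being rebooted.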


Cheers,
Rohan

On Tue, Jul 12, 2016 at 1:15 PM, ilya <il...@gmail.com> wrote:

> Rohan
>
> As of now:
> Disconnect the primary NFS from your KVM and see what happens.
>
> In the future release:
>
> Also, the HA piece is being rewritten now. The specs were posted by John
> Burwell (and me to a smaller extent); if you search the CloudStack mailing
> lists via markmail.org for "KVM HA", you can see the thread with many
> details.
>
> In summary, we will be changing the behavior to something more precise -
> similar to how VMware does it.
>
> Example: hosts A, B and C are part of 1 cluster that uses common
> clustered storage.
>
> Host A hangs and halts the VMs' ability to write to disk (or crashes the
> VMs).
>
> CloudStack MS will retrieve the list of volumes used by VMs on host A and
> ask the neighbor host B to check when the last write was performed.
>
> If all VMs and their disks show no disk activity for a predefined
> interval (several intervals), CloudStack MS will use the IPMI interface to
> shoot the node in the head.
>
> This is a very high-level overview - there is a lot more to this, with
> many safeguards and tunable parameters.
>
> Regards
> ilya

Re: Is com.cloud.hypervisor.kvm.resource.KVMHAChecker used by CloudStack?

Posted by ilya <il...@gmail.com>.
Rohan

As of now:
Disconnect the primary NFS from your KVM and see what happens.

In the future release:

Also, the HA piece is being rewritten now. The specs were posted by John
Burwell (and me to a smaller extent); if you search the CloudStack mailing
lists via markmail.org for "KVM HA", you can see the thread with many
details.

In summary, we will be changing the behavior to something more precise -
similar to how VMware does it.

Example: hosts A, B and C are part of 1 cluster that uses common
clustered storage.

Host A hangs and halts the VMs' ability to write to disk (or crashes the
VMs).

CloudStack MS will retrieve the list of volumes used by VMs on host A and
ask the neighbor host B to check when the last write was performed.

If all VMs and their disks show no disk activity for a predefined
interval (several intervals), CloudStack MS will use the IPMI interface to
shoot the node in the head.
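
A minimal Python sketch of that decision, for illustration only. This is not
code from the spec: the function name, the per-interval activity flags and
the required_idle_intervals parameter are all hypothetical.

```python
from typing import Dict, List


def should_fence(samples: Dict[str, List[bool]],
                 required_idle_intervals: int) -> bool:
    """Decide whether host A should be fenced.

    samples maps each volume used by host A's VMs to a list of flags,
    oldest first, one per check interval: True if the neighbor host saw a
    write during that interval. Fence only if EVERY volume was idle for the
    last required_idle_intervals checks.
    """
    if required_idle_intervals <= 0:
        raise ValueError("required_idle_intervals must be positive")
    if not samples:
        return False  # no volumes means no evidence, so do not fence
    for activity in samples.values():
        recent = activity[-required_idle_intervals:]
        if len(recent) < required_idle_intervals or any(recent):
            return False  # too little history, or a recent write was seen
    return True
```

A single volume with recent write activity is enough to hold off fencing,
which is the main difference from the current heartbeat approach.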

This is a very high-level overview - there is a lot more to this, with
many safeguards and tunable parameters.

Regards
ilya

