Posted to issues@cloudstack.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/09/28 08:27:00 UTC

[jira] [Commented] (CLOUDSTACK-9397) Add Watchdog timer to KVM Instances

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183843#comment-16183843 ] 

ASF subversion and git services commented on CLOUDSTACK-9397:
-------------------------------------------------------------

Commit b130e55088ceb392d2c6ff1533c335882be1c9b5 in cloudstack's branch refs/heads/master from [~widodh]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=b130e55 ]

CLOUDSTACK-9397: Add Watchdog timer to KVM Instance (#1707)

The watchdog timer adds functionality where the Hypervisor can detect if an
instance has crashed or stopped functioning.

When the Instance has the 'watchdog' daemon running it will send heartbeats
to the /dev/watchdog device.

If these heartbeats are no longer received by the HV it will reset the Instance.

If the Instance never sends the heartbeats the HV does not take action. It only
takes action if it stops sending heartbeats.
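For illustration only, the guest side of this can be sketched in Python. This is not part of the commit and is not how the 'watchdog' daemon is implemented; the function name and its parameters are made up for this example. It relies on the standard Linux watchdog device behaviour: opening /dev/watchdog arms the timer, any write counts as a heartbeat, and writing 'V' right before closing asks the driver to disarm (if the driver allows it).

```python
import time


def send_heartbeats(device_path="/dev/watchdog", interval=10.0, count=3, disarm=True):
    """Write `count` heartbeats to the watchdog device, `interval` seconds apart.

    Hypothetical helper for illustration; a real guest runs the 'watchdog' daemon.
    """
    # Opening the device arms the timer. From here on the guest has to keep
    # writing, otherwise the hypervisor performs the configured action.
    with open(device_path, "wb", buffering=0) as dev:
        for i in range(count):
            dev.write(b"\0")  # any write is a heartbeat
            if i < count - 1:
                time.sleep(interval)
        if disarm:
            # "Magic close": tells the driver the close is intentional,
            # so the timer is stopped instead of firing later.
            dev.write(b"V")
```

A guest that never opens the device never arms the timer, which matches the behaviour described above: the HV only acts once heartbeats have started and then stop.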

This is supported since Libvirt 0.7.3 and can be defined in the XML format as
described in the docs: https://libvirt.org/formatdomain.html#elementsWatchdog

To the 'devices' section this will be added:

  <watchdog model='i6300esb' action='reset'/>

In the agent.properties the action to be taken can be defined:

  vm.watchdog.action=reset

The same goes for the model. The Intel i6300esb is however the most commonly used.

  vm.watchdog.model=i6300esb
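A rough sketch of how the two properties map onto the XML element, written in Python for brevity. The CloudStack agent itself is Java, and the function names here are hypothetical, not taken from the commit. The fallback values used when a key is missing are an assumption for this example and not necessarily the agent's real defaults.

```python
from xml.sax.saxutils import quoteattr


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def watchdog_element(props):
    """Render the libvirt <watchdog> element from agent.properties values."""
    # Fallbacks are assumptions for this sketch, taken from the example above.
    model = props.get("vm.watchdog.model", "i6300esb")
    action = props.get("vm.watchdog.action", "reset")
    # quoteattr() escapes the value and wraps it in quotes.
    return "<watchdog model=%s action=%s/>" % (quoteattr(model), quoteattr(action))


if __name__ == "__main__":
    sample = "vm.watchdog.action=reset\nvm.watchdog.model=i6300esb\n"
    print(watchdog_element(parse_properties(sample)))
```

The output uses double quotes around the attribute values, which is equivalent XML to the single-quoted form shown above.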

Signed-off-by: Wido den Hollander <wi...@widodh.nl>

> Add Watchdog timer to KVM Instances
> -----------------------------------
>
>                 Key: CLOUDSTACK-9397
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9397
>             Project: CloudStack
>          Issue Type: New Feature
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: KVM
>            Reporter: Wido den Hollander
>            Assignee: Wido den Hollander
>              Labels: kvm, libvirt, watchdog
>
> A Watchdog timer can be used by the hypervisor to verify if an Instance is still alive. If it is not, for example due to a kernel panic, the HV can reset the Instance so that it boots again.
> Inside the Instance the 'watchdog' daemon has to run to provide this. If the Watchdog is not running the HV can't verify if the Instance has crashed.
> This is supported by Libvirt and Qemu and can be configured in the XML: https://libvirt.org/formatdomain.html#elementsWatchdog



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)