Posted to users@cloudstack.apache.org by victor <vi...@ihnetworks.com> on 2018/03/07 17:01:42 UTC

KVM HostHA

Hello Guys,

I have installed CloudStack 4.11. I have enabled HA for each host I 
have added. I have also added IPMI successfully (using the ipmi driver). 
The hosts show the following.

=======

HA Enabled 	Yes
HA State 	Available
HA Provider 	kvmhaprovider

======

Also the host is showing the following correctly

Resource state --> Enabled
State --> UP
Power state --> On

So I shut down one of the hosts to see how KVM Host HA 
works. I waited for half an hour, but nothing happened. What 
will happen to the VMs on that host if the host fails to come back up? 
There isn't much in the logs.

Regards
Victor

RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
Could you upload your management server logs to pastebin.com or similar so that we can have a look, please?


Kind regards,

Paul Angus

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 




Re: KVM HostHA

Posted by Parth Patel <pa...@gmail.com>.
Hi Victor,

I had a similar failover requirement. I also went down the path of making
a KVM host HA-enabled, following the same steps you performed but in CS 4.9,
where the agent went into the "Alert" state but never into the "Down" state.
However, if your requirement is simply that an HA-enabled VM should be
restarted on another host when its host goes down in CS 4.11,
you don't need to make the KVM host HA.

How I replicated the failover scenario in CloudStack 4.11:
- Start an HA-enabled VM on a host.
- Unplug the host.
- Make sure at least one suitable host with enough resources is available.
- After ping-duration*ping-timeout (60*2.5 ~ 3.5 minutes), my CS 4.11
decides that the host is down and restarts the VM on another
host.
(NOTE: this assumes that your NFS or storage server is on another machine
and that you are not using local storage for the HA-enabled VM.)
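
The timing in the last step can be worked out with a few lines of Python. This is only a sketch: I am assuming that "ping-duration" and "ping-timeout" above correspond to the CloudStack global settings ping.interval (seconds) and ping.timeout (a multiplier), and the function name is made up for illustration. The nominal product is 150 seconds, so the roughly 3 minutes observed in practice presumably includes extra time for the management server to investigate the host and schedule the restart.

```python
def host_down_threshold_seconds(ping_interval_s: float = 60.0,
                                ping_timeout_multiplier: float = 2.5) -> float:
    """Nominal time without a ping before the host is treated as timed out.

    Both defaults are the values quoted in the thread (60 and 2.5); the
    mapping to ping.interval / ping.timeout is an assumption.
    """
    return ping_interval_s * ping_timeout_multiplier


threshold = host_down_threshold_seconds()
print(f"{threshold:.0f} s = {threshold / 60:.1f} min")
```

Lowering either value shortens the detection time, at the cost of more false alarms on a flaky management network.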

Your management server logs should show that host id: xxx has disconnected
with a ping timeout event, and after several of those messages the server should
decide that the host is down. If this is not the case, look for "insufficient
server capacity" and "cannot create deployment" messages in the server logs. If
none of the above matches your scenario, a look at the management server logs
would help.
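
A small helper can pull the relevant lines out of a large log. This is a minimal sketch, not an official tool: the regular expressions are loose guesses based on the message descriptions above, and the sample lines are hypothetical, written only to exercise the filter.

```python
import re
from typing import Iterable, List

# Loose, case-insensitive patterns for the messages described above.
# The exact wording in a real log may differ, so treat these as assumptions.
PATTERNS = [
    re.compile(r"ping\s*timeout", re.IGNORECASE),
    re.compile(r"insufficient\s*server\s*capacity", re.IGNORECASE),
    re.compile(r"create\s+(a\s+)?deployment", re.IGNORECASE),
]


def find_ha_clues(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the HA-related patterns."""
    return [line.rstrip("\n") for line in lines
            if any(p.search(line) for p in PATTERNS)]


# Hypothetical sample lines, not taken from a real log.
sample = [
    "Host 4 is disconnecting with event PingTimeout",
    "Found 1 VMs on host 4",
    "InsufficientServerCapacityException: Unable to create a deployment for VM",
]
for hit in find_ha_clues(sample):
    print(hit)
```

To scan a real log, pass an open file object instead of the sample list; on a typical installation the management server log is /var/log/cloudstack/management/management-server.log.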


Re: KVM HostHA

Posted by victor <vi...@ihnetworks.com>.
Hello Andrija,

Yes, I am doing the same test as you mentioned, i.e. unplugging the NIC on one of 
the hosts and observing what happens to the VMs on that host. But in my test 
the VMs didn't get started on another host.

Regards
Victor




Re: KVM HostHA

Posted by Andrija Panic <an...@gmail.com>.
Hi Victor,

I have zero experience with 4.11 in general, but what are you expecting to
happen?

You powered off a host, so there is nothing for the IPMI driver to do: the host
is already down, and no host HA actions are expected, AFAIK.

I guess you might have wanted to, e.g., unplug the NIC (cause network issues
on the MGMT network), or kill the agent service, and then observe the actions.

Were VMs started on another host, in your test?

Cheers




-- 

Andrija Panić

Re: KVM HostHA

Posted by victor <vi...@ihnetworks.com>.
Hello Guys,

I think it is related

==========
https://github.com/apache/cloudstack/pull/2474
===========


On 03/14/2018 02:05 PM, Jon Marshall wrote:
> Hi Paul,
>
> My testing does indeed end with the failed host in maintenance mode, but the VMs are never migrated. As I posted earlier, the management server seems to be saying that there is no other host the VM can be migrated to.
>
> A couple of questions, if you have the time to respond:
>
> 1) This article seems to suggest that rebooting or powering off a host will result in the VMs being migrated, and this was on CS v4.2.1 back in 2013, so does Host HA do something different?
>
> 2) Whenever one of my two nodes is taken down in testing, the active compute node's HA status goes from Available to Ineligible. Should this happen, i.e. is the change to Ineligible what stops the manager from migrating the VMs?
>
> Apologies for all the questions, but I just can't get this to work at the moment. If I do eventually get it working, I will do a write-up for others with the same issue :)
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 14 March 2018 07:45
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
>
> Hi Parth,
>
> To answer your questions: VM-HA does not restart VMs on an alternate host if the original host goes down. The management server (without Host-HA) cannot tell what happened to the host. It cannot tell whether there was a failure in the agent, a loss of connectivity to the management NIC, or whether the host is truly down. In the first two scenarios, the guest VMs can still be running perfectly well, and restarting them elsewhere would be very dangerous. Therefore, the correct thing to do is nothing except alert the operator. These scenarios are what Host-HA was introduced for.
>
> Regarding STONITH: if no disk activity is detected on the host, Host-HA will try to restart the host (via IPMI). If, after a configurable number of attempts, the host agent still does not check in, Host-HA will shut down the host (via IPMI), trigger VM-HA and mark the host as in maintenance.
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 14 March 2018 05:05
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> Hi Paul,
>
> Thanks for the clarification. I currently don't have IPMI-enabled hardware (in my test environment), but it would be helpful if you could clear up some basic concepts for me:
> - If HA-enabled VMs are automatically started on another host when the current host goes down, what is the need for or purpose of HA-host (other than the management server being able to remotely control its power interfaces)?
> - I understand the "Shoot-the-other-node-in-the-head" (STONITH) approach ACS uses to fence the host, but I couldn't find which mechanism or events trigger it.
>
> Thanks and regards,
> Parth Patel
>
> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:
>
>> The management server doesn't ping the host through IPMI.   However if
>> IPMI is not available, you will not be able to use Host HA, as there
>> is no way for CloudStack to 'fence' the host - that is shut it down to
>> be sure that a VM cannot start again on that host.
>>
>> I can explain why that is necessary if you wish.
>>
>>
>> Kind regards,
>>
>> Paul Angus
>>
>> -----Original Message-----
>> From: Parth Patel <pa...@gmail.com>
>> Sent: 13 March 2018 16:57
>> To: users@cloudstack.apache.org
>> Cc: Jon Marshall <jm...@hotmail.co.uk>
>> Subject: Re: KVM HostHA
>>
>> Hi Jon and Victor,
>>
>> I think the management server pings your host using IPMI (I really hope
>> this is not the case).
>> In my case, I did not have OOBM enabled at all (my hardware didn't support
>> it).
>> I think you could disable OOBM and/or HA-Host and give that a try :)
>>
>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>>
>>> Hello Guys,
>>>
>>> I have tried the following two cases.
>>>
>>> 1, "echo c > /proc/sysrq-trigger"
>>>
>>> 2, Pulled the network cable of one of the host
>>>
>>> In both cases, the following happened.
>>>
>>> =====
>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes of to disconnect
>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is disconnecting with event AgentDisconnected
>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already Alert
>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link for 4 with state Alert
>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
>>> =====
>>>
>>> But nothing happened for the VMs on that node. I have waited for one
>>> hour and the VMs on that node have been migrated to the other
>>> available hosts. I think the issue is that the management server still
>>> thinks that the VMs on that host are running. Please check the
>>> following logs.
>>>
>>> =======
>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not running on host 4
>>> ========
>>>
>>>
>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
>>>> I tried "echo c > /proc/sysrq-trigger", which stopped me getting into the
>>>> server, but it did not stop the server responding to an ipmitool
>>>> request from the manager, e.g.:
>>>>
>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
>>>>
>>>> Run from the management server, this got an answer saying the chassis
>>>> power was on, so CS never registered the compute node as down.
>>>>
>>>> I am obviously doing something wrong but cannot work it out.
>>>>
>>>> The management server has one NIC - 172.16.7.4
>>>>
>>>> Each compute node has 3 NICs -
>>>>
>>>>                          cnode1          cnode2
>>>> management NIC           172.16.7.5      172.16.7.6
>>>> vm NIC                   172.16.6.130    172.16.6.131
>>>> storage                  172.16.250.4    172.16.250.5
>>>>
>>>> Dell LOM (for iDRAC)     172.16.7.29     172.16.7.30
>>>>
>>>> The Dell LOM IPs are the ones used to configure OOBM in the UI.
>>>>
>>>> If I pull the storage NIC, presumably nothing will happen, as the
>>>> ipmitool check is running across the management NIC, so do I need to
>>>> pull both?
>>>>
>>>> My understanding of host HA was that the management server monitored the
>>>> compute nodes using ipmitool and, if it did not get a response because
>>>> the host was down, it would fence off that host and move the VMs to an
>>>> active compute node.
>>>>
>>>> This is obviously too simplistic, so could someone explain how it is
>>>> meant to work and what it is protecting against?
>>>> ________________________________
>>>> From: Paul Angus <pa...@shapeblue.com>
>>>> Sent: 13 March 2018 07:01
>>>> To: users@cloudstack.apache.org
>>>> Subject: RE: KVM HostHA
>>>>
>>>> Hi all,
>>>>
>>>> One small note: unplugging the management NIC will only cause an HA
>>>> event if the storage is also running over that NIC.
>>>>
>>>> If the storage is over a separate NIC, then the guest VMs will
>>>> continue to run when the mgmt. NIC is unplugged. Host HA will detect
>>>> the disk activity and conclude that there is nothing it can do, as the
>>>> VMs are still running, other than mark the host as degraded.
>>>>
>>>> Kind regards,
>>>>
>>>> Paul Angus
>>>>
>>>> -----Original Message-----
>>>> From: Parth Patel <pa...@gmail.com>
>>>> Sent: 12 March 2018 17:35
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: KVM HostHA
>>>>
>>>>> Hi Jon,
>>>>>
>>>>> As I said, in my case making the host HA didn't work, but just by
>>>>> having an HA VM running on the host and executing (WARNING) "echo c >
>>>>> /proc/sysrq-trigger" to simulate a kernel crash on the host, the
>>>>> management server registered it as down and started the VM on another
>>>>> host. I know I've suggested this before, but I'd urge you to give it a
>>>>> try. Also, you don't need to power off the machine manually;
>>>>> just unplugging the network cable works fine. The cloudstack
>>>>> agent, after losing its connection to the management server, reboots
>>>>> automatically because of the KVM heartbeat check shell script that
>>>>> Rohit Yadav mentioned in reply to one of my earlier queries in another
>>>>> thread.
>>>>>
>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
>>>>> Hi Paul
>>>>>
>>>>> Thanks for the response.
>>>>>
>>>>> I think I am not understanding how it was meant to work then. My
>>>>> understanding was that the manager used ipmitool to just keep querying
>>>>> the compute nodes as to their status so I assumed it didn't matter how
>>>>> you shut the node down, once it was down the manager would get no
>>>>> response and mark it as down (which it does).
>>>>>
>>>>> I am in testing mode so I think I will just go and pull the power and
>>>>> see what happens :)
>>>>>
>>>>> Thanks
>>>>>
>>>>> Jon
>>>>>
>>>>>
>>>>> ________________________________
>>>>> From: Paul Angus <pa...@shapeblue.com>
>>>>> Sent: 12 March 2018 15:31
>>>>> To: users@cloudstack.apache.org
>>>>> Subject: RE: KVM HostHA
>>>>>
>>>>> Hi Jon,
>>>>>
>>>>> I think that what you guys are finding is that a controlled host
>>>>> shutdown, which causes the agent to shut down cleanly, is not
>>>>> considered an HA event. I wouldn't expect CloudStack to take any
>>>>> action if you shut down a host, only if the host (agent) stops
>>>>> responding.
>>>>>
>>>>>
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Paul Angus
>>>>>
>>>>> -----Original Message-----
>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
>>>>> Sent: 12 March 2018 15:15
>>>>> To: users@cloudstack.apache.org
>>>>> Subject: Re: KVM HostHA
>>>>>
>>>>> I have the same issue here and am not entirely sure what the behaviour
>>>>> should be.
>>>>>
>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi
>>>>> working correctly.
>>>>>
>>>>> From the UI under HA -
>>>>>
>>>>> HA Enabled Yes
>>>>> HA State Available
>>>>> HA Provider kvmhaprovider
>>>>>
>>>>> although interestingly the "Details" tab shows -
>>>>>
>>>>> HA enabled No
>>>>>
>>>>> which I assume is a cosmetic issue?
>>>>>
>>>>> On each compute node I have one HA-enabled VM and one non-HA-enabled VM.
>>>>>
>>>>> I power off a compute node and the UI updates the host status and the
>>>>> VMs on that node stop responding, but they never fail over to the other
>>>>> node.
>>>>>
>>>>> A couple of things I noticed -
>>>>>
>>>>> 1) As soon as I power off the compute node, the HA state on the other
>>>>> node shows "Ineligible".
>>>>>
>>>>> 2) In the UI the instances all still show as green even though two of
>>>>> them are not available.
>>>>>
>>>>> Any help much appreciated
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ________________________________
>>>>> From: victor <vi...@ihnetworks.com>
>>>>> Sent: 07 March 2018 17:01
>>>>> To: users@cloudstack.apache.org
>>>>> Subject: KVM HostHA
>>>>>
>>>>> Hello Guys,
>>>>>
>>>>> I have installed cloudstack 4.11. I have enabled HA for each hosts I
>>> have
>>>>> added. I have also added ipmi successfully (using ipmi driver).
>>>>> The hosts are showing like the following.
>>>>>
>>>>> =======
>>>>>
>>>>> HA Enabled Yes
>>>>> HA State Available
>>>>> HA Provider kvmhaprovider
>>>>>
>>>>> ======
>>>>>
>>>>> Also the host is showing the following correctly
>>>>>
>>>>> Resource state --> Enabled
>>>>> State --> UP
>>>>> Power state --> On
>>>>>
>>>>> So I have shutdown one of the hosts to see how the KVM hosts Ha is
>>>>> working. I have waited for half an hour. But nothing has happened.
>>>>> What will happen to the VM's in that host, if the host failed to back up.
>>>>> There isn't much from logs.
>>>>>
>>>>> Regards
>>>>> Victor
>>>>>
>>>


Re: KVM HostHA

Posted by ilya musayev <il...@gmail.com>.
KVM Host HA was developed to bring CloudStack closer to parity with how
VMware vSphere handles its HA.

Basically, in the old model there were several corner cases where CloudStack
did not know whether the hypervisor had crashed or had just lost
connectivity to the management server.

We’ve added logic to make sure that the hypervisor is truly dead. Only then
can we choose to use IPMI to power off or reboot the hypervisor and bring up
the guest VMs elsewhere.

There is a lot more that goes on behind the scenes - Rohit did a talk about
it and it should be available on YouTube.
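
As a rough illustration of the sequence described further down this thread
(health check, then activity check, then IPMI recovery, then fencing), here
is a minimal Python sketch. It is not CloudStack code: `decide`, `Outcome`,
the four callables and `max_recovery_attempts` are all hypothetical names
made up for this example, and the real implementation has more states and
timeouts than this.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    HEALTHY = "healthy"      # agent answered the health check
    DEGRADED = "degraded"    # agent unreachable, but VMs still write to disk
    RECOVERED = "recovered"  # an IPMI reset brought the agent back
    FENCED = "fenced"        # powered off via IPMI; VM-HA can restart VMs


def decide(health_check_ok: Callable[[], bool],
           disk_activity_seen: Callable[[], bool],
           ipmi_reset_and_wait: Callable[[], bool],
           ipmi_power_off: Callable[[], None],
           max_recovery_attempts: int = 3) -> Outcome:
    """Hypothetical model of the Host HA decision flow from this thread."""
    # 1. Health check: if the host answers, nothing else is done.
    if health_check_ok():
        return Outcome.HEALTHY
    # 2. Activity check: VMs still writing to shared storage are alive,
    #    so restarting them elsewhere would be dangerous.
    if disk_activity_seen():
        return Outcome.DEGRADED
    # 3. Recovery: try to restart the host through IPMI a limited number
    #    of times (the attempt count is an assumed, configurable value).
    for _ in range(max_recovery_attempts):
        if ipmi_reset_and_wait():
            return Outcome.RECOVERED
    # 4. Fencing: power the host off so its VMs cannot come back,
    #    after which the HA-enabled VMs can be started on another host.
    ipmi_power_off()
    return Outcome.FENCED
```

In this model a host whose management NIC is unplugged while storage runs
over a separate NIC ends up as `DEGRADED`, because the activity check still
sees disk writes. That matches the behaviour Paul describes in the quoted
messages below.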

On Wed, Mar 14, 2018 at 7:36 AM Parth Patel <pa...@gmail.com>
wrote:

> Hi Paul and Andrija,
>
> I don't know how the Host-HA feature works, but my ACS 4.11 does what Paul
> explained even without host HA or ipmi access. As I stated earlier multiple
> times, without host HA and ipmi, my HA-enabled VMs executing on a normal
> host get restarted on another suitable host in the cluster after
> approximately 3 minutes of event ping timeout. After that, the cloudstack
> agent, which has no connection to the management server because of the
> unplugged NIC (all my machines currently have only one NIC / whole zone is
> in a flat network), reboots itself (the reason was explained by Rohit in
> another thread). The management server marks the host down and only
> HA-enabled VMs executing on it get restarted on another host (without any
> mention of host HA or ipmi or fencing in management server logs) while
> normal VMs executing on it are stopped.
>
> I don't know if this was a desired outcome, but I think my current ACS 4.11
> installation has features (at least performs some ;) provided by Host HA
> without configuring it or ipmi.
>
> Regards,
> Parth Patel
>
> On Wed 14 Mar, 2018, 18:41 Boris Stoyanov, <bo...@shapeblue.com>
> wrote:
>
> > yes, KVM + NFS shared storage.
> >
> > Boris.
> >
> >
> > boris.stoyanov@shapeblue.com
> >
> >
> >
> > > On 14 Mar 2018, at 14:51, Andrija Panic <an...@gmail.com>
> wrote:
> > >
> > > Hi Boris,
> > >
> > > ok thanks for the explanation - that makes sense, and covers my
> > > "exception case" that I have.
> > >
> > > This is atm only available for NFS as I could read (KVM on NFS) ?
> > >
> > > Cheers
> > >
> > > On 14 March 2018 at 13:02, Boris Stoyanov <
> boris.stoyanov@shapeblue.com>
> > > wrote:
> > >
> > >> Hi Andrija,
> > >>
> > >> There are two types of checks Host-HA does to determine if a host is
> > >> healthy.
> > >>
> > >> 1. Health checks - pings the host as soon as there are connection issues
> > >> with the agent
> > >>
> > >> If that fails,
> > >>
> > >> 2. Activity checks - checks if there are any write operations on the
> > >> disks of the VMs that are running on the host. This is to determine if
> > >> the VMs are actually alive and executing processes. Only if no disk
> > >> operations are executed on the shared storage does it try to recover the
> > >> host with an IPMI call. If that eventually fails, it migrates the VMs to
> > >> a healthy host and fences the faulty one.
> > >>
> > >> Hope that explains your case.
> > >>
> > >> Boris.
> > >>
> > >>
> > >> boris.stoyanov@shapeblue.com
> > >>
> > >>
> > >>
> > >>> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com>
> > wrote:
> > >>>
> > >>> Hi Paul,
> > >>>
> > >>> sorry to jump in the middle of the thread, but I'm just curious about
> > >>> the idea behind host-HA and why it behaves the way you explained above:
> > >>>
> > >>>
> > >>> Would it make more sense (or not?) that when MGMT detects that the
> > >>> agent or the host is unreachable (or after an unsuccessful agent
> > >>> restart, etc..., to be defined), it actually uses IPMI to STONITH the
> > >>> node, thus making sure no VMs are running, and only then starts all
> > >>> HA-enabled VMs on other hosts?
> > >>>
> > >>> I'm just trying to draw a parallel to corosync/pacemaker as a
> > >>> clustering suite/service in Linux (RHEL and others), where when a
> > >>> majority of nodes detect that one node is down, a common thing
> > >>> (especially for shared storage) is to STONITH that node, make sure it's
> > >>> down, then move the "resource" (in our case VMs) to other cluster nodes?
> > >>>
> > >>> I see it's actually a much broader setup per
> > >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but
> > >>> again - the whole idea (in my head at least...) is that when a host goes
> > >>> down, we make sure it's down (avoid VM corruption by doing STONITH on
> > >>> that node) and then start HA VMs on other hosts.
> > >>>
> > >>> I understand there might be exceptions such as the one I have right now
> > >>> (4.8) - libvirt gets stuck (librbd exception or similar) so the agent
> > >>> gets disconnected, but VMs are still running fine... (except the DB
> > >>> gets messed up, all NICs lose isolation_uri, VRs lose MAC addresses
> > >>> and other IP addresses etc...)
> > >>>
> > >>>
> > >>> Thanks
> > >>> Andrija
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
> > >>>
> > >>>> That would make sense.
> > >>>>
> > >>>>
> > >>>> I have another server being used for something else at the moment so
> > >>>> I will add that in and update this thread when I have tested
> > >>>>
> > >>>>
> > >>>> Jon
> > >>>>
> > >>>>
> > >>>> ________________________________
> > >>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>> Sent: 14 March 2018 09:16
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: RE: KVM HostHA
> > >>>>
> > >>>> I'd need to do some testing, but I suspect that your problem is that
> > >>>> you only have two hosts.  At the point that one host is deemed out of
> > >>>> service, you only have one host left.  With only one host, CloudStack
> > >>>> will show the cluster as ineligible.
> > >>>>
> > >>>> It is extremely common for any system working as a cluster to require
> > >>>> a minimum starting point of 3 nodes to be able to function.
> > >>>>
> > >>>>
> > >>>> Kind regards,
> > >>>>
> > >>>> Paul Angus
> > >>>>
> > >>>> paul.angus@shapeblue.com
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Jon Marshall <jm...@hotmail.co.uk>
> > >>>> Sent: 14 March 2018 08:36
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: Re: KVM HostHA
> > >>>>
> > >>>> Hi Paul
> > >>>>
> > >>>>
> > >>>> My testing does indeed end up with the failed host in maintenance mode
> > >>>> but the VMs are never migrated. As I posted earlier the management
> > >>>> server seems to be saying there is no other host that the VM can be
> > >>>> migrated to.
> > >>>>
> > >>>>
> > >>>> Couple of questions if you have the time to respond -
> > >>>>
> > >>>>
> > >>>> 1) this article seems to suggest a reboot or powering off a host will
> > >>>> result in the VMs being migrated, and this was on CS v 4.2.1 back in
> > >>>> 2013, so does Host HA do something different?
> > >>>>
> > >>>>
> > >>>> 2) Whenever one of my two nodes is taken down in testing the active
> > >>>> compute node's HA status goes from Available to Ineligible. Should this
> > >>>> happen, i.e. is it going to Ineligible that stops the manager from
> > >>>> migrating the VMs?
> > >>>>
> > >>>>
> > >>>> Apologies for all the questions but I just can't get this to work at
> > >>>> the moment. If I do eventually get it working I will do a write up for
> > >>>> others with the same issue :)
> > >>>>
> > >>>>
> > >>>> ________________________________
> > >>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>> Sent: 14 March 2018 07:45
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: RE: KVM HostHA
> > >>>>
> > >>>> Hi Parth,
> > >>>>
> > >>>> To answer your questions, VM-HA does not restart VMs on an alternate
> > >>>> host if the original host goes down.  The management server (without
> > >>>> host-HA) cannot tell what happened to the host.  It cannot tell if
> > >>>> there was a failure in the agent, loss of connectivity to the
> > >>>> management NIC or if the host is truly down.  In the first two
> > >>>> scenarios, the guest VMs can still be running perfectly well, and to
> > >>>> restart them elsewhere would be very dangerous.  Therefore, the correct
> > >>>> thing to do is nothing but alert the operator.  These scenarios are
> > >>>> what Host-HA was introduced for.
> > >>>>
> > >>>> Wrt STONITH, if no disk activity is detected on the host, host-HA will
> > >>>> try to restart the host (via IPMI). If, after a configurable number of
> > >>>> attempts, the host agent still does not check in, then host-HA will
> > >>>> shut down the host (via IPMI), trigger VM-HA and mark the host as
> > >>>> in-maintenance.
> > >>>>
> > >>>>
> > >>>>
> > >>>> paul.angus@shapeblue.com
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Parth Patel <pa...@gmail.com>
> > >>>> Sent: 14 March 2018 05:05
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: Re: KVM HostHA
> > >>>>
> > >>>> Hi Paul,
> > >>>>
> > >>>> Thanks for the clarification. I currently don't have IPMI-enabled
> > >>>> hardware (in my test environment), but it would be helpful if you could
> > >>>> help me clear up some basic concepts:
> > >>>> - If HA-enabled VMs are autostarted on another host when the current
> > >>>> host goes down, what is the need or purpose of HA-host? (other than the
> > >>>> management server being able to remotely control its power interfaces)
> > >>>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH)
> > >>>> approach ACS uses to fence the host, but I couldn't find what mechanism
> > >>>> or events trigger this?
> > >>>>
> > >>>> Thanks and regards,
> > >>>> Parth Patel
> > >>>>
> > >>>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com>
> > >> wrote:
> > >>>>
> > >>>>> The management server doesn't ping the host through IPMI.   However,
> > >>>>> if IPMI is not available, you will not be able to use Host HA, as
> > >>>>> there is no way for CloudStack to 'fence' the host - that is, shut it
> > >>>>> down to be sure that a VM cannot start again on that host.
> > >>>>>
> > >>>>> I can explain why that is necessary if you wish.
> > >>>>>
> > >>>>>
> > >>>>> Kind regards,
> > >>>>>
> > >>>>> Paul Angus
> > >>>>>
> > >>>>> paul.angus@shapeblue.com
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: Parth Patel <pa...@gmail.com>
> > >>>>> Sent: 13 March 2018 16:57
> > >>>>> To: users@cloudstack.apache.org
> > >>>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
> > >>>>> Subject: Re: KVM HostHA
> > >>>>>
> > >>>>> Hi Jon and Victor,
> > >>>>>
> > >>>>> I think the management server pings your host using ipmi (I really
> > >>>>> hope this is not the case).
> > >>>>> In my case, I did not have OOBM enabled at all (my hardware didn't
> > >>>>> support it).
> > >>>>> I think you could disable OOBM and/or HA-Host and give that a try :)
> > >>>>>
> > >>>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> > >>>>>
> > >>>>>> Hello Guys,
> > >>>>>>
> > >>>>>> I have tried the following two cases.
> > >>>>>>
> > >>>>>> 1, "echo c > /proc/sysrq-trigger"
> > >>>>>>
> > >>>>>> 2, Pulled the network cable of one of the host
> > >>>>>>
> > >>>>>> In both cases, the following happened.
> > >>>>>>
> > >>>>>> =====
> > >>>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > >>>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> > >>>>>> nodes of to disconnect
> > >>>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > >>>>>> disconnecting with event AgentDisconnected
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > >>>>>> Alert
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering
> > >>>>>> link for 4 with state Alert
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > >>>>>> =====
> > >>>>>>
> > >>>>>> But nothing happened for the  vm's in that node. I have waited for
> > >>>>>> one hour and the VM's in that node has been migrated to the other
> > >>>>>> available hosts. I think the issue is that the management server
> > >>>>>> still thinks that the VM's in that host is running. Please check
> > >>>>>> the following logs
> > >>>>>>
> > >>>>>> =======
> > >>>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on
> > >>>>>> host 4
> > >>>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > >>>>>> running on host 4
> > >>>>>> ========
> > >>>>>>
> > >>>>>>
> > >>>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > >>>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting
> > >>>>>>> into the server but it did not stop the server responding to an
> > >>>>>>> ipmitool request on the manager eg -
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> > >>>>>>> status"
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> from the management server got an answer saying the chassis power
> > >>>>>>> was on so CS never registered the compute node as down.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I am obviously doing something wrong but cannot work it out.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> The management server has one NIC - 172.16.7.4
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Each compute node has 3 NICs -
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>                         cnode1          cnode2
> > >>>>>>>
> > >>>>>>> management NIC          172.16.7.5      172.16.7.6
> > >>>>>>> vm NIC                  172.16.6.130    172.16.6.131
> > >>>>>>> storage                 172.16.250.4    172.16.250.5
> > >>>>>>> Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> If I pull the storage NIC presumably nothing will happen as the
> > >>>>>>> ipmitool check is running across the management NIC so I need to
> > >>>>>>> pull both ?
> > >>>>>>>
> > >>>>>>> My understanding of host HA was the management server monitored
> > >>>>>>> the compute nodes using ipmitool and if it did not get a response
> > >>>>>>> because the host was down it would fence off that host and move
> > >>>>>>> the VMs to an active compute node.
> > >>>>>>>
> > >>>>>>> This is obviously too simplistic so could someone explain how it
> > >>>>>>> is meant to work and what it is protecting against ?
> > >>>>>>>
> > >>>>>>> ________________________________
> > >>>>>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>>>>> Sent: 13 March 2018 07:01
> > >>>>>>> To: users@cloudstack.apache.org
> > >>>>>>> Subject: RE: KVM HostHA
> > >>>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> One small note, unplugging the management NIC will only cause an
> > >>>>>>> HA event if the storage is running over that NIC also.
> > >>>>>>>
> > >>>>>>> If the storage is over a separate NIC, then the guest VMs will
> > >>>>>>> continue to run when the mgmt. NIC is unplugged. Host HA will
> > >>>>>>> detect the disk activity and conclude that there is nothing it
> > >>>>>>> can do, as the VMs are still running, other than mark the host as
> > >>>>>>> degraded.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Kind regards,
> > >>>>>>>
> > >>>>>>> Paul Angus
> > >>>>>>>
> > >>>>>>> paul.angus@shapeblue.com
> > >>>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: Parth Patel <pa...@gmail.com>
> > >>>>>>> Sent: 12 March 2018 17:35
> > >>>>>>> To: users@cloudstack.apache.org
> > >>>>>>> Subject: Re: KVM HostHA
> > >>>>>>>
> > >>>>>>>> Hi Jon,
> > >>>>>>>>
> > >>>>>>>> As I said, in my case, making the host HA didn't work but by just
> > >>>>>>>> having a HA VM running on host and executing - (WARNING)
> > >>>>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on host,
> > >>>>>>>> the management server registered it as down and started the VM on
> > >>>>>>>> another host. I know I've suggested this before but I insist you
> > >>>>>>>> give this a try. Also, you don't need to completely power off the
> > >>>>>>>> machine manually but just plugging out the network cable works
> > >>>>>>>> fine. The cloudstack agent after losing connection to management
> > >>>>>>>> server auto reboots because of KVM heartbeat check shell script
> > >>>>>>>> mentioned by Rohit Yadav to one of my earlier queries in other
> > >>>>>>>> thread.
> > >>>>>>>>
> > >>>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jms.123@hotmail.co.uk> wrote:
> > >>>>>>>> Hi Paul
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks for the response.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I think I am not understanding how it was meant to work then. My
> > >>>>>>>> understanding was that the manager used ipmitool to just keep
> > >>>>>>>> querying the compute nodes as to their status so I assumed it
> > >>>>>>>> didn't matter how you shut the node down, once it was down the
> > >>>>>>>> manager would get no response and mark it as down (which it does).
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I am in testing mode so I think I will just go and pull the power
> > >>>>>>>> and see what happens :)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Jon
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> ________________________________
> > >>>>>>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>>>>>> Sent: 12 March 2018 15:31
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: RE: KVM HostHA
> > >>>>>>>> Hi Jon,
> > >>>>>>>>
> > >>>>>>>> I think that what you guys are finding is that a controlled host
> > >>>>>>>> shutdown, which will cause the agent to shut down cleanly, is not
> > >>>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
> > >>>>>>>> action if you shut down a host, only if the host (agent) stops
> > >>>>>>>> responding.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Kind regards,
> > >>>>>>>>
> > >>>>>>>> Paul Angus
> > >>>>>>>>
> > >>>>>>>> paul.angus@shapeblue.com
> > >> is a
> > >>>> framework developed by ShapeBlue to deliver the rapid deployment of
> a
> > >>>> standardised ...
> > >>>>
> > >>>>
> > >>>>
> > >>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>> [http://www.shapeblue.com/wp-content/uploads/2017/06/logo.png]<
> > >>>> http://www.shapeblue.com/>
> > >>>>
> > >>>> Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
> > >>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>> Rapid deployment framework for Apache CloudStack IaaS Clouds.
> CSForge
> > >> is a
> > >>>> framework developed by ShapeBlue to deliver the rapid deployment of
> a
> > >>>> standardised ...
> > >>>>
> > >>>>
> > >>>>
> > >>>>>>> Rapid deployment framework for Apache CloudStack IaaS Clouds.
> > >>>>>>> CSForge
> > >>>>> is
> > >>>>>> a framework developed by ShapeBlue to deliver <
> > >>>>>
> > https://maps.google.com/?q=framework+developed+by+ShapeBlue+to+deliver
> > >>>>> &entry=gmail&source=g
> > >>>>>>
> > >>>>>> the rapid deployment of a standardised ...
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> Rapid deployment framework for Apache CloudStack IaaS Clouds.
> > >>>> CSForge
> > >>>>>>>> is a framework developed by ShapeBlue to deliver the rapid
> > >>>> deployment
> > >>>>>>>> of a standardised ...
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
> > >>>>>>>> Sent: 12 March 2018 15:15
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: Re: KVM HostHA
> > >>>>>>>>
> > >>>>>>>> I have the same issue here and am not entirely sure what the
> > >>>> behaviour
> > >>>>>>>> should be.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I have one manager node and 2 compute nodes running 4.11 with
> ipmi
> > >>>>>> working
> > >>>>>>>> correctly.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> From the UI under HA -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> HA Enabled Yes
> > >>>>>>>> HA State Available
> > >>>>>>>> HA Provider kvmhaprovider
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> although interestingly from the "Details" tab it shows -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> HA enabled No
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> which I assume is a cosmetic issue ?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On each compute node I have one HA enabled VM and one non HA
> > enabled
> > >>>>> VM.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I power off a compute node and the UI updates the host status
> and
> > >>>> the
> > >>>>>> VMs
> > >>>>>>>> on that node stop responding but they never fail over to the
> other
> > >>>>> node.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Couple of things I noticed -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 1) as soon as i power off the compute node the HA state on the
> > other
> > >>>>>> node
> > >>>>>>>> shows "Ineligible"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2) In the UI the instances all still show as green even though
> two
> > >>>> of
> > >>>>>> them
> > >>>>>>>> are not available
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Any help much appreciated
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> ________________________________
> > >>>>>>>> From: victor <vi...@ihnetworks.com>
> > >>>>>>>> Sent: 07 March 2018 17:01
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: KVM HostHA
> > >>>>>>>>
> > >>>>>>>> Hello Guys,
> > >>>>>>>>
> > >>>>>>>> I have installed cloudstack 4.11. I have enabled HA for each
> > hosts I
> > >>>>>> have
> > >>>>>>>> added. I have also added ipmi successfully (using ipmi driver).
> > >>>>>>>> The hosts are showing like the following.
> > >>>>>>>>
> > >>>>>>>> =======
> > >>>>>>>>
> > >>>>>>>> HA Enabled Yes
> > >>>>>>>> HA State Available
> > >>>>>>>> HA Provider kvmhaprovider
> > >>>>>>>>
> > >>>>>>>> ======
> > >>>>>>>>
> > >>>>>>>> Also the host is showing the following correctly
> > >>>>>>>>
> > >>>>>>>> Resource state --> Enabled
> > >>>>>>>> State --> UP
> > >>>>>>>> Power state --> On
> > >>>>>>>>
> > >>>>>>>> So I have shutdown one of the hosts to see how the KVM hosts Ha
> is
> > >>>>>>>> working. I have waited for half an hour. But nothing has
> happened.
> > >>>>> What
> > >>>>>>>> will happen to the VM's in that host, if the host failed to back
> > up.
> > >>>>>>>> There isn't much from logs.
> > >>>>>>>>
> > >>>>>>>> Regards
> > >>>>>>>> Victor
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Andrija Panić
> > >>
> > >>
> > >
> > >
> > > --
> > >
> > > Andrija Panić
> >
> >
>

Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Parth


Thanks for that.


I am a beginner too when it comes to this.


Am currently rebuilding so will update this thread when I have retested


Jon


________________________________
From: Parth Patel <pa...@gmail.com>
Sent: 15 March 2018 14:37
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Jon,

I have to admit that I have a beginner/mediocre understanding of cloudstack
overall (especially the host HA feature). But what works for me should work
for everyone. So, to answer your questions:

> 1) how many compute nodes do you have
>

I have tested using three agents because, with only two nodes, the
management server deemed the node that was running the system VMs and router
unfit for migration and stopped the VM. I currently use one node for the
system VMs and router, and two agents (compute nodes, you could say), one of
which runs an HA-enabled VM while the other runs no VMs, as I only have
4 GB of RAM in each of those :| I use a fourth machine for the management
server and MySQL database, and a fifth machine purely for NFS. You can
easily set up management, MySQL and NFS on the same machine, though
(depending on your machine's configuration/capacity).

>
>
> 2) are you running basic or advanced networking
>

I am using basic (flat) networking where my management IP addresses range
from 172.16.4.131 to 172.16.4.137 and guest IP addresses range from
172.16.4.138 to 172.16.4.149. Both are on a /24 network.
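
As a quick sanity check on the addressing above, here is a minimal Python sketch using only the standard library. It assumes the /24 in question is 172.16.4.0/24 (inferred from the addresses quoted, not stated outright), and the helper name expand_range is made up for illustration. It only confirms that the two ranges sit in the same subnet and do not overlap.

```python
import ipaddress


def expand_range(first, last):
    """Return the set of IPv4 addresses from first to last, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    return {ipaddress.IPv4Address(i) for i in range(start, end + 1)}


# Assumption: the flat network is 172.16.4.0/24.
network = ipaddress.ip_network("172.16.4.0/24")

management = expand_range("172.16.4.131", "172.16.4.137")
guest = expand_range("172.16.4.138", "172.16.4.149")

# Every address must belong to the one flat subnet.
same_subnet = all(ip in network for ip in management | guest)

# The management and guest ranges must not share any address.
overlap = management & guest

print("same subnet:", same_subnet)
print("management addresses:", len(management))
print("guest addresses:", len(guest))
print("overlapping addresses:", sorted(overlap))
```

The same check is worth repeating if the ranges are ever widened, since an overlap between the management and guest ranges is easy to introduce by hand in a flat setup like this.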

>
> 3) how have you setup your NICs ie. on each compute node I have 3 separate
> NICs, one for management, one for the VMs and one for storage (NFS).
>

I only have 1 NIC per machine (the same one is used for all 3 types of
traffic). I have seen the management server use peer routing from other
agents to perform some operations in my XenCluster, but I highly doubt this
is the reason your management server does not mark a host as "Down" (as I
said, I don't know the internal workings of CloudStack; this is just a guess
from what I've seen in the management server logs). I suggest you unplug all
three NICs of a host to simulate my scenario.

>
>
> So far I have not managed to get any failover of VMs no matter what I try
>

I also recommend you update your qemu-kvm, NFS and other packages (there
has just been a recent update for CentOS 7) (again I know this is
superstitious but still, sometimes different package versions have been
known to be the root cause of the issue)

Side note: my ACS 4.11 agent auto reboots itself after it has retried
communicating with management server 4 times, at almost the exact same time
management server decides in its logs that the host and HA-enabled VM has
stopped executing and it restarts the HA-enabled VM on another host.


Hope this helps.

Regards,
Parth Patel.

>
>
> ________________________________
> From: Parth Patel <pa...@gmail.com>
> Sent: 14 March 2018 14:36
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> Hi Paul and Adrina,
>
> I don't know how the Host-HA feature functions, but what Paul explained,
> my ACS 4.11 does the same without even host HA or ipmi access. As I stated
> earlier multiple times, without host HA and ipmi, my HA-enabled VMs
> executing on a normal host get restarted on another suitable host in the
> cluster after approximately 3 minutes of event ping timeout. After that,
> the cloudstack agent, with no connection to the management server because
> of the unplugged NIC (all my machines currently have only one NIC / the
> whole zone is in a flat network), reboots itself (the reason was explained
> by Rohit in another thread). The management server marks the host down and
> only HA-enabled VMs executing on it get restarted on another host (without
> any mention of host HA or ipmi or fencing in management server logs) while
> normal VMs executing on it are stopped.
>
> I don't know if this was a desired outcome, but I think my current ACS 4.11
> installation has features (at least performs some ;) provided by Host HA
> without configuring it or ipmi.
>
> Regards,
> Parth Patel
>
> On Wed 14 Mar, 2018, 18:41 Boris Stoyanov, <bo...@shapeblue.com>
> wrote:
>
> > yes, KVM + NFS shared storage.
> >
> > Boris.
> >
> >
> > boris.stoyanov@shapeblue.com
> > www.shapeblue.com<http://www.shapeblue.com>
> > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > @shapeblue
> >
> >
> >
> > > On 14 Mar 2018, at 14:51, Andrija Panic <an...@gmail.com>
> wrote:
> > >
> > > Hi Boris,
> > >
> > > ok thanks for the explanation - that makes sense, and covers my
> > > "exception case" that I have.
> > >
> > > This is atm only available for NFS as I could read (KVM on NFS) ?
> > >
> > > Cheers
> > >
> > > On 14 March 2018 at 13:02, Boris Stoyanov <
> boris.stoyanov@shapeblue.com>
> > > wrote:
> > >
> > >> Hi Andrija,
> > >>
> > >> There are two types of checks Host-HA does to determine if a host is
> > >> healthy.
> > >>
> > >> 1. Health checks - pings the host as soon as there are connection issues
> > >> with the agent
> > >>
> > >> If that fails,
> > >>
> > >> 2. Activity checks - checks if there are any write operations on the
> > >> disks of the VMs that are running on the host. This is to determine if
> > >> the VMs are actually alive and executing processes. Only if no disk
> > >> operations are executed on the shared storage does it try to Recover
> > >> the host with an IPMI call; if that eventually fails, it migrates the
> > >> VMs to a healthy host and Fences the faulty one.
> > >>
> > >> Hope that explains your case.
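
The order of checks described above can be sketched roughly as follows. This is not CloudStack code: the names (Action, next_action, max_recovery_attempts) are hypothetical, and the logic only mirrors what is described in this thread, i.e. health check first, then the disk activity check, then a configurable number of IPMI recovery attempts, and only then fencing and restarting the HA VMs elsewhere.

```python
from enum import Enum


class Action(Enum):
    NONE = "host is healthy, nothing to do"
    DEGRADED = "VMs are still writing to shared storage, mark host degraded"
    RECOVER = "try to restart the host via IPMI"
    FENCE = "power the host off via IPMI, then restart HA VMs elsewhere"


def next_action(health_check_ok, disk_activity_seen,
                recovery_attempts, max_recovery_attempts):
    """Pick the next step for one host (illustrative only).

    health_check_ok:       the agent/host answered the health check
    disk_activity_seen:    VM disks on shared storage are still being written
    recovery_attempts:     IPMI restarts already tried for this incident
    max_recovery_attempts: hypothetical stand-in for the configurable limit
    """
    if health_check_ok:
        return Action.NONE
    if disk_activity_seen:
        # VMs are alive, so restarting them elsewhere would be dangerous.
        return Action.DEGRADED
    if recovery_attempts < max_recovery_attempts:
        return Action.RECOVER
    return Action.FENCE


if __name__ == "__main__":
    # Host unreachable, no disk activity, two restarts already tried out of two.
    print(next_action(False, False, 2, 2).value)
```

Read this way, a clean shutdown or a pulled management cable with storage on a separate NIC never reaches the FENCE branch on its own, which matches the behaviour reported in the tests in this thread.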
> > >>
> > >> Boris.
> > >>
> > >>
> > >> boris.stoyanov@shapeblue.com
> > >> www.shapeblue.com<http://www.shapeblue.com>
> > >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > >> @shapeblue
> > >>
> > >>
> > >>
> > >>> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com>
> > wrote:
> > >>>
> > >>> Hi Paul,
> > >>>
> > >>> sorry to jump in in the middle of the thread, but I'm just curious
> > >>> about the idea behind host-HA and why it behaves the way you explained
> > >>> above:
> > >>>
> > >>>
> > >>> Would it make more sense (or not?), when MGMT detects that the agent or
> > >>> the host is unreachable (or after an unsuccessful agent restart, etc.;
> > >>> to be defined), to actually use IPMI to STONITH the node, thus making
> > >>> sure no VMs are running, and then to really start all HA-enabled VMs on
> > >>> other hosts ?
> > >>>
> > >>> I'm just trying to draw a parallel to corosync/pacemaker as a clustering
> > >>> suite/service in Linux (RHEL and others), where, when the majority of
> > >>> nodes detect that one node is down, a common thing (especially for
> > >>> shared storage) is to STONITH that node, make sure it's down, then move
> > >>> the "resource" (in our case VMs) to other cluster nodes ?
> > >>>
> > >>> I see it's actually a much broader setup per
> > >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but
> > >>> again - the whole idea (in my head at least...) is that when a host goes
> > >>> down, we make sure it's down (avoid VM corruption by doing STONITH on
> > >>> that node) and then start HA VMs on other hosts.
> > >>>
> > >>> I understand there might be exceptions such as the one I have right now
> > >>> (4.8) - libvirt gets stuck (librbd exception or similar) so the agent
> > >>> gets disconnected, but VMs are still running fine... (except the DB gets
> > >>> messed up, all NICs lose isolation_uri, VRs lose MAC addresses and other
> > >>> IP addresses etc...)
> > >>>
> > >>>
> > >>> Thanks
> > >>> Andrija
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
> > >>>
> > >>>> That would make sense.
> > >>>>
> > >>>>
> > >>>> I have another server being used for something else at the moment
> so I
> > >>>> will add that in and update this thread when I have tested
> > >>>>
> > >>>>
> > >>>> Jon
> > >>>>
> > >>>>
> > >>>> ________________________________
> > >>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>> Sent: 14 March 2018 09:16
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: RE: KVM HostHA
> > >>>>
> > >>>> I'd need to do some testing, but I suspect that your problem is that
> > >>>> you only have two hosts.  At the point that one host is deemed out of
> > >>>> service, you only have one host left.  With only one host, CloudStack
> > >>>> will show the cluster as ineligible.
> > >>>>
> > >>>> It is extremely common for any system working as a cluster to require a
> > >>>> minimum starting point of 3 nodes to be able to function.
> > >>>>
> > >>>>
> > >>>> Kind regards,
> > >>>>
> > >>>> Paul Angus
> > >>>>
> > >>>> paul.angus@shapeblue.com
> > >>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > >>>> @shapeblue
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Jon Marshall <jm...@hotmail.co.uk>
> > >>>> Sent: 14 March 2018 08:36
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: Re: KVM HostHA
> > >>>>
> > >>>> Hi Paul
> > >>>>
> > >>>>
> > >>>> My testing does indeed end up with the failed host in maintenance
> mode
> > >> but
> > >>>> the VMs are never migrated. As I posted earlier the management
> server
> > >> seems
> > >>>> to be saying there is no other host that the VM can be migrated to.
> > >>>>
> > >>>>
> > >>>> Couple of questions if you have the time to respond -
> > >>>>
> > >>>>
> > >>>> 1) this article seems to suggest a reboot or powering off a host
> will
> > >> end
> > >>>> result in the VMs being migrated and this was on CS v 4.2.1 back in
> > >> 2013 so
> > >>>> does Host HA do something different
> > >>>>
> > >>>>
> > >>>> 2) Whenever one of my two nodes is taken down in testing the active
> > >>>> compute nodes HA status goes from Available to Ineligible. Should
> this
> > >>>> happen ie. is it going to Ineligible stopping the manager from
> > migrating
> > >>>> the VMs.
> > >>>>
> > >>>>
> > >>>> Apologies for all the questions but I just can't get this to work at
> > the
> > >>>> moment. If I do eventually get it working I will do a write up for
> > >> others
> > >>>> with same issue :)
> > >>>>
> > >>>>
> > >>>> ________________________________
> > >>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>> Sent: 14 March 2018 07:45
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: RE: KVM HostHA
> > >>>>
> > >>>> Hi Parth,
> > >>>>
> > >>>> To answer your questions, VM-HA does not restart VMs on an alternate
> > >>>> host if the original host goes down.  The management server (without
> > >>>> host-HA) cannot tell what happened to the host.  It cannot tell if
> > >>>> there was a failure in the agent, loss of connectivity to the
> > >>>> management NIC or if the host is truly down.  In the first two
> > >>>> scenarios, the guest VMs can still be running perfectly well, and to
> > >>>> restart them elsewhere would be very dangerous.  Therefore, the correct
> > >>>> thing to do is - nothing but alert the operator.  These scenarios are
> > >>>> what Host-HA was introduced for.
> > >>>>
> > >>>> Wrt STONITH, if no disk activity is detected on the host, host-HA will
> > >>>> try to restart the host (via IPMI). If, after a configurable number of
> > >>>> attempts, the host agent still does not check in, then host-HA will
> > >>>> shut down the host (via IPMI), trigger VM-HA and mark the host as
> > >>>> in-maintenance.
> > >>>>
> > >>>>
> > >>>>
> > >>>> paul.angus@shapeblue.com
> > >>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Parth Patel <pa...@gmail.com>
> > >>>> Sent: 14 March 2018 05:05
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: Re: KVM HostHA
> > >>>>
> > >>>> Hi Paul,
> > >>>>
> > >>>> Thanks for the clarification. I currently don't have an ipmi enabled
> > >>>> hardware (in test environment), but it will be beneficial if you can
> > >> help
> > >>>> me clear out some basic concepts of it:
> > >>>> - If HA-enabled VMs are autostarted on another host when current
> host
> > >> goes
> > >>>> down, what is the need or purpose of HA-host? (other than management
> > >> server
> > >>>> able to remotely control it's power interfaces)
> > >>>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH)
> > >> approach
> > >>>> ACS uses to fence the host, but I couldn't find what mechanism or
> > events
> > >>>> trigger this?
> > >>>>
> > >>>> Thanks and regards,
> > >>>> Parth Patel
> > >>>>
> > >>>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com>
> > >> wrote:
> > >>>>
> > >>>>> The management server doesn't ping the host through IPMI.   However
> > if
> > >>>>> IPMI is not available, you will not be able to use Host HA, as
> there
> > >>>>> is no way for CloudStack to 'fence' the host - that is shut it down
> > to
> > >>>>> be sure that a VM cannot start again on that host.
> > >>>>>
> > >>>>> I can explain why that is necessary if you wish.
> > >>>>>
> > >>>>>
> > >>>>> Kind regards,
> > >>>>>
> > >>>>> Paul Angus
> > >>>>>
> > >>>>> paul.angus@shapeblue.com
> > >>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: Parth Patel <pa...@gmail.com>
> > >>>>> Sent: 13 March 2018 16:57
> > >>>>> To: users@cloudstack.apache.org
> > >>>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
> > >>>>> Subject: Re: KVM HostHA
> > >>>>>
> > >>>>> Hi Jon and Victor,
> > >>>>>
> > >>>>> I think the management server pings your host using ipmi (I really
> > >>>>> don't hope this is the case).
> > >>>>> In my case, I did not have OOBM enabled at all (my hardware didn't
> > >>>>> support
> > >>>>> it)
> > >>>>> I think you could disable OOBM and/or HA-Host and give that a try
> :)
> > >>>>>
> > >>>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> > >>>>>
> > >>>>>> Hello Guys,
> > >>>>>>
> > >>>>>> I have tried the following two cases.
> > >>>>>>
> > >>>>>> 1, "echo c > /proc/sysrq-trigger"
> > >>>>>>
> > >>>>>> 2, Pulled the network cable of one of the host
> > >>>>>>
> > >>>>>> In both cases, the following happened.
> > >>>>>>
> > >>>>>> =====
> > >>>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > >>>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> > >>>>>> nodes of to disconnect
> > >>>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > >>>>>> disconnecting with event AgentDisconnected
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > >>>>>> Alert
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering
> link
> > >>>>>> for
> > >>>>>> 4 with state Alert
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > >>>>>> =====
> > >>>>>>
> > >>>>>> But nothing happened for the  vm's in that node. I have waited for
> > >>>>>> one hour and the VM's in that node has been migrated to the other
> > >>>>>> available hosts. I think the issue is that the management server
> > >>>>>> still thinks that the VM's in that host is running. Please check
> the
> > >>>>>> following logs
> > >>>>>>
> > >>>>>> =======
> > >>>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on
> host
> > >>>>>> 4
> > >>>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > >>>>>> running on host 4 ========
> > >>>>>>
> > >>>>>>
> > >>>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > >>>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting
> > >>>>>>> into the
> > >>>>>> server but it did not stop the server responding to an ipmitool
> > >>>>>> request on the manager eg -
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> > >>>>> status"
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> from the management server got an answer saying the chassis power
> > >>>>>>> was on
> > >>>>>> so CS never registered the compute node as down.
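
For anyone scripting the same manual check, here is a small Python sketch. It only builds the ipmitool argument list used above and reads the power state out of the text that comes back; it does not talk to a BMC itself. The function names are made up for illustration, and the parser assumes the usual "System Power : on" line in the "chassis status" output. Note that the BMC answers independently of the host OS, so "on" here says nothing about whether the kernel or the agent is still alive, which is what the test above ran into.

```python
def chassis_status_command(bmc_host, user, password):
    """Build the argument list for 'ipmitool ... chassis status'.

    The list can be handed to subprocess.run(..., capture_output=True, text=True)
    on a machine that has ipmitool installed.
    """
    return [
        "ipmitool", "-I", "lanplus",
        "-H", bmc_host,
        "-U", user,
        "-P", password,
        "chassis", "status",
    ]


def power_is_on(status_output):
    """Return True/False for the reported power state, or None if not found.

    Assumption: the output contains a line of the form 'System Power : on'.
    """
    for line in status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "System Power":
            return value.strip().lower() == "on"
    return None


if __name__ == "__main__":
    # Placeholders only; substitute the real BMC address and credentials.
    print(" ".join(chassis_status_command("BMC_ADDRESS", "USER", "PASSWORD")))
```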
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I am obviously doing something wrong but cannot work it out.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> The management server has one NIC - 172.16.7.4
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Each compute node has 3 NICs -
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>                         cnode1          cnode2
> > >>>>>>>
> > >>>>>>> management NIC          172.16.7.5      172.16.7.6
> > >>>>>>> vm NIC                  172.16.6.130    172.16.6.131
> > >>>>>>> storage NIC             172.16.250.4    172.16.250.5
> > >>>>>>> Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> If I pull the storage NIC presumably nothing will happen as the
> > >>>>>>> ipmitool
> > >>>>>> check is running across the management NIC so I need to pull both
> ?
> > >>>>>>>
> > >>>>>>> My understanding of host HA was the management server monitored
> > >>>>>>> the
> > >>>>>> compute nodes using ipmitool and if it did not get a response
> > >>>>>> because the host was down it would fence off that host and move
> the
> > >>>>>> VMs to an active compute node.
> > >>>>>>>
> > >>>>>>> This is obviously too simplistic so could someone explain how it
> > >>>>>>> is
> > >>>>>> meant to work and what it is protecting against ?
> > >>>>>>>
> > >>>>>>> ________________________________
> > >>>>>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>>>>> Sent: 13 March 2018 07:01
> > >>>>>>> To: users@cloudstack.apache.org
> > >>>>>>> Subject: RE: KVM HostHA
> > >>>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> One small note, unplugging the management NIC will only cause an HA
> > >>>>>>> event if the storage is running over that NIC also.
> > >>>>>>>
> > >>>>>>> If the storage is over a separate NIC, then the guest VMs will
> > >>>>>>> continue to run when the mgmt. NIC is unplugged. Host HA will detect
> > >>>>>>> the disk activity and conclude that there is nothing it can do, as
> > >>>>>>> the VMs are still running, other than mark the host as degraded.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Kind regards,
> > >>>>>>>
> > >>>>>>> Paul Angus
> > >>>>>>>
> > >>>>>>> paul.angus@shapeblue.com
> > >>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: Parth Patel <pa...@gmail.com>
> > >>>>>>> Sent: 12 March 2018 17:35
> > >>>>>>> To: users@cloudstack.apache.org
> > >>>>>>> Subject: Re: KVM HostHA
> > >>>>>>>
> > >>>>>>>> Hi Jon,
> > >>>>>>>>
> > >>>>>>>> As I said, in my case, making the host HA didn't work, but by just
> > >>>>>>>> having an HA VM running on the host and executing (WARNING)
> > >>>>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on the
> > >>>>>>>> host, the management server registered it as down and started the
> > >>>>>>>> VM on another host. I know I've suggested this before but I insist
> > >>>>>>>> you give this a try. Also, you don't need to completely power off
> > >>>>>>>> the machine manually; just unplugging the network cable works fine.
> > >>>>>>>> The cloudstack agent, after losing connection to the management
> > >>>>>>>> server, auto reboots because of the KVM heartbeat check shell
> > >>>>>>>> script mentioned by Rohit Yadav in reply to one of my earlier
> > >>>>>>>> queries in another thread.
> > >>>>>>>>
> > >>>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jms.123@hotmail.co.uk> wrote:
> > >>>>>>>> Hi Paul
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks for the response.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I think I am not understanding how it was meant to work then. My
> > >>>>>>>> understanding was that the manager used ipmitool to just keep
> > >>>>>>>> querying the compute nodes as to their status so I assumed it
> > >>>>>>>> didn't matter how you shut the node down, once it was down the
> > >>>>>>>> manager would get no response and mark it as down (which it does).
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I am in testing mode so I think I will just go and pull the power
> > >>>>>>>> and see what happens :)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Jon
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> ________________________________
> > >>>>>>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>>>>>> Sent: 12 March 2018 15:31
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: RE: KVM HostHA
> > >>>>>>>> Hi Jon,
> > >>>>>>>>
> > >>>>>>>> I think that what you guys are finding is that a controlled host
> > >>>>>>>> shutdown, which causes the agent to shut down cleanly, is not
> > >>>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
> > >>>>>>>> action if you shut down a host, only if the host (agent) stops
> > >>>>>>>> responding.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Kind regards,
> > >>>>>>>>
> > >>>>>>>> Paul Angus
> > >>>>>>>>
> > >>>>>>>> paul.angus@shapeblue.com
> > >>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
> > >>>>>>>> Sent: 12 March 2018 15:15
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: Re: KVM HostHA
> > >>>>>>>>
> > >>>>>>>> I have the same issue here and am not entirely sure what the
> > >>>>>>>> behaviour should be.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > >>>>>>>> working correctly.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> From the UI under HA -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> HA Enabled Yes
> > >>>>>>>> HA State Available
> > >>>>>>>> HA Provider kvmhaprovider
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> although interestingly from the "Details" tab it shows -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> HA enabled No
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> which I assume is a cosmetic issue ?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On each compute node I have one HA enabled VM and one non HA
> > >>>>>>>> enabled VM.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I power off a compute node and the UI updates the host status and
> > >>>>>>>> the VMs on that node stop responding but they never fail over to
> > >>>>>>>> the other node.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Couple of things I noticed -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 1) as soon as I power off the compute node the HA state on the
> > >>>>>>>> other node shows "Ineligible"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2) In the UI the instances all still show as green even though two
> > >>>>>>>> of them are not available
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Any help much appreciated
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> ________________________________
> > >>>>>>>> From: victor <vi...@ihnetworks.com>
> > >>>>>>>> Sent: 07 March 2018 17:01
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: KVM HostHA
> > >>>>>>>>
> > >>>>>>>> Hello Guys,
> > >>>>>>>>
> > >>>>>>>> I have installed cloudstack 4.11. I have enabled HA for each
> > >>>>>>>> host I have added. I have also added ipmi successfully (using
> > >>>>>>>> ipmi driver). The hosts are showing like the following.
> > >>>>>>>>
> > >>>>>>>> =======
> > >>>>>>>>
> > >>>>>>>> HA Enabled Yes
> > >>>>>>>> HA State Available
> > >>>>>>>> HA Provider kvmhaprovider
> > >>>>>>>>
> > >>>>>>>> ======
> > >>>>>>>>
> > >>>>>>>> Also the host is showing the following correctly
> > >>>>>>>>
> > >>>>>>>> Resource state --> Enabled
> > >>>>>>>> State --> UP
> > >>>>>>>> Power state --> On
> > >>>>>>>>
> > >>>>>>>> So I shut down one of the hosts to see how KVM host HA works. I
> > >>>>>>>> waited for half an hour, but nothing happened. What will happen
> > >>>>>>>> to the VMs on that host if the host fails to come back up?
> > >>>>>>>> There isn't much in the logs.
> > >>>>>>>>
> > >>>>>>>> Regards
> > >>>>>>>> Victor
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Andrija Panic
> > >>
> > >>
> > >
> > >
> > > --
> > >
> > > Andrija Panic
> >
> >
>

Re: KVM HostHA

Posted by Parth Patel <pa...@gmail.com>.
Hi Jon,

I have to admit that I have a beginner/mediocre understanding of cloudstack
overall (especially the host HA feature). But what works for me should work
for everyone. So, to answer your questions:

> 1) how many compute nodes do you have
>

I tested with three agents because, with only two nodes, the management
server deemed the one node that was running the system VMs and router unfit
for migration and stopped the VM. I currently use one node for the system
VMs and router, and two agents (compute nodes, you could say), one running
an HA-enabled VM and the other running no VMs, as I only have 4 GB of RAM in
each of those :| I use a fourth machine for the management server and MySQL
database, and a fifth machine purely for NFS. That said, you can easily have
management, MySQL and NFS set up on the same machine (depending on your
machine's configuration/capacity).

>
>
> 2) are you running basic or advanced networking
>

I am using basic (flat) networking where my management IP addresses range
from 172.16.4.131 to 172.16.4.137 and guest IP addresses range from
172.16.4.138 to 172.16.4.149. Both are on a /24 network.
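
If you want to sanity-check ranges like these before putting them into the
zone setup, a small script along these lines should do it. It is only a
sketch using Python's standard ipaddress module; the helper name range_hosts
is made up for illustration, and 172.16.4.0/24 is simply the /24 that
contains the addresses above.

```python
import ipaddress


def range_hosts(first, last):
    """Return the set of addresses from first to last, inclusive."""
    start = int(ipaddress.ip_address(first))
    end = int(ipaddress.ip_address(last))
    return {ipaddress.ip_address(n) for n in range(start, end + 1)}


# The /24 that contains both ranges quoted in this mail.
network = ipaddress.ip_network("172.16.4.0/24")
management = range_hosts("172.16.4.131", "172.16.4.137")
guest = range_hosts("172.16.4.138", "172.16.4.149")

# Both ranges must sit inside the /24 and must not share any address.
assert all(ip in network for ip in management | guest)
assert management.isdisjoint(guest)

print(len(management), "management addresses,", len(guest), "guest addresses")
```

With the ranges above this reports 7 management and 12 guest addresses.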

>
> 3) how have you setup your NICs ie. on each compute node I have 3 separate
> NICs, one for management, one for the VMs and one for storage (NFS).
>

I only have 1 NIC per machine (the same one is used for all 3 types of
traffic). I have seen the management server use peer routing through other
agents to perform some operations in my XenCluster, but I highly doubt that
is the reason your management server does not mark a host as "Down" (as I
said, I don't know the internal workings of Cloudstack; this is just a guess
from what I've seen in management server logs). I suggest you disconnect all
three NICs of a host to simulate my scenario.

>
>
> So far I have not managed to get any failover of VMs no matter what I try
>

I also recommend you update your qemu-kvm, NFS and other packages (there
has just been a recent update for CentOS 7). I know this sounds
superstitious, but differing package versions have sometimes been known to
be the root cause of an issue.

Side note: my ACS 4.11 agent reboots itself after it has retried
communicating with the management server 4 times, at almost exactly the
same time as the management server decides in its logs that the host and
HA-enabled VM have stopped executing and restarts the HA-enabled VM on
another host.
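
For comparison, here is how I read the Host-HA sequence that Boris and Paul
describe in the quoted mails below: health check, then activity check, then
IPMI recovery attempts, then fencing. This is only a rough sketch of that
order as described in this thread, not CloudStack's actual code; the names
Action and next_action and all the parameters are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "host is healthy, nothing to do"
    WAIT = "VM disks still show writes on shared storage, leave the host alone"
    RECOVER = "try to restart the host via IPMI"
    FENCE = "power off the host via IPMI, then restart HA VMs elsewhere"


def next_action(health_check_ok, disk_activity_seen,
                recover_attempts, max_recover_attempts):
    """Pick the next step for one host, following the order given in the thread."""
    if health_check_ok:
        return Action.NONE
    # Health check failed: look at VM disk activity before touching the host.
    if disk_activity_seen:
        return Action.WAIT
    # No activity: attempt recovery a configurable number of times.
    if recover_attempts < max_recover_attempts:
        return Action.RECOVER
    # Recovery did not bring the agent back: fence the host.
    return Action.FENCE


# A host that fails its health check, shows no disk activity and has used
# up its recovery attempts ends up fenced.
print(next_action(False, False, 3, 3).name)
```

In my setup none of this is configured, so whatever restarts my HA-enabled
VMs is apparently a different code path from the one sketched here.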


Hope this helps.

Regards,
Parth Patel.

>
>
> ________________________________
> From: Parth Patel <pa...@gmail.com>
> Sent: 14 March 2018 14:36
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> Hi Paul and Andrija,
>
> I don't know how the Host-HA feature works internally, but my ACS 4.11 does
> what Paul explained even without host HA or ipmi access. As I stated
> earlier multiple times, without host HA and ipmi, my HA-enabled VMs
> executing on a normal host get restarted on another suitable host in the
> cluster after approximately 3 minutes of event ping timeout. After that,
> the cloudstack agent, having no connection to the management server because
> of the unplugged NIC (all my machines currently have only one NIC / the
> whole zone is in a flat network), reboots itself (the reason was explained
> by Rohit in another thread). The management server marks the host down and
> only the HA-enabled VMs executing on it get restarted on another host
> (without any mention of host HA or ipmi or fencing in management server
> logs), while normal VMs executing on it are stopped.
>
> I don't know if this was a desired outcome, but I think my current ACS 4.11
> installation has features (at least performs some ;) provided by Host HA
> without configuring it or ipmi.
>
> Regards,
> Parth Patel
>
> On Wed 14 Mar, 2018, 18:41 Boris Stoyanov, <bo...@shapeblue.com>
> wrote:
>
> > yes, KVM + NFS shared storage.
> >
> > Boris.
> >
> >
> > boris.stoyanov@shapeblue.com
> > www.shapeblue.com<http://www.shapeblue.com>
> > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > @shapeblue
> >
> >
> >
> > > > On 14 Mar 2018, at 14:51, Andrija Panic <an...@gmail.com> wrote:
> > >
> > > Hi Boris,
> > >
> > > > ok thanks for the explanation - that makes sense, and covers my
> > > > "exception case" that I have.
> > >
> > > This is atm only available for NFS as I could read (KVM on NFS) ?
> > >
> > > Cheers
> > >
> > > > On 14 March 2018 at 13:02, Boris Stoyanov <boris.stoyanov@shapeblue.com>
> > > > wrote:
> > >
> > >> Hi Andrija,
> > >>
> > > >> There are two types of checks Host-HA does to determine if a host is
> > > >> healthy.
> > > >>
> > > >> 1. Health checks - pings the host as soon as there are connection
> > > >> issues with the agent
> > > >>
> > > >> If that fails,
> > > >>
> > > >> 2. Activity checks - checks if there are any write operations on the
> > > >> disks of the VMs that are running on the host. This is to determine if
> > > >> the VMs are actually alive and executing processes. Only if no disk
> > > >> operations are executed on the shared storage does it try to recover
> > > >> the host with an IPMI call; if that eventually fails, it migrates the
> > > >> VMs to a healthy host and fences the faulty one.
> > >>
> > >> Hope that explains your case.
> > >>
> > >> Boris.
> > >>
> > >>
> > >> boris.stoyanov@shapeblue.com
> > >> www.shapeblue.com<http://www.shapeblue.com>
> > >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > >> @shapeblue
> > >>
> > >>
> > >>
> > > >>> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com> wrote:
> > >>>
> > >>> Hi Paul,
> > >>>
> > > >>> sorry to bump in the middle of the thread, but just curious about the
> > > >>> idea behind host-HA and why it behaves the way you explained above:
> > > >>>
> > > >>>
> > > >>> Would it make more sense (or not?), when MGMT detects that the agent
> > > >>> or host is unreachable (or after an unsuccessful agent restart,
> > > >>> etc., to be defined), to actually use IPMI to STONITH the node, thus
> > > >>> making sure no VMs are running, and then to really start all
> > > >>> HA-enabled VMs on other hosts?
> > > >>>
> > > >>> I'm just trying to draw a parallel to corosync/pacemaker as a
> > > >>> clustering suite/service in Linux (RHEL and others), where, when a
> > > >>> majority of nodes detect that one node is down, a common thing
> > > >>> (especially for shared storage) is to STONITH that node, make sure
> > > >>> it's down, then move the "resource" (in our case VMs) to other
> > > >>> cluster nodes?
> > > >>>
> > > >>> I see it's actually a much broader setup per
> > > >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but
> > > >>> again - the whole idea (in my head at least...) is that when a host
> > > >>> goes down, we make sure it's down (avoid VM corruption, by doing
> > > >>> STONITH on that node) and then start HA VMs on other hosts.
> > > >>>
> > > >>> I understand there might be exceptions such as the one I have right
> > > >>> now (4.8) - libvirt gets stuck (librbd exception or similar) so the
> > > >>> agent gets disconnected, but VMs are still running fine... (except
> > > >>> the DB gets messed up, all NICs lose isolation_uri, VRs lose MAC
> > > >>> addresses and other IP addresses etc...)
> > >>>
> > >>>
> > >>> Thanks
> > >>> Andrija
> > >>>
> > >>>
> > >>>
> > >>>
> > > >>> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:
> > >>>
> > >>>> That would make sense.
> > >>>>
> > >>>>
> > > >>>> I have another server being used for something else at the moment
> > > >>>> so I will add that in and update this thread when I have tested
> > >>>>
> > >>>> Jon
> > >>>>
> > >>>>
> > >>>> ________________________________
> > >>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>> Sent: 14 March 2018 09:16
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: RE: KVM HostHA
> > >>>>
> > > >>>> I'd need to do some testing, but I suspect that your problem is that
> > > >>>> you only have two hosts.  At the point that one host is deemed out of
> > > >>>> service, you only have one host left.  With only one host, CloudStack
> > > >>>> will show the cluster as ineligible.
> > > >>>>
> > > >>>> It is extremely common for any system working as a cluster to require
> > > >>>> a minimum starting point of 3 nodes to be able to function.
> > >>>>
> > >>>>
> > >>>> Kind regards,
> > >>>>
> > >>>> Paul Angus
> > >>>>
> > >>>> paul.angus@shapeblue.com
> > >>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > >>>> @shapeblue
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Jon Marshall <jm...@hotmail.co.uk>
> > >>>> Sent: 14 March 2018 08:36
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: Re: KVM HostHA
> > >>>>
> > >>>> Hi Paul
> > >>>>
> > >>>>
> > > >>>> My testing does indeed end up with the failed host in maintenance
> > > >>>> mode but the VMs are never migrated. As I posted earlier the
> > > >>>> management server seems to be saying there is no other host that the
> > > >>>> VM can be migrated to.
> > > >>>>
> > > >>>>
> > > >>>> Couple of questions if you have the time to respond -
> > > >>>>
> > > >>>>
> > > >>>> 1) this article seems to suggest a reboot or powering off a host will
> > > >>>> result in the VMs being migrated and this was on CS v 4.2.1 back in
> > > >>>> 2013 so does Host HA do something different
> > > >>>>
> > > >>>>
> > > >>>> 2) Whenever one of my two nodes is taken down in testing the active
> > > >>>> compute node's HA status goes from Available to Ineligible. Should
> > > >>>> this happen, i.e. is it going to Ineligible that stops the manager
> > > >>>> from migrating the VMs?
> > > >>>>
> > > >>>>
> > > >>>> Apologies for all the questions but I just can't get this to work at
> > > >>>> the moment. If I do eventually get it working I will do a write up
> > > >>>> for others with same issue :)
> > >>>>
> > >>>>
> > >>>> ________________________________
> > >>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>> Sent: 14 March 2018 07:45
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: RE: KVM HostHA
> > >>>>
> > >>>> Hi Parth,
> > >>>>
> > > >>>> To answer your questions, VM-HA does not restart VMs on an alternate
> > > >>>> host if the original host goes down.  The management server (without
> > > >>>> host-HA) cannot tell what happened to the host.  It cannot tell if
> > > >>>> there was a failure in the agent, loss of connectivity to the
> > > >>>> management NIC or if the host is truly down.  In the first two
> > > >>>> scenarios, the guest VMs can still be running perfectly well, and to
> > > >>>> restart them elsewhere would be very dangerous.  Therefore, the
> > > >>>> correct thing to do is - nothing but alert the operator.  These
> > > >>>> scenarios are what Host-HA was introduced for.
> > > >>>>
> > > >>>> Wrt STONITH, if no disk activity is detected on the host, host-HA
> > > >>>> will try to restart (via IPMI) the host. If, after a configurable
> > > >>>> number of attempts, the host agent still does not check in, then
> > > >>>> host-HA will shut down the host (via IPMI), trigger VM-HA and mark
> > > >>>> the host as in-maintenance.
> > >>>>
> > >>>>
> > >>>>
> > >>>> paul.angus@shapeblue.com
> > >>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Parth Patel <pa...@gmail.com>
> > >>>> Sent: 14 March 2018 05:05
> > >>>> To: users@cloudstack.apache.org
> > >>>> Subject: Re: KVM HostHA
> > >>>>
> > >>>> Hi Paul,
> > >>>>
> > > >>>> Thanks for the clarification. I currently don't have ipmi-enabled
> > > >>>> hardware (in my test environment), but it would be beneficial if you
> > > >>>> could help me clear up some basic concepts:
> > > >>>> - If HA-enabled VMs are autostarted on another host when the current
> > > >>>> host goes down, what is the need or purpose of HA-host? (other than
> > > >>>> the management server being able to remotely control its power
> > > >>>> interfaces)
> > > >>>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH)
> > > >>>> approach ACS uses to fence the host, but I couldn't find what
> > > >>>> mechanism or events trigger this?
> > >>>>
> > >>>> Thanks and regards,
> > >>>> Parth Patel
> > >>>>
> > > >>>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:
> > > >>>>
> > > >>>>> The management server doesn't ping the host through IPMI.   However
> > > >>>>> if IPMI is not available, you will not be able to use Host HA, as
> > > >>>>> there is no way for CloudStack to 'fence' the host - that is shut it
> > > >>>>> down to be sure that a VM cannot start again on that host.
> > >>>>>
> > >>>>> I can explain why that is necessary if you wish.
> > >>>>>
> > >>>>>
> > >>>>> Kind regards,
> > >>>>>
> > >>>>> Paul Angus
> > >>>>>
> > >>>>> paul.angus@shapeblue.com
> > >>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: Parth Patel <pa...@gmail.com>
> > >>>>> Sent: 13 March 2018 16:57
> > >>>>> To: users@cloudstack.apache.org
> > >>>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
> > >>>>> Subject: Re: KVM HostHA
> > >>>>>
> > >>>>> Hi Jon and Victor,
> > >>>>>
> > > >>>>> I think the management server pings your host using ipmi (I really
> > > >>>>> hope this is not the case).
> > > >>>>> In my case, I did not have OOBM enabled at all (my hardware didn't
> > > >>>>> support it).
> > > >>>>> I think you could disable OOBM and/or HA-Host and give that a try :)
> > > >>>>>
> > >>>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> > >>>>>
> > >>>>>> Hello Guys,
> > >>>>>>
> > >>>>>> I have tried the following two cases.
> > >>>>>>
> > >>>>>> 1, "echo c > /proc/sysrq-trigger"
> > >>>>>>
> > >>>>>> 2, Pulled the network cable of one of the host
> > >>>>>>
> > >>>>>> In both cases, the following happened.
> > >>>>>>
> > >>>>>> =====
> > >>>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > >>>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> > >>>>>> nodes of to disconnect
> > >>>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > >>>>>> disconnecting with event AgentDisconnected
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > >>>>>> Alert
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > >>>>>> for 4 with state Alert
> > >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > >>>>>> =====
> > >>>>>>
> > >>>>>> But nothing happened for the  vm's in that node. I have waited for
> > >>>>>> one hour and the VM's in that node has been migrated to the other
> > >>>>>> available hosts. I think the issue is that the management server
> > >>>>>> still thinks that the VM's in that host is running. Please check
> the
> > >>>>>> following logs
> > >>>>>>
> > >>>>>> =======
> > >>>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> > >>>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > >>>>>> running on host 4
> > >>>>>> ========
> > >>>>>>
> > >>>>>>
> > >>>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > >>>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting
> > >>>>>>> into the
> > >>>>>> server but it did not stop the server responding to an ipmitool
> > >>>>>> request on the manager eg -
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> > >>>>> status"
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> from the management server got an answer saying the chassis power
> > >>>>>>> was on
> > >>>>>> so CS never registered the compute node as down.
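
The BMC (the iDRAC in this setup) runs independently of the host OS, so a crashed kernel still leaves "chassis status" reporting the power as on. A minimal Python sketch of reading that power state, assuming ipmitool's usual "System Power : on" output line; the function name and the sample text are illustrative only, not taken from CloudStack:

```python
def chassis_power_is_on(chassis_status_output: str) -> bool:
    """Parse the text printed by `ipmitool ... chassis status`.

    Returns True when the 'System Power' line reports 'on'.
    Raises ValueError when no such line is present.
    """
    for line in chassis_status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "system power":
            return value.strip().lower() == "on"
    raise ValueError("no 'System Power' line found")


# Illustrative sample, shaped like the first lines of the command's output.
sample = "System Power         : on\nPower Overload       : false\n"
print(chassis_power_is_on(sample))
```

A True result only says the chassis has power. Whether the agent and the VMs are alive has to be established separately.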
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I am obviously doing something wrong but cannot work it out.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> The management server has one NIC - 172.16.7.4
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Each compute node has 3 NICs -
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>                        cnode1          cnode2
> > >>>>>>>
> > >>>>>>> management NIC         172.16.7.5      172.16.7.6
> > >>>>>>> vm NIC                 172.16.6.130    172.16.6.131
> > >>>>>>> storage NIC            172.16.250.4    172.16.250.5
> > >>>>>>> Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> If I pull the storage NIC presumably nothing will happen as the
> > >>>>>>> ipmitool
> > >>>>>> check is running across the management NIC so I need to pull both
> ?
> > >>>>>>>
> > >>>>>>> My understanding of host HA was the management server monitored
> > >>>>>>> the
> > >>>>>> compute nodes using ipmitool and if it did not get a response
> > >>>>>> because the host was down it would fence off that host and move
> the
> > >>>>>> VMs to an active compute node.
> > >>>>>>>
> > >>>>>>> This is obviously too simplistic so could someone explain how it
> > >>>>>>> is
> > >>>>>> meant to work and what it is protecting against ?
> > >>>>>>>
> > >>>>>>> ________________________________
> > >>>>>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>>>>> Sent: 13 March 2018 07:01
> > >>>>>>> To: users@cloudstack.apache.org
> > >>>>>>> Subject: RE: KVM HostHA
> > >>>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> One small note, unplugging the management NIC will only cause an
> > >>>>>>> HA
> > >>>>>> event if the storage is running over that NIC also.
> > >>>>>>>
> > >>>>>>> If the storage is over a separate NIC, then the guest VMs will
> > >>>>>>> continue to run when the mgmt. NIC is unplugged. Host HA will detect
> > >>>>>>> the disk activity and conclude that there is nothing it can do, as
> > >>>>>>> the VMs are still running, other than mark the host as degraded.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Kind regards,
> > >>>>>>>
> > >>>>>>> Paul Angus
> > >>>>>>>
> > >>>>>>> paul.angus@shapeblue.com
> > >>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: Parth Patel <pa...@gmail.com>
> > >>>>>>> Sent: 12 March 2018 17:35
> > >>>>>>> To: users@cloudstack.apache.org
> > >>>>>>> Subject: Re: KVM HostHA
> > >>>>>>>
> > >>>>>>>> Hi Jon,
> > >>>>>>>>
> > >>>>>>>> As I said, in my case, making the host HA didn't work but by just
> > >>>>>>>> having a HA VM running on host and executing - (WARNING) "echo c >
> > >>>>>>>> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> > >>>>>>>> management server registered it as down and started the VM on
> > >>>>>>>> another host. I know I've suggested this before but I insist you
> > >>>>>>>> give this a try. Also, you don't need to completely power off the
> > >>>>>>>> machine manually but just plugging out the network cable works
> > >>>>>>>> fine. The cloudstack agent after losing connection to management
> > >>>>>>>> server auto reboots because of KVM heartbeat check shell script
> > >>>>>>>> mentioned by Rohit Yadav to one of my earlier queries in other
> > >>>>>>>> thread.
> > >>>>>>>>
> > >>>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jms.123@hotmail.co.uk> wrote:
> > >>>>>>>> Hi Paul
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks for the response.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I think I am not understanding how it was meant to work then. My
> > >>>>>>>> understanding was that the manager used ipmitool to just keep
> > >>>>>>>> querying the compute nodes as to their status so I assumed it
> > >>>>>>>> didn't matter how you shut the node down, once it was down the
> > >>>>>>>> manager would get no response and mark it as down (which it
> does).
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I am in testing mode so I think I will just go and pull the
> power
> > >>>>>>>> and see what happens :)
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Jon
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> ________________________________
> > >>>>>>>> From: Paul Angus <pa...@shapeblue.com>
> > >>>>>>>> Sent: 12 March 2018 15:31
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: RE: KVM HostHA
> > >>>>>>>> Hi Jon,
> > >>>>>>>>
> > >>>>>>>> I think that what you guys are finding is that a controlled host
> > >>>>>>>> shutdown, which causes the agent to shut down cleanly, is not
> > >>>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
> > >>>>>>>> action if you shut down a host, only if the host (agent) stops
> > >>>>>>>> responding.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Kind regards,
> > >>>>>>>>
> > >>>>>>>> Paul Angus
> > >>>>>>>>
> > >>>>>>>> paul.angus@shapeblue.com
> > >>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> > >>>>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
> > >>>>>>>> Sent: 12 March 2018 15:15
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: Re: KVM HostHA
> > >>>>>>>>
> > >>>>>>>> I have the same issue here and am not entirely sure what the
> > >>>> behaviour
> > >>>>>>>> should be.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I have one manager node and 2 compute nodes running 4.11 with
> ipmi
> > >>>>>> working
> > >>>>>>>> correctly.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> From the UI under HA -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> HA Enabled Yes
> > >>>>>>>> HA State Available
> > >>>>>>>> HA Provider kvmhaprovider
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> although interestingly from the "Details" tab it shows -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> HA enabled No
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> which I assume is a cosmetic issue ?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On each compute node I have one HA enabled VM and one non HA
> > enabled
> > >>>>> VM.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> I power off a compute node and the UI updates the host status
> and
> > >>>> the
> > >>>>>> VMs
> > >>>>>>>> on that node stop responding but they never fail over to the
> other
> > >>>>> node.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Couple of things I noticed -
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 1) as soon as i power off the compute node the HA state on the
> > other
> > >>>>>> node
> > >>>>>>>> shows "Ineligible"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> 2) In the UI the instances all still show as green even though
> two
> > >>>> of
> > >>>>>> them
> > >>>>>>>> are not available
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Any help much appreciated
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> ________________________________
> > >>>>>>>> From: victor <vi...@ihnetworks.com>
> > >>>>>>>> Sent: 07 March 2018 17:01
> > >>>>>>>> To: users@cloudstack.apache.org
> > >>>>>>>> Subject: KVM HostHA
> > >>>>>>>>
> > >>>>>>>> Hello Guys,
> > >>>>>>>>
> > >>>>>>>> I have installed cloudstack 4.11. I have enabled HA for each
> > hosts I
> > >>>>>> have
> > >>>>>>>> added. I have also added ipmi successfully (using ipmi driver).
> > >>>>>>>> The hosts are showing like the following.
> > >>>>>>>>
> > >>>>>>>> =======
> > >>>>>>>>
> > >>>>>>>> HA Enabled Yes
> > >>>>>>>> HA State Available
> > >>>>>>>> HA Provider kvmhaprovider
> > >>>>>>>>
> > >>>>>>>> ======
> > >>>>>>>>
> > >>>>>>>> Also the host is showing the following correctly
> > >>>>>>>>
> > >>>>>>>> Resource state --> Enabled
> > >>>>>>>> State --> UP
> > >>>>>>>> Power state --> On
> > >>>>>>>>
> > >>>>>>>> So I have shutdown one of the hosts to see how the KVM hosts Ha
> is
> > >>>>>>>> working. I have waited for half an hour. But nothing has
> happened.
> > >>>>> What
> > >>>>>>>> will happen to the VM's in that host, if the host failed to back
> > up.
> > >>>>>>>> There isn't much from logs.
> > >>>>>>>>
> > >>>>>>>> Regards
> > >>>>>>>> Victor
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Andrija Panic
> > >>
> > >>
> > >
> > >
> > > --
> > >
> > > Andrija Panic
> >
> >
>

Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Parth


Can I just ask a few questions -


1) how many compute nodes do you have?


2) are you running basic or advanced networking?


3) how have you set up your NICs? I.e. on each compute node I have 3 separate NICs: one for management, one for the VMs and one for storage (NFS).


So far I have not managed to get any failover of VMs no matter what I try


Jon


________________________________
From: Parth Patel <pa...@gmail.com>
Sent: 14 March 2018 14:36
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul and Andrija,

I don't know how the Host-HA feature works internally, but what Paul
explained, my ACS 4.11 does without host HA or ipmi access at all. As I
stated earlier multiple times, without host HA and ipmi, my HA-enabled VMs
running on a normal host get restarted on another suitable host in the
cluster after the ping timeout of approximately 3 minutes. After that, the
cloudstack agent, having lost its connection to the management server because
of the unplugged NIC (all my machines currently have only one NIC / the whole
zone is in a flat network), reboots itself (the reason was explained by Rohit
in another thread). The management server marks the host down and only the
HA-enabled VMs running on it get restarted on another host (without any
mention of host HA, ipmi or fencing in the management server logs), while
normal VMs running on it are stopped.
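
The outcome described above, once the host is marked down, can be sketched in a few lines of Python. This only restates the observed behaviour and is not CloudStack code; the Vm class and the function name are made up for illustration:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Vm:
    name: str
    ha_enabled: bool  # hypothetical flag standing in for an HA-enabled offering


def plan_after_host_down(vms: List[Vm]) -> Tuple[List[str], List[str]]:
    """Split the VMs of a host marked Down into (restart elsewhere, stop)."""
    restart = [vm.name for vm in vms if vm.ha_enabled]
    stop = [vm.name for vm in vms if not vm.ha_enabled]
    return restart, stop


restart, stop = plan_after_host_down([Vm("web", True), Vm("scratch", False)])
print("restart on another host:", restart)
print("mark as stopped:", stop)
```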

I don't know if this is the intended outcome, but I think my current ACS 4.11
installation already does at least some of what Host HA provides ;) without
configuring it or ipmi.

Regards,
Parth Patel

On Wed 14 Mar, 2018, 18:41 Boris Stoyanov, <bo...@shapeblue.com>
wrote:

> yes, KVM + NFS shared storage.
>
> Boris.
>
>
> boris.stoyanov@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
> > On 14 Mar 2018, at 14:51, Andrija Panic <an...@gmail.com> wrote:
> >
> > Hi Boris,
> >
> > ok thanks for the explanation - that makes sense, and covers my
> "exception
> > case" that I have.
> >
> > This is atm only available for NFS as I could read (KVM on NFS) ?
> >
> > Cheers
> >
> > On 14 March 2018 at 13:02, Boris Stoyanov <bo...@shapeblue.com>
> > wrote:
> >
> >> Hi Andrija,
> >>
> >> There are two types of checks Host-HA does to determine if a host is
> >> healthy.
> >>
> >> 1. Health checks - pings the host as soon as there are connection issues
> >> with the agent
> >>
> >> If that fails,
> >>
> >> 2. Activity checks - checks if there are any write operations on the
> >> disks of the VMs that are running on the host. This is to determine if the
> >> VMs are actually alive and executing processes. Only if no disk operations
> >> are executed on the shared storage does it try to Recover the host with
> >> an IPMI call; if that eventually fails, it migrates the VMs to a healthy
> >> host and Fences the faulty one.
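
The order of the two checks can be written down as a small decision function. This is a rough Python sketch of the flow as described in this thread, not the actual Host-HA implementation; the enum values and parameter names are invented, and the "degraded" outcome follows Paul's earlier note about hosts whose VMs are still writing to disk:

```python
from enum import Enum


class HostHaAction(Enum):
    NONE = "none"                # health check passed, nothing to do
    DEGRADED = "degraded"        # agent unreachable, but VMs still write to disk
    RECOVER = "recover"          # no agent, no disk activity: try an IPMI restart
    FENCE_AND_MIGRATE = "fence"  # recovery failed: fence host, start VMs elsewhere


def decide_host_ha_action(health_check_ok: bool,
                          disk_activity_seen: bool,
                          recovery_failed: bool) -> HostHaAction:
    """Pick the next step for one host, in the order the checks are run."""
    if health_check_ok:
        return HostHaAction.NONE
    if disk_activity_seen:
        return HostHaAction.DEGRADED
    if not recovery_failed:
        return HostHaAction.RECOVER
    return HostHaAction.FENCE_AND_MIGRATE


# Management NIC unplugged, storage on a separate NIC: VMs keep writing.
print(decide_host_ha_action(False, True, False))
```

Read this way, pulling only the management cable on a host with a separate storage NIC never gets past the activity check, so no VM is moved.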
> >>
> >> Hope that explains your case.
> >>
> >> Boris.
> >>
> >>
> >> boris.stoyanov@shapeblue.com
> >> www.shapeblue.com<http://www.shapeblue.com>
> >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >> @shapeblue
> >>
> >>
> >>
> >>> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com>
> wrote:
> >>>
> >>> Hi Paul,
> >>>
> >>> sorry to bump in the middle of the thread, but I'm just curious about the
> >>> idea behind host-HA and why it behaves the way you explained above:
> >>>
> >>>
> >>> Would it make more sense (or not?) that when MGMT detects that the agent
> >>> or host is unreachable (or after an unsuccessful e.g. agent restart,
> >>> etc..., to be defined), it actually uses IPMI to STONITH the node, thus
> >>> making sure no VMs are running, and then really starts all HA-enabled VMs
> >>> on other hosts ?
> >>>
> >>> I'm just trying to draw a parallel to corosync/pacemaker as a clustering
> >>> suite/services in Linux (RHEL and others), where when a majority of nodes
> >>> detect that one node is down, a common thing (especially for shared
> >>> storage) is to STONITH that node, make sure it's down, then move the
> >>> "resource" (in our case VMs) to other cluster nodes ?
> >>>
> >>> I see it's actually a much broader setup per
> >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but again -
> >>> the whole idea (in my head at least...) is that when a host goes down, we
> >>> make sure it's down (avoid VM corruption, by doing STONITH to that node)
> >>> and then start HA VMs on other hosts.
> >>>
> >>> I understand there might be exceptions as I have right now (4.8) - libvirt
> >>> gets stuck (librbd exception or similar) so the agent gets disconnected,
> >>> but VMs are still running fine... (except the DB gets messed up, all NICs
> >>> lose isolation_uri, VRs lose MAC addresses and other IP addresses etc...)
> >>>
> >>>
> >>> Thanks
> >>> Andrija
> >>>
> >>>
> >>>
> >>>
> >>> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:
> >>>
> >>>> That would make sense.
> >>>>
> >>>>
> >>>> I have another server being used for something else at the moment so I
> >>>> will add that in and update this thread when I have tested
> >>>>
> >>>>
> >>>> Jon
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Paul Angus <pa...@shapeblue.com>
> >>>> Sent: 14 March 2018 09:16
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: RE: KVM HostHA
> >>>>
> >>>> I'd need to do some testing, but I suspect that your problem is that
> you
> >>>> only have two hosts.  At the point that one host is deemed out of
> >> service,
> >>>> you only have one host left.  With only one host, CloudStack will show
> >> the
> >>>> cluster as ineligible.
> >>>>
> >>>> It is extremely common for any system working as a cluster to require
> a
> >>>> minimum starting point of 3 nodes to be able to function.
> >>>>
> >>>>
> >>>> Kind regards,
> >>>>
> >>>> Paul Angus
> >>>>
> >>>> paul.angus@shapeblue.com
> >>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >>>> @shapeblue
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Jon Marshall <jm...@hotmail.co.uk>
> >>>> Sent: 14 March 2018 08:36
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: Re: KVM HostHA
> >>>>
> >>>> Hi Paul
> >>>>
> >>>>
> >>>> My testing does indeed end up with the failed host in maintenance mode
> >> but
> >>>> the VMs are never migrated. As I posted earlier the management server
> >> seems
> >>>> to be saying there is no other host that the VM can be migrated to.
> >>>>
> >>>>
> >>>> Couple of questions if you have the time to respond -
> >>>>
> >>>>
> >>>> 1) this article seems to suggest a reboot or powering off a host will
> >> end
> >>>> result in the VMs being migrated and this was on CS v 4.2.1 back in
> >> 2013 so
> >>>> does Host HA do something different
> >>>>
> >>>>
> >>>> 2) Whenever one of my two nodes is taken down in testing the active
> >>>> compute nodes HA status goes from Available to Ineligible. Should this
> >>>> happen ie. is it going to Ineligible stopping the manager from
> migrating
> >>>> the VMs.
> >>>>
> >>>>
> >>>> Apologies for all the questions but I just can't get this to work at
> the
> >>>> moment. If I do eventually get it working I will do a write up for
> >> others
> >>>> with same issue :)
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Paul Angus <pa...@shapeblue.com>
> >>>> Sent: 14 March 2018 07:45
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: RE: KVM HostHA
> >>>>
> >>>> Hi Parth,
> >>>>
> >>>> To answer your questions, VM-HA does not restart VMs on an alternate
> >>>> host if the original host goes down.  The management server (without
> >>>> host-HA) cannot tell what happened to the host.  It cannot tell if
> >>>> there was a failure in the agent, loss of connectivity to the
> >>>> management NIC or if the host is truly down.  In the first two
> >>>> scenarios, the guest VMs can still be running perfectly well, and to
> >>>> restart them elsewhere would be very dangerous.  Therefore, the
> >>>> correct thing to do is nothing but alert the operator.  These
> >>>> scenarios are what Host-HA was introduced for.
> >>>>
> >>>> With regard to STONITH, if no disk activity is detected on the host,
> >>>> host-HA will try to restart (via IPMI) the host. If, after a
> >>>> configurable number of attempts, the host agent still does not check
> >>>> in, then host-HA will shut down the host (via IPMI), trigger VM-HA and
> >>>> mark the host as in-maintenance.
> >>>>
> >>>>
> >>>>
> >>>> paul.angus@shapeblue.com
> >>>>
> >>>>
> >>>>
> >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Parth Patel <pa...@gmail.com>
> >>>> Sent: 14 March 2018 05:05
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: Re: KVM HostHA
> >>>>
> >>>> Hi Paul,
> >>>>
> >>>> Thanks for the clarification. I currently don't have an ipmi enabled
> >>>> hardware (in test environment), but it will be beneficial if you can
> >> help
> >>>> me clear out some basic concepts of it:
> >>>> - If HA-enabled VMs are autostarted on another host when current host
> >> goes
> >>>> down, what is the need or purpose of HA-host? (other than management
> >> server
> >>>> able to remotely control it's power interfaces)
> >>>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH)
> >> approach
> >>>> ACS uses to fence the host, but I couldn't find what mechanism or
> events
> >>>> trigger this?
> >>>>
> >>>> Thanks and regards,
> >>>> Parth Patel
> >>>>
> >>>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com>
> >> wrote:
> >>>>
> >>>>> The management server doesn't ping the host through IPMI.   However
> if
> >>>>> IPMI is not available, you will not be able to use Host HA, as there
> >>>>> is no way for CloudStack to 'fence' the host - that is shut it down
> to
> >>>>> be sure that a VM cannot start again on that host.
> >>>>>
> >>>>> I can explain why that is necessary if you wish.
> >>>>>
> >>>>>
> >>>>> Kind regards,
> >>>>>
> >>>>> Paul Angus
> >>>>>
> >>>>> paul.angus@shapeblue.com
> >>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Parth Patel <pa...@gmail.com>
> >>>>> Sent: 13 March 2018 16:57
> >>>>> To: users@cloudstack.apache.org
> >>>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
> >>>>> Subject: Re: KVM HostHA
> >>>>>
> >>>>> Hi Jon and Victor,
> >>>>>
> >>>>> I think the management server pings your host using ipmi (I really
> >>>>> don't hope this is the case).
> >>>>> In my case, I did not have OOBM enabled at all (my hardware didn't
> >>>>> support
> >>>>> it)
> >>>>> I think you could disable OOBM and/or HA-Host and give that a try :)
> >>>>>
> >>>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> >>>>>
> >>>>>> Hello Guys,
> >>>>>>
> >>>>>> I have tried the following two cases.
> >>>>>>
> >>>>>> 1, "echo c > /proc/sysrq-trigger"
> >>>>>>
> >>>>>> 2, Pulled the network cable of one of the host
> >>>>>>
> >>>>>> In both cases, the following happened.
> >>>>>>
> >>>>>> =====
> >>>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> >>>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> >>>>>> nodes of to disconnect
> >>>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> >>>>>> disconnecting with event AgentDisconnected
> >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> >>>>>> Alert
> >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> >>>>>> for
> >>>>>> 4 with state Alert
> >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> >>>>>> =====
> >>>>>>
> >>>>>> But nothing happened to the VMs on that node. I waited for one
> >>>>>> hour and the VMs on that node have not been migrated to the other
> >>>>>> available hosts. I think the issue is that the management server
> >>>>>> still thinks that the VMs on that host are running. Please check
> >>>>>> the following logs
> >>>>>>
> >>>>>> =======
> >>>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host
> >>>>>> 4
> >>>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> >>>>>> running on host 4
> >>>>>> ========
> >>>>>>
> >>>>>>
> >>>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> >>>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting
> >>>>>>> into the server but it did not stop the server responding to an
> >>>>>>> ipmitool request on the manager eg -
> >>>>>>>
> >>>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> >>>>>>>
> >>>>>>> from the management server got an answer saying the chassis power
> >>>>>>> was on so CS never registered the compute node as down.
> >>>>>>>
> >>>>>>>
> >>>>>>> I am obviously doing something wrong but cannot work it out.
> >>>>>>>
> >>>>>>>
> >>>>>>> The management server has one NIC - 172.16.7.4
> >>>>>>>
> >>>>>>>
> >>>>>>> Each compute node has 3 NICs -
> >>>>>>>
> >>>>>>>
> >>>>>>>                        cnode1          cnode2
> >>>>>>> management NIC         172.16.7.5      172.16.7.6
> >>>>>>> vm NIC                 172.16.6.130    172.16.6.131
> >>>>>>> storage NIC            172.16.250.4    172.16.250.5
> >>>>>>> Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
> >>>>>>>
> >>>>>>>
> >>>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> If I pull the storage NIC presumably nothing will happen as the
> >>>>>>> ipmitool
> >>>>>> check is running across the management NIC so I need to pull both ?
> >>>>>>>
> >>>>>>> My understanding of host HA was the management server monitored
> >>>>>>> the
> >>>>>> compute nodes using ipmitool and if it did not get a response
> >>>>>> because the host was down it would fence off that host and move the
> >>>>>> VMs to an active compute node.
> >>>>>>>
> >>>>>>> This is obviously too simplistic so could someone explain how it
> >>>>>>> is
> >>>>>> meant to work and what it is protecting against ?
> >>>>>>>
> >>>>>>> ________________________________
> >>>>>>> From: Paul Angus <pa...@shapeblue.com>
> >>>>>>> Sent: 13 March 2018 07:01
> >>>>>>> To: users@cloudstack.apache.org
> >>>>>>> Subject: RE: KVM HostHA
> >>>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> One small note, unplugging the management NIC will only cause an
> >>>>>>> HA event if the storage is running over that NIC also.
> >>>>>>>
> >>>>>>> If the storage is over a separate NIC, then the guest VMs will
> >>>>>>> continue to run when the mgmt. NIC is unplugged. Host HA will
> >>>>>>> detect the disk activity and conclude that, as the VMs are still
> >>>>>>> running, there is nothing it can do other than mark the hosts as
> >>>>>>> degraded.
> >>>>>>>
> >>>>>>>
> >>>>>>> Kind regards,
> >>>>>>>
> >>>>>>> Paul Angus
> >>>>>>>
> >>>>>>> paul.angus@shapeblue.com
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Parth Patel <pa...@gmail.com>
> >>>>>>> Sent: 12 March 2018 17:35
> >>>>>>> To: users@cloudstack.apache.org
> >>>>>>> Subject: Re: KVM HostHA
> >>>>>>>
> >>>>>>>> Hi Jon,
> >>>>>>>>
> >>>>>>>> As I said, in my case, making the host HA didn't work, but by just
> >>>>>>>> having an HA VM running on the host and executing (WARNING)
> >>>>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on the
> >>>>>>>> host, the management server registered it as down and started the
> >>>>>>>> VM on another host. I know I've suggested this before but I insist
> >>>>>>>> you give this a try. Also, you don't need to completely power off
> >>>>>>>> the machine manually; just unplugging the network cable works
> >>>>>>>> fine. The cloudstack agent, after losing its connection to the
> >>>>>>>> management server, auto reboots because of the KVM heartbeat check
> >>>>>>>> shell script mentioned by Rohit Yadav in reply to one of my
> >>>>>>>> earlier queries in another thread.
> >>>>>>>>
> >>>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> >>>>> wrote:
> >>>>>>>> Hi Paul
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks for the response.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I think I am not understanding how it was meant to work then. My
> >>>>>>>> understanding was that the manager used ipmitool to just keep
> >>>>>>>> querying the compute nodes as to their status so I assumed it
> >>>>>>>> didn't matter how you shut the node down, once it was down the
> >>>>>>>> manager would get no response and mark it as down (which it does).
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I am in testing mode so I think I will just go and pull the power
> >>>>>>>> and see what happens :)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Jon
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ________________________________
> >>>>>>>> From: Paul Angus <pa...@shapeblue.com>
> >>>>>>>> Sent: 12 March 2018 15:31
> >>>>>>>> To: users@cloudstack.apache.org
> >>>>>>>> Subject: RE: KVM HostHA
> >>>>>>>> Hi Jon,
> >>>>>>>>
> >>>>>>>> I think that what you guys are finding is that a controlled host
> >>>>>>>> shutdown, which will cause the agent to shut down cleanly, is not
> >>>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
> >>>>>>>> action if you shut down a host, only if the host (agent) stops
> >>>>>>>> responding.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Kind regards,
> >>>>>>>>
> >>>>>>>> Paul Angus
> >>>>>>>>
> >>>>>>>> paul.angus@shapeblue.com
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
> >>>>>>>> Sent: 12 March 2018 15:15
> >>>>>>>> To: users@cloudstack.apache.org
> >>>>>>>> Subject: Re: KVM HostHA
> >>>>>>>>
> >>>>>>>> I have the same issue here and am not entirely sure what the
> >>>> behaviour
> >>>>>>>> should be.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi
> >>>>>> working
> >>>>>>>> correctly.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> From the UI under HA -
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> HA Enabled Yes
> >>>>>>>> HA State Available
> >>>>>>>> HA Provider kvmhaprovider
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> although interestingly from the "Details" tab it shows -
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> HA enabled No
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> which I assume is a cosmetic issue ?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On each compute node I have one HA enabled VM and one non HA
> enabled
> >>>>> VM.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I power off a compute node and the UI updates the host status and
> >>>> the
> >>>>>> VMs
> >>>>>>>> on that node stop responding but they never fail over to the other
> >>>>> node.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Couple of things I noticed -
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 1) as soon as i power off the compute node the HA state on the
> other
> >>>>>> node
> >>>>>>>> shows "Ineligible"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2) In the UI the instances all still show as green even though two
> >>>> of
> >>>>>> them
> >>>>>>>> are not available
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Any help much appreciated
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ________________________________
> >>>>>>>> From: victor <vi...@ihnetworks.com>
> >>>>>>>> Sent: 07 March 2018 17:01
> >>>>>>>> To: users@cloudstack.apache.org
> >>>>>>>> Subject: KVM HostHA
> >>>>>>>>
> >>>>>>>> Hello Guys,
> >>>>>>>>
> >>>>>>>> I have installed cloudstack 4.11. I have enabled HA for each
> hosts I
> >>>>>> have
> >>>>>>>> added. I have also added ipmi successfully (using ipmi driver).
> >>>>>>>> The hosts are showing like the following.
> >>>>>>>>
> >>>>>>>> =======
> >>>>>>>>
> >>>>>>>> HA Enabled Yes
> >>>>>>>> HA State Available
> >>>>>>>> HA Provider kvmhaprovider
> >>>>>>>>
> >>>>>>>> ======
> >>>>>>>>
> >>>>>>>> Also the host is showing the following correctly
> >>>>>>>>
> >>>>>>>> Resource state --> Enabled
> >>>>>>>> State --> UP
> >>>>>>>> Power state --> On
> >>>>>>>>
> >>>>>>>> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> >>>>>>>> working. I have waited for half an hour. But nothing has happened.
> >>>>> What
> >>>>>>>> will happen to the VM's in that host, if the host failed to back
> up.
> >>>>>>>> There isn't much from logs.
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> Victor
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Andrija Panic
> >>
> >>
> >
> >
> > --
> >
> > Andrija Panic
>
>

Re: KVM HostHA

Posted by Parth Patel <pa...@gmail.com>.
Hi Paul and Andrija,

I don't know how the Host-HA feature works internally, but my ACS 4.11 does
what Paul explained even without host HA or ipmi access. As I stated
earlier multiple times, without host HA and ipmi, my HA-enabled VMs
running on a normal host get restarted on another suitable host in the
cluster after the ping timeout of approximately 3 minutes. After that,
the cloudstack agent, which has no connection to the management server
because of the unplugged NIC (all my machines currently have only one NIC
and the whole zone is in a flat network), reboots itself (the reason was
explained by Rohit in another thread). The management server marks the host
down, and only the HA-enabled VMs running on it get restarted on another
host (without any mention of host HA, ipmi or fencing in the management
server logs), while normal VMs running on it are stopped.

I don't know if this is the desired outcome, but I think my current ACS 4.11
installation provides (at least some of ;) the features of Host HA without
Host HA or ipmi being configured.

Regards,
Parth Patel
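
For readers comparing this with Host HA: the sequence Paul and Boris
describe in the quoted messages below (health check, then activity check,
then IPMI recovery, then fencing) can be sketched roughly as follows. This
is only an illustration of the decision order as explained in this thread,
not CloudStack code. The names HostObservation, Action, next_action and
max_recovery_attempts are hypothetical, and the real retry count is a
configurable setting whose value is not given here.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "host healthy, nothing to do"
    DEGRADED = "VMs still writing to shared storage: mark host degraded"
    RECOVER = "no disk activity: try to restart the host via IPMI"
    FENCE = "recovery failed: power off via IPMI, trigger VM-HA"


@dataclass
class HostObservation:
    agent_responding: bool    # result of the health check
    disk_activity_seen: bool  # result of the activity check on shared storage
    recovery_attempts: int    # IPMI restarts already attempted


def next_action(obs: HostObservation, max_recovery_attempts: int = 1) -> Action:
    """Return the next step in the order described in the thread.

    max_recovery_attempts is an illustrative default, not a CloudStack value.
    """
    if obs.agent_responding:
        return Action.NONE
    if obs.disk_activity_seen:
        # The VMs are alive, so restarting them elsewhere would be dangerous.
        return Action.DEGRADED
    if obs.recovery_attempts < max_recovery_attempts:
        return Action.RECOVER
    return Action.FENCE


if __name__ == "__main__":
    # Management NIC unplugged while storage runs over a separate NIC.
    print(next_action(HostObservation(False, True, 0)).value)
    # Host truly down and one IPMI restart already attempted.
    print(next_action(HostObservation(False, False, 1)).value)
```

The second check is the reason a host with a separate storage NIC is only
marked degraded in Paul's description: disk activity shows the VMs are
still alive.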

On Wed 14 Mar, 2018, 18:41 Boris Stoyanov, <bo...@shapeblue.com>
wrote:

> yes, KVM + NFS shared storage.
>
> Boris.
>
>
> boris.stoyanov@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
> > On 14 Mar 2018, at 14:51, Andrija Panic <an...@gmail.com> wrote:
> >
> > Hi Boris,
> >
> > ok thanks for the explanation - that makes sense, and covers my
> > "exception case" that I have.
> >
> > This is atm only available for NFS as I could read (KVM on NFS) ?
> >
> > Cheers
> >
> > On 14 March 2018 at 13:02, Boris Stoyanov <bo...@shapeblue.com>
> > wrote:
> >
> >> Hi Andrija,
> >>
> >> There are two types of checks Host-HA does to determine if a host is
> >> healthy.
> >>
> >> 1. Health checks - pings the host as soon as there are connection issues
> >> with the agent
> >>
> >> If that fails,
> >>
> >> 2. Activity checks - checks if there are any write operations on the
> >> disks of the VMs that are running on the host. This is to determine if
> >> the VMs are actually alive and executing processes. Only if no disk
> >> operations are executed on the shared storage does it try to recover
> >> the host with an IPMI call; if that eventually fails, it migrates the
> >> VMs to a healthy host and fences the faulty one.
> >>
> >> Hope that explains your case.
> >>
> >> Boris.
> >>
> >>
> >> boris.stoyanov@shapeblue.com
> >> www.shapeblue.com
> >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >> @shapeblue
> >>
> >>
> >>
> >>> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com>
> wrote:
> >>>
> >>> Hi Paul,
> >>>
> >>> sorry to bump in the middle of the thread, but just curious about the
> >>> idea behind host-HA and why it behaves the way you explained above:
> >>>
> >>>
> >>> Would it make more sense (or not?), when MGMT detects that the agent
> >>> or host is unreachable (or after an unsuccessful agent restart, etc.,
> >>> to be defined), to actually use IPMI to STONITH the node, thus making
> >>> sure no VMs are running, and then to really start all HA-enabled VMs
> >>> on other hosts?
> >>>
> >>> I'm just trying to draw a parallel to corosync/pacemaker as a
> >>> clustering suite/services in Linux (RHEL and others), where, when the
> >>> majority of nodes detect that one node is down, a common thing
> >>> (especially for shared storage) is to STONITH that node, make sure
> >>> it's down, then move the "resource" (in our case VMs) to other
> >>> cluster nodes?
> >>>
> >>> I see it's actually a much broader setup per
> >>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but
> >>> again, the whole idea (in my head at least...) is that when a host
> >>> goes down, we make sure it's down (avoid VM corruption by doing
> >>> STONITH on that node) and then start HA VMs on other hosts.
> >>>
> >>> I understand there might be exceptions like the one I have right now
> >>> (4.8): libvirt gets stuck (librbd exception or similar) so the agent
> >>> gets disconnected, but VMs are still running fine... (except the DB
> >>> gets messed up, all NICs lose isolation_uri, VRs lose MAC addresses
> >>> and other IP addresses etc...)
> >>>
> >>>
> >>> Thanks
> >>> Andrija
> >>>
> >>>
> >>>
> >>>
> >>> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:
> >>>
> >>>> That would make sense.
> >>>>
> >>>>
> >>>> I have another server being used for something else at the moment so I
> >>>> will add that in and update this thread when I have tested
> >>>>
> >>>>
> >>>> Jon
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Paul Angus <pa...@shapeblue.com>
> >>>> Sent: 14 March 2018 09:16
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: RE: KVM HostHA
> >>>>
> >>>> I'd need to do some testing, but I suspect that your problem is that
> you
> >>>> only have two hosts.  At the point that one host is deemed out of
> >> service,
> >>>> you only have one host left.  With only one host, CloudStack will show
> >> the
> >>>> cluster as ineligible.
> >>>>
> >>>> It is extremely common for any system working as a cluster to require
> a
> >>>> minimum starting point of 3 nodes to be able to function.
> >>>>
> >>>>
> >>>> Kind regards,
> >>>>
> >>>> Paul Angus
> >>>>
> >>>> paul.angus@shapeblue.com
> >>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >>>> @shapeblue
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Jon Marshall <jm...@hotmail.co.uk>
> >>>> Sent: 14 March 2018 08:36
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: Re: KVM HostHA
> >>>>
> >>>> Hi Paul
> >>>>
> >>>>
> >>>> My testing does indeed end up with the failed host in maintenance mode
> >> but
> >>>> the VMs are never migrated. As I posted earlier the management server
> >> seems
> >>>> to be saying there is no other host that the VM can be migrated to.
> >>>>
> >>>>
> >>>> Couple of questions if you have the time to respond -
> >>>>
> >>>>
> >>>> 1) this article seems to suggest that rebooting or powering off a host
> >>>> will result in the VMs being migrated, and this was on CS v 4.2.1 back
> >>>> in 2013, so does Host HA do something different?
> >>>>
> >>>>
> >>>> 2) Whenever one of my two nodes is taken down in testing, the active
> >>>> compute node's HA status goes from Available to Ineligible. Should this
> >>>> happen, i.e. is its going to Ineligible stopping the manager from
> >>>> migrating the VMs?
> >>>>
> >>>>
> >>>> Apologies for all the questions but I just can't get this to work at
> the
> >>>> moment. If I do eventually get it working I will do a write up for
> >> others
> >>>> with same issue :)
> >>>>
> >>>>
> >>>> ________________________________
> >>>> From: Paul Angus <pa...@shapeblue.com>
> >>>> Sent: 14 March 2018 07:45
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: RE: KVM HostHA
> >>>>
> >>>> Hi Parth,
> >>>>
> >>>> To answer your questions: VM-HA does not restart VMs on an alternate
> >>>> host if the original host goes down.  The management server (without
> >>>> host-HA) cannot tell what happened to the host.  It cannot tell whether
> >>>> there was a failure in the agent, a loss of connectivity to the
> >>>> management NIC, or whether the host is truly down.  In the first two
> >>>> scenarios, the guest VMs can still be running perfectly well, and to
> >>>> restart them elsewhere would be very dangerous.  Therefore, the correct
> >>>> thing to do is nothing but alert the operator.  These scenarios are what
> >>>> Host-HA was introduced for.
> >>>>
> >>>> Wrt STONITH, if no disk activity is detected on the host, host-HA will
> >>>> try to restart the host (via IPMI). If, after a configurable number of
> >>>> attempts, the host agent still does not check in, then host-HA will shut
> >>>> down the host (via IPMI), trigger VM-HA and mark the host as
> >>>> in-maintenance.
> >>>>
> >>>>
> >>>>
> >>>> paul.angus@shapeblue.com
> >>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Parth Patel <pa...@gmail.com>
> >>>> Sent: 14 March 2018 05:05
> >>>> To: users@cloudstack.apache.org
> >>>> Subject: Re: KVM HostHA
> >>>>
> >>>> Hi Paul,
> >>>>
> >>>> Thanks for the clarification. I currently don't have an ipmi enabled
> >>>> hardware (in test environment), but it will be beneficial if you can
> >> help
> >>>> me clear out some basic concepts of it:
> >>>> - If HA-enabled VMs are autostarted on another host when the current
> >>>> host goes down, what is the need for or purpose of Host HA (other than
> >>>> the management server being able to remotely control its power
> >>>> interfaces)?
> >>>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH)
> >>>> approach ACS uses to fence the host, but I couldn't find what mechanism
> >>>> or events trigger this.
> >>>>
> >>>> Thanks and regards,
> >>>> Parth Patel
> >>>>
> >>>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com>
> >> wrote:
> >>>>
> >>>>> The management server doesn't ping the host through IPMI.   However
> if
> >>>>> IPMI is not available, you will not be able to use Host HA, as there
> >>>>> is no way for CloudStack to 'fence' the host - that is shut it down
> to
> >>>>> be sure that a VM cannot start again on that host.
> >>>>>
> >>>>> I can explain why that is necessary if you wish.
> >>>>>
> >>>>>
> >>>>> Kind regards,
> >>>>>
> >>>>> Paul Angus
> >>>>>
> >>>>> paul.angus@shapeblue.com
> >>>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Parth Patel <pa...@gmail.com>
> >>>>> Sent: 13 March 2018 16:57
> >>>>> To: users@cloudstack.apache.org
> >>>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
> >>>>> Subject: Re: KVM HostHA
> >>>>>
> >>>>> Hi Jon and Victor,
> >>>>>
> >>>>> I think the management server pings your host using ipmi (I really
> >>>>> don't hope this is the case).
> >>>>> In my case, I did not have OOBM enabled at all (my hardware didn't
> >>>>> support
> >>>>> it)
> >>>>> I think you could disable OOBM and/or HA-Host and give that a try :)
> >>>>>
> >>>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> >>>>>
> >>>>>> Hello Guys,
> >>>>>>
> >>>>>> I have tried the following two cases.
> >>>>>>
> >>>>>> 1, "echo c > /proc/sysrq-trigger"
> >>>>>>
> >>>>>> 2, Pulled the network cable of one of the host
> >>>>>>
> >>>>>> In both cases, the following happened.
> >>>>>>
> >>>>>> =====
> >>>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> >>>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> >>>>>> nodes of to disconnect
> >>>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> >>>>>> disconnecting with event AgentDisconnected
> >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> >>>>>> Alert
> >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> >>>>>> for
> >>>>>> 4 with state Alert
> >>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> >>>>>> =====
> >>>>>>
> >>>>>> But nothing happened to the VMs on that node. I have waited for
> >>>>>> one hour and the VMs on that node have not been migrated to the other
> >>>>>> available hosts. I think the issue is that the management server
> >>>>>> still thinks that the VMs on that host are running. Please check the
> >>>>>> following logs
> >>>>>>
> >>>>>> =======
> >>>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host
> >>>>>> 4
> >>>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> >>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> >>>>>> running on host 4
> >>>>>> ========
> >>>>>>
> >>>>>>
> >>>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> >>>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting
> >>>>>>> into the
> >>>>>> server but it did not stop the server responding to an ipmitool
> >>>>>> request on the manager eg -
> >>>>>>>
> >>>>>>>
> >>>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> >>>>> status"
> >>>>>>>
> >>>>>>>
> >>>>>>> from the management server got an answer saying the chassis power
> >>>>>>> was on
> >>>>>> so CS never registered the compute node as down.
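
The observation above is worth spelling out: IPMI talks to the BMC, which reports chassis power independently of the operating system, so a host with a crashed kernel still answers "power on". A minimal Python sketch follows, assuming the same ipmitool invocation quoted above; the helper names are hypothetical and the sample output line in the usage note is illustrative, not captured from these hosts.

```python
import shlex
from typing import List


def ipmi_chassis_status_cmd(bmc_host: str, user: str, password: str) -> List[str]:
    """Build the argv for the 'chassis status' query quoted in the thread."""
    return [
        "ipmitool", "-I", "lanplus",
        "-H", bmc_host, "-U", user, "-P", password,
        "chassis", "status",
    ]


def system_power_is_on(chassis_status_output: str) -> bool:
    """Return True if the 'System Power' line of the output reads 'on'.

    This only reflects the chassis power state as seen by the BMC. It says
    nothing about whether the OS, the agent or the VMs are alive.
    """
    for line in chassis_status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "System Power":
            return value.strip().lower() == "on"
    raise ValueError("no 'System Power' line found in output")


if __name__ == "__main__":
    # Placeholder address and credentials, for illustration only.
    cmd = ipmi_chassis_status_cmd("192.0.2.10", "someuser", "somepassword")
    print(shlex.join(cmd))
```

Feeding it a line such as "System Power         : on" returns True even when the kernel has panicked, which is why a power-state query alone cannot serve as a health check.
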
> >>>>>>>
> >>>>>>>
> >>>>>>> I am obviously doing something wrong but cannot work it out.
> >>>>>>>
> >>>>>>>
> >>>>>>> The management server has one NIC - 172.16.7.4
> >>>>>>>
> >>>>>>>
> >>>>>>> Each compute node has 3 NICs -
> >>>>>>>
> >>>>>>>
> >>>>>>>                          cnode1          cnode2
> >>>>>>> management NIC           172.16.7.5      172.16.7.6
> >>>>>>> vm NIC                   172.16.6.130    172.16.6.131
> >>>>>>> storage NIC              172.16.250.4    172.16.250.5
> >>>>>>> Dell LOM (for iDRAC)     172.16.7.29     172.16.7.30
> >>>>>>>
> >>>>>>>
> >>>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> If I pull the storage NIC presumably nothing will happen as the
> >>>>>>> ipmitool
> >>>>>> check is running across the management NIC so I need to pull both ?
> >>>>>>>
> >>>>>>> My understanding of host HA was the management server monitored
> >>>>>>> the
> >>>>>> compute nodes using ipmitool and if it did not get a response
> >>>>>> because the host was down it would fence off that host and move the
> >>>>>> VMs to an active compute node.
> >>>>>>>
> >>>>>>> This is obviously too simplistic so could someone explain how it
> >>>>>>> is
> >>>>>> meant to work and what it is protecting against ?
> >>>>>>>
> >>>>>>> ________________________________
> >>>>>>> From: Paul Angus <pa...@shapeblue.com>
> >>>>>>> Sent: 13 March 2018 07:01
> >>>>>>> To: users@cloudstack.apache.org
> >>>>>>> Subject: RE: KVM HostHA
> >>>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> One small note: unplugging the management NIC will only cause an HA
> >>>>>>> event if the storage is also running over that NIC.
> >>>>>>>
> >>>>>>> If the storage is on a separate NIC, the guest VMs will continue to
> >>>>>>> run when the mgmt. NIC is unplugged. Host HA will detect the disk
> >>>>>>> activity and conclude that there is nothing it can do other than mark
> >>>>>>> the host as degraded, as the VMs are still running.
> >>>>>>>
> >>>>>>>
> >>>>>>> Kind regards,
> >>>>>>>
> >>>>>>> Paul Angus
> >>>>>>>
> >>>>>>> paul.angus@shapeblue.com
> >>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Parth Patel <pa...@gmail.com>
> >>>>>>> Sent: 12 March 2018 17:35
> >>>>>>> To: users@cloudstack.apache.org
> >>>>>>> Subject: Re: KVM HostHA
> >>>>>>>
> >>>>>>>> Hi Jon,
> >>>>>>>>
> >>>>>>>> As I said, in my case, making the host HA didn't work, but with just
> >>>>>>>> an HA VM running on the host, executing (WARNING)
> >>>>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on the host
> >>>>>>>> made the management server register it as down and start the VM on
> >>>>>>>> another host. I know I've suggested this before, but I insist you
> >>>>>>>> give this a try. Also, you don't need to completely power off the
> >>>>>>>> machine manually; just unplugging the network cable works fine.
> >>>>>>>> After losing its connection to the management server, the cloudstack
> >>>>>>>> agent auto reboots because of the KVM heartbeat check shell script
> >>>>>>>> mentioned by Rohit Yadav in reply to one of my earlier queries in
> >>>>>>>> another thread.
> >>>>>>>>
> >>>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> >>>>> wrote:
> >>>>>>>> Hi Paul
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks for the response.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I think I am not understanding how it was meant to work then. My
> >>>>>>>> understanding was that the manager used ipmitool to just keep
> >>>>>>>> querying the compute nodes as to their status so I assumed it
> >>>>>>>> didn't matter how you shut the node down, once it was down the
> >>>>>>>> manager would get no response and mark it as down (which it does).
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I am in testing mode so I think I will just go and pull the power
> >>>>>>>> and see what happens :)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Jon
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ________________________________
> >>>>>>>> From: Paul Angus <pa...@shapeblue.com>
> >>>>>>>> Sent: 12 March 2018 15:31
> >>>>>>>> To: users@cloudstack.apache.org
> >>>>>>>> Subject: RE: KVM HostHA
> >>>>>>>> Hi Jon,
> >>>>>>>>
> >>>>>>>> I think that what you guys are finding is that a controlled host
> >>>>>>>> shutdown, which causes the agent to shut down cleanly, is not
> >>>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
> >>>>>>>> action if you shut down a host, only if the host (agent) stops
> >>>>>>>> responding.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Kind regards,
> >>>>>>>>
> >>>>>>>> Paul Angus
> >>>>>>>>
> >>>>>>>> paul.angus@shapeblue.com
> >>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
> >>>>>>>> Sent: 12 March 2018 15:15
> >>>>>>>> To: users@cloudstack.apache.org
> >>>>>>>> Subject: Re: KVM HostHA
> >>>>>>>>
> >>>>>>>> I have the same issue here and am not entirely sure what the
> >>>> behaviour
> >>>>>>>> should be.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi
> >>>>>> working
> >>>>>>>> correctly.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> From the UI under HA -
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> HA Enabled Yes
> >>>>>>>> HA State Available
> >>>>>>>> HA Provider kvmhaprovider
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> although interestingly from the "Details" tab it shows -
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> HA enabled No
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> which I assume is a cosmetic issue ?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On each compute node I have one HA enabled VM and one non HA
> enabled
> >>>>> VM.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I power off a compute node and the UI updates the host status and
> >>>> the
> >>>>>> VMs
> >>>>>>>> on that node stop responding but they never fail over to the other
> >>>>> node.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Couple of things I noticed -
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 1) as soon as i power off the compute node the HA state on the
> other
> >>>>>> node
> >>>>>>>> shows "Ineligible"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2) In the UI the instances all still show as green even though two
> >>>> of
> >>>>>> them
> >>>>>>>> are not available
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Any help much appreciated
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ________________________________
> >>>>>>>> From: victor <vi...@ihnetworks.com>
> >>>>>>>> Sent: 07 March 2018 17:01
> >>>>>>>> To: users@cloudstack.apache.org
> >>>>>>>> Subject: KVM HostHA
> >>>>>>>>
> >>>>>>>> Hello Guys,
> >>>>>>>>
> >>>>>>>> I have installed cloudstack 4.11. I have enabled HA for each
> hosts I
> >>>>>> have
> >>>>>>>> added. I have also added ipmi successfully (using ipmi driver).
> >>>>>>>> The hosts are showing like the following.
> >>>>>>>>
> >>>>>>>> =======
> >>>>>>>>
> >>>>>>>> HA Enabled Yes
> >>>>>>>> HA State Available
> >>>>>>>> HA Provider kvmhaprovider
> >>>>>>>>
> >>>>>>>> ======
> >>>>>>>>
> >>>>>>>> Also the host is showing the following correctly
> >>>>>>>>
> >>>>>>>> Resource state --> Enabled
> >>>>>>>> State --> UP
> >>>>>>>> Power state --> On
> >>>>>>>>
> >>>>>>>> So I have shut down one of the hosts to see how KVM Host HA is
> >>>>>>>> working. I have waited for half an hour, but nothing has happened.
> >>>>>>>> What will happen to the VMs on that host if the host fails to come
> >>>>>>>> back up? There isn't much in the logs.
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> Victor
> >>>>>>>>
> >>> --
> >>>
> >>> Andrija Panić

Re: KVM HostHA

Posted by Boris Stoyanov <bo...@shapeblue.com>.
yes, KVM + NFS shared storage. 

Boris. 


boris.stoyanov@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 

> On 14 Mar 2018, at 14:51, Andrija Panic <an...@gmail.com> wrote:
> 
> Hi Boris,
> 
> ok thanks for the explanation - that makes sense, and covers my "exception
> case" that I have.
> 
> As far as I could read, this is currently only available for NFS (KVM on NFS)?
> 
> Cheers
> 
> On 14 March 2018 at 13:02, Boris Stoyanov <bo...@shapeblue.com>
> wrote:
> 
>> Hi Andrija,
>> 
>> Host-HA performs two types of checks to determine whether a host is
>> healthy.
>> 
>> 1. Health checks - ping the host as soon as there are connection issues
>> with the agent.
>> 
>> If that fails,
>> 
>> 2. Activity checks - check whether there are any write operations on the
>> disks of the VMs running on the host. This is to determine whether the
>> VMs are actually alive and executing processes. Only if no disk operations
>> are seen on the shared storage does it try to recover the host with an
>> IPMI call; if that eventually fails, it migrates the VMs to a healthy
>> host and fences the faulty one.
>> 
>> Hope that explains your case.
>> 
>> Boris.
>> 
>> 
>> boris.stoyanov@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>> @shapeblue
>> 
>> 
>> 
>>> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> sorry to bump in the middle of the thread, but just curious about the idea
>>> behind host-HA and why it behaves the way you explained above:
>>> 
>>> 
>>> Would it make more sense (or not?), when MGMT detects that the agent or
>>> the host is unreachable (or after an unsuccessful agent restart, etc.,
>>> to be defined), to actually use IPMI to STONITH the node, thus
>>> making sure no VMs are running, and then to really start all HA-enabled VMs
>>> on other hosts?
>>> 
>>> I'm just trying to draw a parallel to corosync/pacemaker as a clustering
>>> suite in Linux (RHEL and others), where, when the majority of nodes
>>> detect that one node is down, a common thing (especially for shared
>>> storage) is to STONITH that node, make sure it's down, then move the
>>> "resource" (in our case VMs) to other cluster nodes?
>>> 
>>> I see it's actually a much broader setup per
>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but again -
>>> the whole idea (in my head at least...) is that when a host goes down, we
>>> make sure it's down (avoid VM corruption, by doing STONITH on that node) and
>>> then start HA VMs on other hosts.
>>> 
>>> I understand there might be exceptions like the one I have right now (4.8) -
>>> libvirt gets stuck (librbd exception or similar) so the agent gets
>>> disconnected, but VMs are still running fine... (except the DB gets messed
>>> up, all NICs lose isolation_uri, VRs lose MAC addresses and other IP
>>> addresses etc...)
>>> 
>>> 
>>> Thanks
>>> Andrija
>>> 
>>> 
>>> 
>>> 
>>> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:
>>> 
>>>> That would make sense.
>>>> 
>>>> 
>>>> I have another server being used for something else at the moment so I
>>>> will add that in and update this thread when I have tested
>>>> 
>>>> 
>>>> Jon
>>>> 
>>>> 
>>>> ________________________________
>>>> From: Paul Angus <pa...@shapeblue.com>
>>>> Sent: 14 March 2018 09:16
>>>> To: users@cloudstack.apache.org
>>>> Subject: RE: KVM HostHA
>>>> 
>>>> I'd need to do some testing, but I suspect that your problem is that you
>>>> only have two hosts.  At the point that one host is deemed out of service,
>>>> you only have one host left.  With only one host, CloudStack will show the
>>>> cluster as ineligible.
>>>> 
>>>> It is extremely common for any system working as a cluster to require a
>>>> minimum starting point of 3 nodes to be able to function.
>>>> 
>>>> 
>>>> Kind regards,
>>>> 
>>>> Paul Angus
>>>> 
>>>> paul.angus@shapeblue.com
>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>> [http://www.shapeblue.com/wp-content/uploads/2017/06/logo.png]<
>>>> http://www.shapeblue.com/>
>>>> 
>>>> Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
>>>> www.shapeblue.com
>>>> Rapid deployment framework for Apache CloudStack IaaS Clouds. CSForge
>> is a
>>>> framework developed by ShapeBlue to deliver the rapid deployment of a
>>>> standardised ...
>>>> 
>>>> 
>>>> 
>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>>>> @shapeblue
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Jon Marshall <jm...@hotmail.co.uk>
>>>> Sent: 14 March 2018 08:36
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: KVM HostHA
>>>> 
>>>> Hi Paul
>>>> 
>>>> 
>>>> My testing does indeed end up with the failed host in maintenance mode but
>>>> the VMs are never migrated. As I posted earlier, the management server
>>>> seems to be saying there is no other host that the VM can be migrated to.
>>>> 
>>>> 
>>>> Couple of questions if you have the time to respond -
>>>> 
>>>> 
>>>> 1) this article seems to suggest that rebooting or powering off a host will
>>>> result in the VMs being migrated, and this was on CS v 4.2.1 back in 2013,
>>>> so does Host HA do something different?
>>>> 
>>>> 
>>>> 2) Whenever one of my two nodes is taken down in testing, the active
>>>> compute node's HA status goes from Available to Ineligible. Should this
>>>> happen, i.e. is going to Ineligible what stops the manager from migrating
>>>> the VMs?
>>>> 
>>>> 
>>>> Apologies for all the questions but I just can't get this to work at the
>>>> moment. If I do eventually get it working I will do a write-up for others
>>>> with the same issue :)
>>>> 
>>>> 
>>>> ________________________________
>>>> From: Paul Angus <pa...@shapeblue.com>
>>>> Sent: 14 March 2018 07:45
>>>> To: users@cloudstack.apache.org
>>>> Subject: RE: KVM HostHA
>>>> 
>>>> Hi Parth,
>>>> 
>>>> To answer your questions: VM-HA does not restart VMs on an alternate host
>>>> if the original host goes down.  The management server (without host-HA)
>>>> cannot tell what happened to the host.  It cannot tell if there was a
>>>> failure in the agent, loss of connectivity to the management NIC or if the
>>>> host is truly down.  In the first two scenarios, the guest VMs can still be
>>>> running perfectly well, and to restart them elsewhere would be very
>>>> dangerous.  Therefore, the correct thing to do is nothing but alert the
>>>> operator.  These scenarios are what Host-HA was introduced for.
>>>> 
>>>> With regard to STONITH, if no disk activity is detected on the host, host-HA
>>>> will try to restart the host (via IPMI). If, after a configurable number of
>>>> attempts, the host agent still does not check in, then host-HA will shut
>>>> down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
>>>> 
>>>> 
>>>> 
>>>> paul.angus@shapeblue.com
>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Parth Patel <pa...@gmail.com>
>>>> Sent: 14 March 2018 05:05
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: KVM HostHA
>>>> 
>>>> Hi Paul,
>>>> 
>>>> Thanks for the clarification. I currently don't have IPMI-enabled
>>>> hardware (in my test environment), but it would be helpful if you could
>>>> help me clear up some basic concepts:
>>>> - If HA-enabled VMs are autostarted on another host when the current host
>>>> goes down, what is the need or purpose of HA-host (other than the management
>>>> server being able to remotely control its power interfaces)?
>>>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH) approach
>>>> ACS uses to fence the host, but I couldn't find what mechanism or events
>>>> trigger this.
>>>> 
>>>> Thanks and regards,
>>>> Parth Patel
>>>> 
>>>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com>
>> wrote:
>>>> 
>>>>> The management server doesn't ping the host through IPMI.   However if
>>>>> IPMI is not available, you will not be able to use Host HA, as there
>>>>> is no way for CloudStack to 'fence' the host - that is shut it down to
>>>>> be sure that a VM cannot start again on that host.
>>>>> 
>>>>> I can explain why that is necessary if you wish.
>>>>> 
>>>>> 
>>>>> Kind regards,
>>>>> 
>>>>> Paul Angus
>>>>> 
>>>>> paul.angus@shapeblue.com
>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Parth Patel <pa...@gmail.com>
>>>>> Sent: 13 March 2018 16:57
>>>>> To: users@cloudstack.apache.org
>>>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
>>>>> Subject: Re: KVM HostHA
>>>>> 
>>>>> Hi Jon and Victor,
>>>>> 
>>>>> I think the management server pings your host using ipmi (I really
>>>>> hope this is not the case).
>>>>> In my case, I did not have OOBM enabled at all (my hardware didn't
>>>>> support it).
>>>>> I think you could disable OOBM and/or HA-Host and give that a try :)
>>>>> 
>>>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>>>>> 
>>>>>> Hello Guys,
>>>>>> 
>>>>>> I have tried the following two cases.
>>>>>> 
>>>>>> 1, "echo c > /proc/sysrq-trigger"
>>>>>> 
>>>>>> 2, Pulled the network cable of one of the host
>>>>>> 
>>>>>> In both cases, the following happened.
>>>>>> 
>>>>>> =====
>>>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
>>>>>> nodes of to disconnect
>>>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
>>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
>>>>>> disconnecting with event AgentDisconnected
>>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
>>>>>> Alert
>>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
>>>>>> for
>>>>>> 4 with state Alert
>>>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
>>>>>> =====
>>>>>> 
>>>>>> But nothing happened for the VMs on that node. I have waited for
>>>>>> one hour and the VMs on that node have not been migrated to the other
>>>>>> available hosts. I think the issue is that the management server
>>>>>> still thinks that the VMs on that host are running. Please check the
>>>>>> following logs
>>>>>> 
>>>>>> =======
>>>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host
>>>>>> 4
>>>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
>>>>>> running on host 4
>>>>>> ========
>>>>>> 
>>>>>> 
>>>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
>>>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting
>>>>>>> into the server but it did not stop the server responding to an
>>>>>>> ipmitool request on the manager eg -
>>>>>>> 
>>>>>>> 
>>>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
>>>>>>> 
>>>>>>> 
>>>>>>> from the management server got an answer saying the chassis power
>>>>>>> was on so CS never registered the compute node as down.
>>>>>>> 
>>>>>>> 
>>>>>>> I am obviously doing something wrong but cannot work it out.
>>>>>>> 
>>>>>>> 
>>>>>>> The management server has one NIC - 172.16.7.4
>>>>>>> 
>>>>>>> 
>>>>>>> Each compute node has 3 NICs -
>>>>>>> 
>>>>>>> 
>>>>>>>                        cnode1          cnode2
>>>>>>> 
>>>>>>> management NIC         172.16.7.5      172.16.7.6
>>>>>>> vm NIC                 172.16.6.130    172.16.6.131
>>>>>>> storage NIC            172.16.250.4    172.16.250.5
>>>>>>> Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
>>>>>>> 
>>>>>>> 
>>>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> If I pull the storage NIC presumably nothing will happen as the
>>>>>>> ipmitool check is running across the management NIC so I need to
>>>>>>> pull both ?
>>>>>>> 
>>>>>>> My understanding of host HA was the management server monitored
>>>>>>> the compute nodes using ipmitool and if it did not get a response
>>>>>>> because the host was down it would fence off that host and move the
>>>>>>> VMs to an active compute node.
>>>>>>> 
>>>>>>> This is obviously too simplistic so could someone explain how it
>>>>>>> is meant to work and what it is protecting against ?
>>>>>>> 
>>>>>>> ________________________________
>>>>>>> From: Paul Angus <pa...@shapeblue.com>
>>>>>>> Sent: 13 March 2018 07:01
>>>>>>> To: users@cloudstack.apache.org
>>>>>>> Subject: RE: KVM HostHA
>>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> One small note, unplugging the management NIC will only cause an
>>>>>>> HA event if the storage is running over that NIC also.
>>>>>>> 
>>>>>>> If the storage is over a separate NIC, then the guest VMs will
>>>>>>> continue to run when the mgmt. NIC is unplugged. Host HA will detect
>>>>>>> the disk activity and conclude that there is nothing it can do, as
>>>>>>> the VMs are still running, other than mark the host as degraded.
>>>>>>> 
>>>>>>> 
>>>>>>> Kind regards,
>>>>>>> 
>>>>>>> Paul Angus
>>>>>>> 
>>>>>>> paul.angus@shapeblue.com
>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Parth Patel <pa...@gmail.com>
>>>>>>> Sent: 12 March 2018 17:35
>>>>>>> To: users@cloudstack.apache.org
>>>>>>> Subject: Re: KVM HostHA
>>>>>>> 
>>>>>>>> Hi Jon,
>>>>>>>> 
>>>>>>>> As I said, in my case, making the host HA didn't work but by just
>>>>>>>> having a HA VM running on host and executing - (WARNING)
>>>>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on host, the
>>>>>>>> management server registered it as down and started the VM on
>>>>>>>> another host. I know I've suggested this before but I insist you
>>>>>>>> give this a try. Also, you don't need to completely power off the
>>>>>>>> machine manually but just plugging out the network cable works
>>>>>>>> fine. The cloudstack agent after losing connection to management
>>>>>>>> server auto reboots because of KVM heartbeat check shell script
>>>>>>>> mentioned by Rohit Yadav to one of my earlier queries in other thread.
>>>>>>>> 
>>>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
>>>>> wrote:
>>>>>>>> Hi Paul
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks for the response.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I think I am not understanding how it was meant to work then. My
>>>>>>>> understanding was that the manager used ipmitool to just keep
>>>>>>>> querying the compute nodes as to their status so I assumed it
>>>>>>>> didn't matter how you shut the node down, once it was down the
>>>>>>>> manager would get no response and mark it as down (which it does).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I am in testing mode so I think I will just go and pull the power
>>>>>>>> and see what happens :)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Jon
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ________________________________
>>>>>>>> From: Paul Angus <pa...@shapeblue.com>
>>>>>>>> Sent: 12 March 2018 15:31
>>>>>>>> To: users@cloudstack.apache.org
>>>>>>>> Subject: RE: KVM HostHA
>>>>>>>> Hi Jon,
>>>>>>>> 
>>>>>>>> I think that what you guys are finding is that a controlled host
>>>>>>>> shutdown, which will cause the agent to shut down cleanly, is not
>>>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
>>>>>>>> action if you shut down a host, only if the host (agent) stops
>>>>>>>> responding.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Kind regards,
>>>>>>>> 
>>>>>>>> Paul Angus
>>>>>>>> 
>>>>>>>> paul.angus@shapeblue.com
>>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
>>>>>>>> Sent: 12 March 2018 15:15
>>>>>>>> To: users@cloudstack.apache.org
>>>>>>>> Subject: Re: KVM HostHA
>>>>>>>> 
>>>>>>>> I have the same issue here and am not entirely sure what the
>>>>>>>> behaviour should be.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi
>>>>>>>> working correctly.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> From the UI under HA -
>>>>>>>> 
>>>>>>>> 
>>>>>>>> HA Enabled Yes
>>>>>>>> HA State Available
>>>>>>>> HA Provider kvmhaprovider
>>>>>>>> 
>>>>>>>> 
>>>>>>>> although interestingly from the "Details" tab it shows -
>>>>>>>> 
>>>>>>>> 
>>>>>>>> HA enabled No
>>>>>>>> 
>>>>>>>> 
>>>>>>>> which I assume is a cosmetic issue ?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On each compute node I have one HA enabled VM and one non HA enabled
>>>>>>>> VM.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I power off a compute node and the UI updates the host status and
>>>>>>>> the VMs on that node stop responding but they never fail over to the
>>>>>>>> other node.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Couple of things I noticed -
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 1) as soon as I power off the compute node the HA state on the other
>>>>>>>> node shows "Ineligible"
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2) In the UI the instances all still show as green even though two
>>>>>>>> of them are not available
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Any help much appreciated
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ________________________________
>>>>>>>> From: victor <vi...@ihnetworks.com>
>>>>>>>> Sent: 07 March 2018 17:01
>>>>>>>> To: users@cloudstack.apache.org
>>>>>>>> Subject: KVM HostHA
>>>>>>>> 
>>>>>>>> Hello Guys,
>>>>>>>> 
>>>>>>>> I have installed cloudstack 4.11. I have enabled HA for each host I
>>>>>>>> have added. I have also added ipmi successfully (using ipmi driver).
>>>>>>>> The hosts are showing like the following.
>>>>>>>> 
>>>>>>>> =======
>>>>>>>> 
>>>>>>>> HA Enabled Yes
>>>>>>>> HA State Available
>>>>>>>> HA Provider kvmhaprovider
>>>>>>>> 
>>>>>>>> ======
>>>>>>>> 
>>>>>>>> Also the host is showing the following correctly
>>>>>>>> 
>>>>>>>> Resource state --> Enabled
>>>>>>>> State --> UP
>>>>>>>> Power state --> On
>>>>>>>> 
>>>>>>>> So I have shut down one of the hosts to see how the KVM host HA is
>>>>>>>> working. I have waited for half an hour, but nothing has happened.
>>>>>>>> What will happen to the VMs on that host if the host fails to come
>>>>>>>> back up? There isn't much in the logs.
>>>>>>>> 
>>>>>>>> Regards
>>>>>>>> Victor
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> 
>>> Andrija Panić
>> 
>> 
> 
> 
> -- 
> 
> Andrija Panić


Re: KVM HostHA

Posted by Andrija Panic <an...@gmail.com>.
Hi Boris,

OK, thanks for the explanation - that makes sense, and covers the "exception
case" that I have.

From what I could read, this is currently only available for NFS (KVM on NFS)?

Cheers

On 14 March 2018 at 13:02, Boris Stoyanov <bo...@shapeblue.com>
wrote:

> Hi Andrija,
>
> There are two types of checks Host-HA does to determine whether a host is
> healthy.
>
> 1. Health checks - pings the host as soon as there are connection issues
> with the agent.
>
> If that fails,
>
> 2. Activity checks - checks whether there are any write operations on the
> disks of the VMs that are running on the host. This is to determine if the
> VMs are actually alive and executing processes. Only if no disk operations
> are executed on the shared storage does it try to recover the
> host with an IPMI call. If that eventually fails, it migrates the VMs to a
> healthy host and fences the faulty one.
>
> Hope that explains your case.
>
> Boris.
>
>
> boris.stoyanov@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
> > On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com> wrote:
> >
> > Hi Paul,
> >
> > sorry to jump into the middle of the thread, but I'm just curious about
> > the idea behind host-HA and why it behaves the way you explained above:
> >
> >
> > Would it make more sense (or not?), when MGMT detects that the agent or
> > the host is unreachable (or after e.g. an unsuccessful agent restart,
> > etc., to be defined), to actually use IPMI to STONITH the node, thus
> > making sure no VMs are running, and then to really start all HA-enabled
> > VMs on other hosts?
> >
> > I'm just trying to draw a parallel to corosync/pacemaker as a clustering
> > suite/services in Linux (RHEL and others), where, when the majority of
> > nodes detect that one node is down, a common thing (especially for shared
> > storage) is to STONITH that node, make sure it's down, then move the
> > "resource" (in our case VMs) to other cluster nodes?
> >
> > I see it's actually a much broader setup per
> > https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but again,
> > the whole idea (in my head at least...) is that when a host goes down, we
> > make sure it's down (avoid VM corruption by doing STONITH on that node)
> > and then start HA VMs on other hosts.
> >
> > I understand there might be exceptions, as I have right now (4.8): libvirt
> > gets stuck (librbd exception or similar) so the agent gets disconnected,
> > but VMs are still running fine... (except the DB gets messed up, all NICs
> > lose isolation_uri, VRs lose MAC addresses and other IP addresses, etc.)
> >
> >
> > Thanks
> > Andrija
> >
> >
> >
> >
> > On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:
> >
> >> That would make sense.
> >>
> >>
> >> I have another server being used for something else at the moment so I
> >> will add that in and update this thread when I have tested
> >>
> >>
> >> Jon
> >>
> >>
> >> ________________________________
> >> From: Paul Angus <pa...@shapeblue.com>
> >> Sent: 14 March 2018 09:16
> >> To: users@cloudstack.apache.org
> >> Subject: RE: KVM HostHA
> >>
> >> I'd need to do some testing, but I suspect that your problem is that you
> >> only have two hosts.  At the point that one host is deemed out of
> service,
> >> you only have one host left.  With only one host, CloudStack will show
> the
> >> cluster as ineligible.
> >>
> >> It is extremely common for any system working as a cluster to require a
> >> minimum starting point of 3 nodes to be able to function.
> >>
> >>
> >> Kind regards,
> >>
> >> Paul Angus
> >>
> >> paul.angus@shapeblue.com
> >> www.shapeblue.com<http://www.shapeblue.com>
> >>
> >>
> >>
> >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >> @shapeblue
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Jon Marshall <jm...@hotmail.co.uk>
> >> Sent: 14 March 2018 08:36
> >> To: users@cloudstack.apache.org
> >> Subject: Re: KVM HostHA
> >>
> >> Hi Paul
> >>
> >>
> >> My testing does indeed end up with the failed host in maintenance mode
> but
> >> the VMs are never migrated. As I posted earlier the management server
> seems
> >> to be saying there is no other host that the VM can be migrated to.
> >>
> >>
> >> Couple of questions if you have the time to respond -
> >>
> >>
> >> 1) this article seems to suggest a reboot or powering off a host will
> end
> >> result in the VMs being migrated and this was on CS v 4.2.1 back in
> 2013 so
> >> does Host HA do something different
> >>
> >>
> >> 2) Whenever one of my two nodes is taken down in testing the active
> >> compute nodes HA status goes from Available to Ineligible. Should this
> >> happen ie. is it going to Ineligible stopping the manager from migrating
> >> the VMs.
> >>
> >>
> >> Apologies for all the questions but I just can't get this to work at the
> >> moment. If I do eventually get it working I will do a write up for
> others
> >> with same issue :)
> >>
> >>
> >> ________________________________
> >> From: Paul Angus <pa...@shapeblue.com>
> >> Sent: 14 March 2018 07:45
> >> To: users@cloudstack.apache.org
> >> Subject: RE: KVM HostHA
> >>
> >> Hi Parth,
> >>
> >> To answer your questions: VM-HA does not restart VMs on an alternate
> >> host if the original host goes down.  The management server (without
> >> host-HA) cannot tell what happened to the host.  It cannot tell if there
> >> was a failure in the agent, loss of connectivity to the management NIC,
> >> or if the host is truly down.  In the first two scenarios, the guest VMs
> >> can still be running perfectly well, and to restart them elsewhere would
> >> be very dangerous.  Therefore, the correct thing to do is nothing but
> >> alert the operator.  These scenarios are what Host-HA was introduced for.
> >>
> >> With regard to STONITH, if no disk activity is detected on the host,
> >> host-HA will try to restart the host (via IPMI). If, after a configurable
> >> number of attempts, the host agent still does not check in, then host-HA
> >> will shut down the host (via IPMI), trigger VM-HA and mark the host as
> >> in-maintenance.
> >>
> >>
> >>
> >> paul.angus@shapeblue.com
> >> www.shapeblue.com<http://www.shapeblue.com>
> >>
> >>
> >>
> >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Parth Patel <pa...@gmail.com>
> >> Sent: 14 March 2018 05:05
> >> To: users@cloudstack.apache.org
> >> Subject: Re: KVM HostHA
> >>
> >> Hi Paul,
> >>
> >> Thanks for the clarification. I currently don't have an ipmi enabled
> >> hardware (in test environment), but it will be beneficial if you can
> help
> >> me clear out some basic concepts of it:
> >> - If HA-enabled VMs are autostarted on another host when current host
> goes
> >> down, what is the need or purpose of HA-host? (other than management
> server
> >> able to remotely control it's power interfaces)
> >> - I understood the "Shoot-the-other-node-in-the-head" (STONITH)
> approach
> >> ACS uses to fence the host, but I couldn't find what mechanism or events
> >> trigger this?
> >>
> >> Thanks and regards,
> >> Parth Patel
> >>
> >> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com>
> wrote:
> >>
> >>> The management server doesn't ping the host through IPMI.   However if
> >>> IPMI is not available, you will not be able to use Host HA, as there
> >>> is no way for CloudStack to 'fence' the host - that is shut it down to
> >>> be sure that a VM cannot start again on that host.
> >>>
> >>> I can explain why that is necessary if you wish.
> >>>
> >>>
> >>> Kind regards,
> >>>
> >>> Paul Angus
> >>>
> >>> paul.angus@shapeblue.com
> >>> www.shapeblue.com<http://www.shapeblue.com>
> >>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Parth Patel <pa...@gmail.com>
> >>> Sent: 13 March 2018 16:57
> >>> To: users@cloudstack.apache.org
> >>> Cc: Jon Marshall <jm...@hotmail.co.uk>
> >>> Subject: Re: KVM HostHA
> >>>
> >>> Hi Jon and Victor,
> >>>
> >>> I think the management server pings your host using ipmi (I really
> >>> don't hope this is the case).
> >>> In my case, I did not have OOBM enabled at all (my hardware didn't
> >>> support
> >>> it)
> >>> I think you could disable OOBM and/or HA-Host and give that a try :)
> >>>
> >>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> >>>
> >>>> Hello Guys,
> >>>>
> >>>> I have tried the following two cases.
> >>>>
> >>>> 1, "echo c > /proc/sysrq-trigger"
> >>>>
> >>>> 2, Pulled the network cable of one of the host
> >>>>
> >>>> In both cases, the following happened.
> >>>>
> >>>> =====
> >>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> >>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> >>>> nodes of to disconnect
> >>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> >>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> >>>> disconnecting with event AgentDisconnected
> >>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> >>>> Alert
> >>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> >>>> for
> >>>> 4 with state Alert
> >>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> >>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> >>>> =====
> >>>>
> >>>> But nothing happened for the  vm's in that node. I have waited for
> >>>> one hour and the VM's in that node has been migrated to the other
> >>>> available hosts. I think the issue is that the management server
> >>>> still thinks that the VM's in that host is running. Please check the
> >>>> following logs
> >>>>
> >>>> =======
> >>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> >>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host
> >>>> 4
> >>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> >>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> >>>> running on host 4 ========
> >>>>
> >>>>
> >>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> >>>>> I tried "echo c > /proc/sysrq-trigger", which stopped me getting into
> >>>>> the server, but it did not stop the server responding to an ipmitool
> >>>>> request from the manager, e.g.
> >>>>>
> >>>>>
> >>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> >>>>>
> >>>>>
> >>>>> from the management server got an answer saying the chassis power was
> >>>>> on, so CS never registered the compute node as down.
> >>>>>
> >>>>>
> >>>>> I am obviously doing something wrong but cannot work it out.
> >>>>>
> >>>>>
> >>>>> The management server has one NIC - 172.16.7.4
> >>>>>
> >>>>>
> >>>>> Each compute node has 3 NICs -
> >>>>>
> >>>>>
> >>>>>                        cnode1          cnode2
> >>>>>
> >>>>> management NIC         172.16.7.5      172.16.7.6
> >>>>> vm NIC                 172.16.6.130    172.16.6.131
> >>>>> storage NIC            172.16.250.4    172.16.250.5
> >>>>> Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
> >>>>>
> >>>>>
> >>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
> >>>>>
> >>>>>
> >>>>>
> >>>>> If I pull the storage NIC, presumably nothing will happen, as the
> >>>>> ipmitool check is running across the management NIC, so I need to pull
> >>>>> both?
> >>>>>
> >>>>> My understanding of host HA was that the management server monitored
> >>>>> the compute nodes using ipmitool and, if it did not get a response
> >>>>> because the host was down, it would fence off that host and move the
> >>>>> VMs to an active compute node.
> >>>>>
> >>>>> This is obviously too simplistic, so could someone explain how it is
> >>>>> meant to work and what it is protecting against?
> >>>>>
> >>>>> ________________________________
> >>>>> From: Paul Angus <pa...@shapeblue.com>
> >>>>> Sent: 13 March 2018 07:01
> >>>>> To: users@cloudstack.apache.org
> >>>>> Subject: RE: KVM HostHA
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> One small note: unplugging the management NIC will only cause an HA
> >>>>> event if the storage is also running over that NIC.
> >>>>>
> >>>>> If the storage is over a separate NIC, then the guest VMs will continue
> >>>>> to run when the mgmt. NIC is unplugged. Host HA will detect the disk
> >>>>> activity and conclude that, as the VMs are still running, there is
> >>>>> nothing it can do other than mark the host as degraded.
> >>>>>
> >>>>>
> >>>>> Kind regards,
> >>>>>
> >>>>> Paul Angus
> >>>>>
> >>>>> paul.angus@shapeblue.com
> >>>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>>>
> >>>>>
> >>>>>
> >>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Parth Patel <pa...@gmail.com>
> >>>>> Sent: 12 March 2018 17:35
> >>>>> To: users@cloudstack.apache.org
> >>>>> Subject: Re: KVM HostHA
> >>>>>
> >>>>>> Hi Jon,
> >>>>>>
> >>>>>> As I said, in my case, making the host HA didn't work, but just
> >>>>>> having an HA VM running on the host and executing (WARNING)
> >>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on the host
> >>>>>> made the management server register it as down and start the VM on
> >>>>>> another host. I know I've suggested this before, but I insist you
> >>>>>> give this a try. Also, you don't need to completely power off the
> >>>>>> machine manually; just unplugging the network cable works
> >>>>>> fine. The cloudstack agent, after losing connection to the management
> >>>>>> server, auto reboots because of the KVM heartbeat check shell script
> >>>>>> mentioned by Rohit Yadav in reply to one of my earlier queries in
> >>>>>> another thread.
> >>>>>>
> >>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> >>> wrote:
> >>>>>> Hi Paul
> >>>>>>
> >>>>>>
> >>>>>> Thanks for the response.
> >>>>>>
> >>>>>>
> >>>>>> I think I am not understanding how it was meant to work then. My
> >>>>>> understanding was that the manager used ipmitool to just keep
> >>>>>> querying the compute nodes as to their status so I assumed it
> >>>>>> didn't matter how you shut the node down, once it was down the
> >>>>>> manager would get no response and mark it as down (which it does).
> >>>>>>
> >>>>>>
> >>>>>> I am in testing mode so I think I will just go and pull the power
> >>>>>> and see what happens :)
> >>>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>>
> >>>>>> Jon
> >>>>>>
> >>>>>>
> >>>>>> ________________________________
> >>>>>> From: Paul Angus <pa...@shapeblue.com>
> >>>>>> Sent: 12 March 2018 15:31
> >>>>>> To: users@cloudstack.apache.org
> >>>>>> Subject: RE: KVM HostHA
> >>>>>> Hi Jon,
> >>>>>>
> >>>>>> I think that what you guys are finding is that a controlled host
> >>>>>> shutdown, which causes the agent to shut down cleanly, is not
> >>>>>> considered an HA event. I wouldn't expect CloudStack to take any
> >>>>>> action if you shut down a host, only if the host (agent) stops
> >>>>>> responding.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Kind regards,
> >>>>>>
> >>>>>> Paul Angus
> >>>>>>
> >>>>>> paul.angus@shapeblue.com
> >>>>>> www.shapeblue.com<http://www.shapeblue.com>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
> >>>>>> Sent: 12 March 2018 15:15
> >>>>>> To: users@cloudstack.apache.org
> >>>>>> Subject: Re: KVM HostHA
> >>>>>>
> >>>>>> I have the same issue here and am not entirely sure what the
> >> behaviour
> >>>>>> should be.
> >>>>>>
> >>>>>>
> >>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi
> >>>> working
> >>>>>> correctly.
> >>>>>>
> >>>>>>
> >>>>>> From the UI under HA -
> >>>>>>
> >>>>>>
> >>>>>> HA Enabled Yes
> >>>>>> HA State Available
> >>>>>> HA Provider kvmhaprovider
> >>>>>>
> >>>>>>
> >>>>>> although interestingly from the "Details" tab it shows -
> >>>>>>
> >>>>>>
> >>>>>> HA enabled No
> >>>>>>
> >>>>>>
> >>>>>> which I assume is a cosmetic issue ?
> >>>>>>
> >>>>>>
> >>>>>> On each compute node I have one HA enabled VM and one non HA enabled
> >>> VM.
> >>>>>>
> >>>>>>
> >>>>>> I power off a compute node and the UI updates the host status and
> >> the
> >>>> VMs
> >>>>>> on that node stop responding but they never fail over to the other
> >>> node.
> >>>>>>
> >>>>>>
> >>>>>> Couple of things I noticed -
> >>>>>>
> >>>>>>
> >>>>>> 1) as soon as i power off the compute node the HA state on the other
> >>>> node
> >>>>>> shows "Ineligible"
> >>>>>>
> >>>>>>
> >>>>>> 2) In the UI the instances all still show as green even though two
> >> of
> >>>> them
> >>>>>> are not available
> >>>>>>
> >>>>>>
> >>>>>> Any help much appreciated
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> ________________________________
> >>>>>> From: victor <vi...@ihnetworks.com>
> >>>>>> Sent: 07 March 2018 17:01
> >>>>>> To: users@cloudstack.apache.org
> >>>>>> Subject: KVM HostHA
> >>>>>>
> >>>>>> Hello Guys,
> >>>>>>
> >>>>>> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> >>>> have
> >>>>>> added. I have also added ipmi successfully (using ipmi driver).
> >>>>>> The hosts are showing like the following.
> >>>>>>
> >>>>>> =======
> >>>>>>
> >>>>>> HA Enabled Yes
> >>>>>> HA State Available
> >>>>>> HA Provider kvmhaprovider
> >>>>>>
> >>>>>> ======
> >>>>>>
> >>>>>> Also the host is showing the following correctly
> >>>>>>
> >>>>>> Resource state --> Enabled
> >>>>>> State --> UP
> >>>>>> Power state --> On
> >>>>>>
> >>>>>> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> >>>>>> working. I have waited for half an hour. But nothing has happened.
> >>> What
> >>>>>> will happen to the VM's in that host, if the host failed to back up.
> >>>>>> There isn't much from logs.
> >>>>>>
> >>>>>> Regards
> >>>>>> Victor
> >>>>>>
> >>>>
> >>>>
> >>>
> >>
> >
> >
> >
> > --
> >
> > Andrija Panić
>
>


-- 

Andrija Panić

Re: KVM HostHA

Posted by Boris Stoyanov <bo...@shapeblue.com>.
Hi Andrija, 

There are two types of checks Host-HA does to determine if a host is healthy.

1. Health checks - pings the host as soon as there are connection issues with the agent

If that fails, 

2. Activity checks - checks whether there are any write operations on the disks of the VMs running on the host. This determines whether the VMs are actually alive and executing processes. Only if no disk operations are executed on the shared storage does it try to recover the host with an IPMI call. If that eventually fails, it migrates the VMs to a healthy host and fences the faulty one. 
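
For readers who prefer code, the order of the two checks above can be sketched roughly as follows. This is only an illustration of the decision flow as described in this thread, not CloudStack's actual implementation: the names HostProbe, Outcome and host_ha_decision are made up for the example, the default of 3 recovery attempts is an assumption (the thread only says the number is configurable), and the "degraded" outcome follows Paul's earlier note about hosts whose VMs are still writing to storage.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    HEALTHY = "healthy"      # health check passed, nothing to do
    DEGRADED = "degraded"    # host unreachable, but its VMs still write to disk
    RECOVERED = "recovered"  # an IPMI restart brought the host back
    FENCED = "fenced"        # recovery failed: host fenced, VMs restarted elsewhere


@dataclass
class HostProbe:
    """Hypothetical snapshot of what the management server observes."""
    health_check_ok: bool               # does the host answer the health check?
    disk_activity_seen: bool            # are its VMs writing to shared storage?
    recovers_on_attempt: Optional[int]  # IPMI restart attempt that works, None = never


def host_ha_decision(probe: HostProbe, max_recovery_attempts: int = 3) -> Outcome:
    # Step 1: health check. A reachable host needs no further action.
    if probe.health_check_ok:
        return Outcome.HEALTHY

    # Step 2: activity check. If the VMs are still writing to shared storage
    # they are alive, so restarting them elsewhere would be dangerous.
    if probe.disk_activity_seen:
        return Outcome.DEGRADED

    # Step 3: no disk activity, so try to recover the host over IPMI,
    # up to a configurable number of attempts (3 is an assumed default).
    for attempt in range(1, max_recovery_attempts + 1):
        if probe.recovers_on_attempt == attempt:
            return Outcome.RECOVERED

    # Step 4: recovery never succeeded, so the host is fenced (powered off)
    # and its HA-enabled VMs can safely be started on a healthy host.
    return Outcome.FENCED
```

The point of the ordering is that fencing is the last resort: it is only reached when the host fails the health check, shows no disk activity, and does not come back after the recovery attempts.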

Hope that explains your case. 

Boris.


boris.stoyanov@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 

> On 14 Mar 2018, at 13:53, Andrija Panic <an...@gmail.com> wrote:
> 
> Hi Paul,
> 
> sorry to jump into the middle of the thread, but I'm just curious about
> the idea behind host-HA and why it behaves the way you explained above:
> 
> 
> Would it make more sense (or not?), when MGMT detects that the agent or
> the host is unreachable (or after e.g. an unsuccessful agent restart,
> etc., to be defined), to actually use IPMI to STONITH the node, thus
> making sure no VMs are running, and then to really start all HA-enabled
> VMs on other hosts?
> 
> I'm just trying to draw a parallel to corosync/pacemaker as a clustering
> suite/services in Linux (RHEL and others), where, when the majority of
> nodes detect that one node is down, a common thing (especially for shared
> storage) is to STONITH that node, make sure it's down, then move the
> "resource" (in our case VMs) to other cluster nodes?
> 
> I see it's actually a much broader setup per
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but again,
> the whole idea (in my head at least...) is that when a host goes down, we
> make sure it's down (avoid VM corruption by doing STONITH on that node)
> and then start HA VMs on other hosts.
> 
> I understand there might be exceptions, as I have right now (4.8): libvirt
> gets stuck (librbd exception or similar) so the agent gets disconnected,
> but VMs are still running fine... (except the DB gets messed up, all NICs
> lose isolation_uri, VRs lose MAC addresses and other IP addresses, etc.)
> 
> 
> Thanks
> Andrija
> 
> 
> 
> 
> On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:
> 
>> That would make sense.
>> 
>> 
>> I have another server being used for something else at the moment so I
>> will add that in and update this thread when I have tested
>> 
>> 
>> Jon
>> 
>> 
>> ________________________________
>> From: Paul Angus <pa...@shapeblue.com>
>> Sent: 14 March 2018 09:16
>> To: users@cloudstack.apache.org
>> Subject: RE: KVM HostHA
>> 
>> I'd need to do some testing, but I suspect that your problem is that you
>> only have two hosts.  At the point that one host is deemed out of service,
>> you only have one host left.  With only one host, CloudStack will show the
>> cluster as ineligible.
>> 
>> It is extremely common for any system working as a cluster to require a
>> minimum starting point of 3 nodes to be able to function.
>> 
>> 
>> Kind regards,
>> 
>> Paul Angus
>> 
>> paul.angus@shapeblue.com
>> www.shapeblue.com<http://www.shapeblue.com>
>> [http://www.shapeblue.com/wp-content/uploads/2017/06/logo.png]<
>> http://www.shapeblue.com/>
>> 
>> Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
>> www.shapeblue.com
>> Rapid deployment framework for Apache CloudStack IaaS Clouds. CSForge is a
>> framework developed by ShapeBlue to deliver the rapid deployment of a
>> standardised ...
>> 
>> 
>> 
>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
>> @shapeblue
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: Jon Marshall <jm...@hotmail.co.uk>
>> Sent: 14 March 2018 08:36
>> To: users@cloudstack.apache.org
>> Subject: Re: KVM HostHA
>> 
>> Hi Paul
>> 
>> 
>> My testing does indeed end up with the failed host in maintenance mode but
>> the VMs are never migrated. As I posted earlier the management server seems
>> to be saying there is no other host that the VM can be migrated to.
>> 
>> 
>> Couple of questions if you have the time to respond -
>> 
>> 
>> 1) this article seems to suggest a reboot or powering off a host will
>> result in the VMs being migrated, and this was on CS v 4.2.1 back in 2013, so
>> does Host HA do something different?
>> 
>> 
>> 2) Whenever one of my two nodes is taken down in testing the active
>> compute nodes HA status goes from Available to Ineligible. Should this
>> happen ie. is it going to Ineligible stopping the manager from migrating
>> the VMs.
>> 
>> 
>> Apologies for all the questions but I just can't get this to work at the
>> moment. If I do eventually get it working I will do a write up for others
>> with same issue :)
>> 
>> 
>> ________________________________
>> From: Paul Angus <pa...@shapeblue.com>
>> Sent: 14 March 2018 07:45
>> To: users@cloudstack.apache.org
>> Subject: RE: KVM HostHA
>> 
>> Hi Parth,
>> 
>> To answer your questions, VM-HA does not restart VMs on an alternate host
>> if the original host goes down.  The management server (without host-HA)
>> cannot tell what happened to the host.  It cannot tell if there was a
>> failure in the agent, loss of connectivity to the management NIC or if the
>> host is truly down.  In the first two scenarios, the guest VMs can still be
>> running perfectly well, and to restart them elsewhere would be very
>> dangerous.  Therefore, the correct thing to do is - nothing but alert the
>> operator.  These scenarios are what Host-HA was introduced for.
>> 
>> With regard to STONITH, if no disk activity is detected on the host, host-HA
>> will try to restart (via IPMI) the host. If, after a configurable number of
>> attempts, the host agent still does not check in, then host-HA will shut
>> down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
>> 
>> 
>> 
>> paul.angus@shapeblue.com
>> www.shapeblue.com<http://www.shapeblue.com>
>> 
>> 
>> 
>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: Parth Patel <pa...@gmail.com>
>> Sent: 14 March 2018 05:05
>> To: users@cloudstack.apache.org
>> Subject: Re: KVM HostHA
>> 
>> Hi Paul,
>> 
>> Thanks for the clarification. I currently don't have an ipmi enabled
>> hardware (in test environment), but it will be beneficial if you can help
>> me clear out some basic concepts of it:
>> - If HA-enabled VMs are autostarted on another host when current host goes
>> down, what is the need or purpose of HA-host? (other than management server
>> able to remotely control it's power interfaces)
>> - I understood the "Shoot-the-other-node-in-the-head" (STONITH) approach
>> ACS uses to fence the host, but I couldn't find what mechanism or events
>> trigger this?
>> 
>> Thanks and regards,
>> Parth Patel
>> 
>> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:
>> 
>>> The management server doesn't ping the host through IPMI.   However if
>>> IPMI is not available, you will not be able to use Host HA, as there
>>> is no way for CloudStack to 'fence' the host - that is shut it down to
>>> be sure that a VM cannot start again on that host.
>>> 
>>> I can explain why that is necessary if you wish.
>>> 
>>> 
>>> Kind regards,
>>> 
>>> Paul Angus
>>> 
>>> paul.angus@shapeblue.com
>>> www.shapeblue.com<http://www.shapeblue.com>
>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>>> 
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: Parth Patel <pa...@gmail.com>
>>> Sent: 13 March 2018 16:57
>>> To: users@cloudstack.apache.org
>>> Cc: Jon Marshall <jm...@hotmail.co.uk>
>>> Subject: Re: KVM HostHA
>>> 
>>> Hi Jon and Victor,
>>> 
>>> I think the management server pings your host using ipmi (I really
>>> hope this is not the case).
>>> In my case, I did not have OOBM enabled at all (my hardware didn't
>>> support it).
>>> I think you could disable OOBM and/or HA-Host and give that a try :)
>>> 
>>> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>>> 
>>>> Hello Guys,
>>>> 
>>>> I have tried the following two cases.
>>>> 
>>>> 1, "echo c > /proc/sysrq-trigger"
>>>> 
>>>> 2, Pulled the network cable of one of the host
>>>> 
>>>> In both cases, the following happened.
>>>> 
>>>> =====
>>>> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
>>>> nodes of to disconnect
>>>> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
>>>> disconnecting with event AgentDisconnected
>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already Alert
>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
>>>> for 4 with state Alert
>>>> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
>>>> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
>>>> =====
>>>> 
>>>> But nothing happened for the VMs on that node. I have waited for
>>>> one hour and the VMs on that node have not been migrated to the other
>>>> available hosts. I think the issue is that the management server
>>>> still thinks that the VMs on that host are running. Please check the
>>>> following logs
>>>> 
>>>> =======
>>>> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
>>>> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
>>>> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
>>>> running on host 4
>>>> ========
>>>> 
>>>> 
>>>> On 03/13/2018 04:20 PM, Jon Marshall wrote:
>>>>> I tried "echo c > /proc/sysrq-trigger" which stopped me getting into the
>>>>> server but it did not stop the server responding to an ipmitool request
>>>>> on the manager eg -
>>>>> 
>>>>> 
>>>>> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
>>>>> 
>>>>> 
>>>>> from the management server got an answer saying the chassis power was on
>>>>> so CS never registered the compute node as down.
>>>>> 
>>>>> 
>>>>> I am obviously doing something wrong but cannot work it out.
>>>>> 
>>>>> 
>>>>> The management server has one NIC - 172.16.7.4
>>>>> 
>>>>> 
>>>>> Each compute node has 3 NICs -
>>>>> 
>>>>> 
>>>>>                         cnode1          cnode2
>>>>> 
>>>>> management NIC          172.16.7.5      172.16.7.6
>>>>> vm NIC                  172.16.6.130    172.16.6.131
>>>>> storage NIC             172.16.250.4    172.16.250.5
>>>>> 
>>>>> Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30
>>>>> 
>>>>> 
>>>>> the dell LOM IPs are the ones used to configure OOBM  in the UI
>>>>> 
>>>>> 
>>>>> 
>>>>> If I pull the storage NIC presumably nothing will happen as the ipmitool
>>>>> check is running across the management NIC so I need to pull both ?
>>>>> 
>>>>> My understanding of host HA was the management server monitored the
>>>>> compute nodes using ipmitool and if it did not get a response because
>>>>> the host was down it would fence off that host and move the VMs to an
>>>>> active compute node.
>>>>> 
>>>>> This is obviously too simplistic so could someone explain how it is
>>>>> meant to work and what it is protecting against ?
>>>>> 
>>>>> ________________________________
>>>>> From: Paul Angus <pa...@shapeblue.com>
>>>>> Sent: 13 March 2018 07:01
>>>>> To: users@cloudstack.apache.org
>>>>> Subject: RE: KVM HostHA
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> One small note, unplugging the management NIC will only cause an HA
>>>>> event if the storage is running over that NIC also.
>>>>> 
>>>>> If the storage is over a separate NIC, then the guest VMs will continue
>>>>> to run when the mgmt. NIC is unplugged. Host HA will detect the disk
>>>>> activity and conclude that there is nothing it can do, as the VMs
>>>>> are still running, other than mark the host as degraded.
>>>>> 
>>>>> 
>>>>> Kind regards,
>>>>> 
>>>>> Paul Angus
>>>>> 
>>>>> paul.angus@shapeblue.com
>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>> 
>>>>> 
>>>>> 
>>>>> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Parth Patel <pa...@gmail.com>
>>>>> Sent: 12 March 2018 17:35
>>>>> To: users@cloudstack.apache.org
>>>>> Subject: Re: KVM HostHA
>>>>> 
>>>>>> Hi Jon,
>>>>>> 
>>>>>> As I said, in my case, making the host HA didn't work but by just
>>>>>> having a HA VM running on host and executing - (WARNING)
>>>>>> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on host, the
>>>>>> management server registered it as down and started the VM on
>>>>>> another host. I know I've suggested this before but I insist you
>>>>>> give this a try. Also, you don't need to completely power off the
>>>>>> machine manually but just plugging out the network cable works
>>>>>> fine. The cloudstack agent after losing connection to management
>>>>>> server auto reboots because of KVM heartbeat check shell script
>>>>>> mentioned by Rohit Yadav to one of my earlier queries in other
>>>>>> thread.
>>>>>> 
>>>>>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
>>> wrote:
>>>>>> Hi Paul
>>>>>> 
>>>>>> 
>>>>>> Thanks for the response.
>>>>>> 
>>>>>> 
>>>>>> I think I am not understanding how it was meant to work then. My
>>>>>> understanding was that the manager used ipmitool to just keep
>>>>>> querying the compute nodes as to their status so I assumed it
>>>>>> didn't matter how you shut the node down, once it was down the
>>>>>> manager would get no response and mark it as down (which it does).
>>>>>> 
>>>>>> 
>>>>>> I am in testing mode so I think I will just go and pull the power
>>>>>> and see what happens :)
>>>>>> 
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> 
>>>>>> Jon
>>>>>> 
>>>>>> 
>>>>>> ________________________________
>>>>>> From: Paul Angus <pa...@shapeblue.com>
>>>>>> Sent: 12 March 2018 15:31
>>>>>> To: users@cloudstack.apache.org
>>>>>> Subject: RE: KVM HostHA
>>>>>> Hi Jon,
>>>>>> 
>>>>>> I think that what you guys are finding is that a controlled host
>>>>>> shutdown, which will cause the agent to shut down cleanly, is not
>>>>>> considered an HA event. I wouldn't expect CloudStack to take any
>>>>>> action if you shut down a host, only if the host (agent) stops
>>>>>> responding.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Kind regards,
>>>>>> 
>>>>>> Paul Angus
>>>>>> 
>>>>>> paul.angus@shapeblue.com
>>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Jon Marshall <jm...@hotmail.co.uk>
>>>>>> Sent: 12 March 2018 15:15
>>>>>> To: users@cloudstack.apache.org
>>>>>> Subject: Re: KVM HostHA
>>>>>> 
>>>>>> I have the same issue here and am not entirely sure what the behaviour
>>>>>> should be.
>>>>>> 
>>>>>> 
>>>>>> I have one manager node and 2 compute nodes running 4.11 with ipmi working
>>>>>> correctly.
>>>>>> 
>>>>>> 
>>>>>> From the UI under HA -
>>>>>> 
>>>>>> 
>>>>>> HA Enabled Yes
>>>>>> HA State Available
>>>>>> HA Provider kvmhaprovider
>>>>>> 
>>>>>> 
>>>>>> although interestingly from the "Details" tab it shows -
>>>>>> 
>>>>>> 
>>>>>> HA enabled No
>>>>>> 
>>>>>> 
>>>>>> which I assume is a cosmetic issue ?
>>>>>> 
>>>>>> 
>>>>>> On each compute node I have one HA enabled VM and one non HA enabled VM.
>>>>>> 
>>>>>> 
>>>>>> I power off a compute node and the UI updates the host status and the VMs
>>>>>> on that node stop responding but they never fail over to the other node.
>>>>>> 
>>>>>> 
>>>>>> Couple of things I noticed -
>>>>>> 
>>>>>> 
>>>>>> 1) as soon as I power off the compute node the HA state on the other node
>>>>>> shows "Ineligible"
>>>>>> 
>>>>>> 
>>>>>> 2) In the UI the instances all still show as green even though two of them
>>>>>> are not available
>>>>>> 
>>>>>> 
>>>>>> Any help much appreciated
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ________________________________
>>>>>> From: victor <vi...@ihnetworks.com>
>>>>>> Sent: 07 March 2018 17:01
>>>>>> To: users@cloudstack.apache.org
>>>>>> Subject: KVM HostHA
>>>>>> 
>>>>>> Hello Guys,
>>>>>> 
>>>>>> I have installed cloudstack 4.11. I have enabled HA for each hosts I have
>>>>>> added. I have also added ipmi successfully (using ipmi driver).
>>>>>> The hosts are showing like the following.
>>>>>> 
>>>>>> =======
>>>>>> 
>>>>>> HA Enabled Yes
>>>>>> HA State Available
>>>>>> HA Provider kvmhaprovider
>>>>>> 
>>>>>> ======
>>>>>> 
>>>>>> Also the host is showing the following correctly
>>>>>> 
>>>>>> Resource state --> Enabled
>>>>>> State --> UP
>>>>>> Power state --> On
>>>>>> 
>>>>>> So I have shutdown one of the hosts to see how the KVM hosts Ha is
>>>>>> working. I have waited for half an hour. But nothing has happened.
>>>>>> What will happen to the VM's in that host, if the host failed to back up.
>>>>>> There isn't much from logs.
>>>>>> 
>>>>>> Regards
>>>>>> Victor
>>>>>> 
>>>> 
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> 
> Andrija Panić


Re: KVM HostHA

Posted by Andrija Panic <an...@gmail.com>.
Hi Paul,

sorry to bump in the middle of the thread, but just curious about the idea
behind host-HA and why it behaves the way you explained above:


Would it make more sense (or not?), when MGMT detects that the agent is
unreachable or the host is unreachable (or after an unsuccessful agent restart,
etc., to be defined), to actually use IPMI to STONITH the node, thus
making sure no VMs are running, and then to really start all HA-enabled VMs on
other hosts?

I'm just trying to draw a parallel to corosync/pacemaker as a clustering
suite/service in Linux (RHEL and others), where, when the majority of nodes
detect that one node is down, a common thing (especially for shared
storage) is to STONITH that node, make sure it's down, then move the "resource"
(in our case VMs) to other cluster nodes?

I see it's actually a much broader setup per
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but again -
the whole idea (in my head at least...) is that when a host goes down, we make
sure it's down (avoid VM corruption by doing STONITH on that node) and then
start HA VMs on other hosts.
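
To spell out the ordering I have in mind, here is a rough Python sketch. It is only an illustration, not pacemaker or CloudStack code; majority_sees_down, plan_failover and the step strings are names made up for this example.

```python
from typing import Dict, List

FENCE_STEP = "fence node via out-of-band power-off"


def majority_sees_down(votes: Dict[str, bool]) -> bool:
    """votes maps an observer node name to True if it sees the target as down."""
    down_votes = sum(1 for sees_down in votes.values() if sees_down)
    return down_votes > len(votes) / 2


def plan_failover(votes: Dict[str, bool], fence_confirmed: bool,
                  ha_vms: List[str]) -> List[str]:
    # Without a majority agreeing that the node is down, do nothing.
    if not majority_sees_down(votes):
        return []
    # Fencing always comes first.
    steps = [FENCE_STEP]
    # VMs are only restarted once the power-off has been confirmed, so the
    # same VM can never run on two hosts against the same shared storage.
    if not fence_confirmed:
        return steps
    steps.extend(f"start {vm} on another host" for vm in ha_vms)
    return steps
```

The point of the sketch is only the order: agree the node is down, fence it, confirm the fence, and only then restart the HA VMs elsewhere.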

I understand there might be exceptions, as I have right now (4.8) - libvirt
gets stuck (librbd exception or similar) so the agent gets disconnected, but
VMs are still running fine... (except the DB gets messed up, all NICs lose
isolation_uri, VRs lose MAC addresses and other IP addresses, etc.)


Thanks
Andrija




On 14 March 2018 at 10:57, Jon Marshall <jm...@hotmail.co.uk> wrote:

> That would make sense.
>
>
> I have another server being used for something else at the moment so I
> will add that in and update this thread when I have tested
>
>
> Jon
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 14 March 2018 09:16
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
>
> I'd need to do some testing, but I suspect that your problem is that you
> only have two hosts.  At the point that one host is deemed out of service,
> you only have one host left.  With only one host, CloudStack will show the
> cluster as ineligible.
>
> It is extremely common for any system working as a cluster to require a
> minimum starting point of 3 nodes to be able to function.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
>
>
>
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Jon Marshall <jm...@hotmail.co.uk>
> Sent: 14 March 2018 08:36
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> Hi Paul
>
>
> My testing does indeed end up with the failed host in maintenance mode but
> the VMs are never migrated. As I posted earlier the management server seems
> to be saying there is no other host that the VM can be migrated to.
>
>
> Couple of questions if you have the time to respond -
>
>
> 1) this article seems to suggest a reboot or powering off a host will
> result in the VMs being migrated, and this was on CS v 4.2.1 back in 2013, so
> does Host HA do something different?
>
>
> 2) Whenever one of my two nodes is taken down in testing, the active
> compute node's HA state goes from Available to Ineligible. Should this
> happen, i.e. is the change to Ineligible what stops the manager from
> migrating the VMs?
>
>
> Apologies for all the questions but I just can't get this to work at the
> moment. If I do eventually get it working I will do a write up for others
> with same issue :)
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 14 March 2018 07:45
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
>
> Hi Parth,
>
> To answer your questions, VM-HA does not restart VMs on an alternate host
> if the original host goes down.  The management server (without host-HA)
> cannot tell what happened to the host.  It cannot tell if there was a
> failure in the agent, loss of connectivity to the management NIC or if the
> host is truly down.  In the first two scenarios, the guest VMs can still be
> running perfectly well, and to restart them elsewhere would be very
> dangerous.  Therefore, the correct thing to do is - nothing but alert the
> operator.  These scenarios are what Host-HA was introduced for.
>
> With regard to STONITH, if no disk activity is detected on the host, host-HA
> will try to restart the host (via IPMI). If, after a configurable number of
> attempts, the host agent still does not check in, then host-HA will shut
> down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
>
>
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
>
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 14 March 2018 05:05
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> Hi Paul,
>
> Thanks for the clarification. I currently don't have IPMI-enabled
> hardware (in my test environment), but it would help if you could clear up
> some basic concepts for me:
> - If HA-enabled VMs are auto-started on another host when the current host
> goes down, what is the purpose of Host HA (other than letting the management
> server remotely control the host's power interface)?
> - I understand the "Shoot-the-other-node-in-the-head" (STONITH) approach
> ACS uses to fence the host, but I couldn't find which mechanism or events
> trigger it.
>
> Thanks and regards,
> Parth Patel
>
> On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:
>
> > The management server doesn't ping the host through IPMI.   However if
> > IPMI is not available, you will not be able to use Host HA, as there
> > is no way for CloudStack to 'fence' the host - that is shut it down to
> > be sure that a VM cannot start again on that host.
> >
> > I can explain why that is necessary if you wish.
> >
> >
> > Kind regards,
> >
> > Paul Angus
> >
> > paul.angus@shapeblue.com
> > www.shapeblue.com<http://www.shapeblue.com>
> > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >
> >
> >
> >
> > -----Original Message-----
> > From: Parth Patel <pa...@gmail.com>
> > Sent: 13 March 2018 16:57
> > To: users@cloudstack.apache.org
> > Cc: Jon Marshall <jm...@hotmail.co.uk>
> > Subject: Re: KVM HostHA
> >
> > Hi Jon and Victor,
> >
> > I think the management server pings your host using IPMI (I really
> > hope this is not the case).
> > In my case, I did not have OOBM enabled at all (my hardware didn't
> > support it).
> > I think you could disable OOBM and/or HA-Host and give that a try :)
> >
> > On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
> >
> > > Hello Guys,
> > >
> > > I have tried the following two cases.
> > >
> > > 1, "echo c > /proc/sysrq-trigger"
> > >
> > > 2, Pulled the network cable of one of the host
> > >
> > > In both cases, the following happened.
> > >
> > > =====
> > > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> > > nodes of to disconnect
> > > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > > disconnecting with event AgentDisconnected
> > > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > > Alert
> > > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > > for
> > > 4 with state Alert
> > > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > > =====
> > >
> > > But nothing happened to the VMs on that node. I waited for one hour
> > > and the VMs on that node had still not been migrated to the other
> > > available hosts. I think the issue is that the management server
> > > still thinks that the VMs on that host are running. Please check the
> > > following logs.
> > >
> > > =======
> > > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host
> > > 4
> > > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > > running on host 4
> > > ========
> > >
> > >
> > > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > > I tried "echo c > /proc/sysrq-trigger", which stopped me getting
> > > > into the server, but it did not stop the server responding to an
> > > > ipmitool request issued on the manager, e.g.
> > > >
> > > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> > > >
> > > > from the management server got an answer saying the chassis power
> > > > was on, so CS never registered the compute node as down.
> > > >
> > > >
> > > > I am obviously doing something wrong but cannot work it out.
> > > >
> > > >
> > > > The management server has one NIC - 172.16.7.4
> > > >
> > > >
> > > > Each compute node has 3 NICs -
> > > >
> > > >                        cnode1          cnode2
> > > >
> > > > management NIC         172.16.7.5      172.16.7.6
> > > > vm NIC                 172.16.6.130    172.16.6.131
> > > > storage NIC            172.16.250.4    172.16.250.5
> > > >
> > > > Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
> > > >
> > > >
> > > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > > >
> > > >
> > > >
> > > > If I pull the storage NIC presumably nothing will happen as the
> > > > ipmitool check is running across the management NIC so I need to
> > > > pull both ?
> > > >
> > > > My understanding of host HA was the management server monitored
> > > > the compute nodes using ipmitool and if it did not get a response
> > > > because the host was down it would fence off that host and move the
> > > > VMs to an active compute node.
> > > >
> > > > This is obviously too simplistic so could someone explain how it
> > > > is meant to work and what it is protecting against ?
> > > >
> > > > ________________________________
> > > > From: Paul Angus <pa...@shapeblue.com>
> > > > Sent: 13 March 2018 07:01
> > > > To: users@cloudstack.apache.org
> > > > Subject: RE: KVM HostHA
> > > >
> > > > Hi all,
> > > >
> > > > One small note: unplugging the management NIC will only cause an
> > > > HA event if the storage is also running over that NIC.
> > > >
> > > > If the storage is over a separate NIC, then the guest VMs will
> > > > continue to run when the mgmt. NIC is unplugged. Host HA will detect
> > > > the disk activity and conclude that, as the VMs are still running,
> > > > there is nothing it can do other than mark the host as degraded.
> > > >
> > > >
> > > > Kind regards,
> > > >
> > > > Paul Angus
> > > >
> > > > paul.angus@shapeblue.com
> > > > www.shapeblue.com<http://www.shapeblue.com>
> > > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > > >
> > > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Parth Patel <pa...@gmail.com>
> > > > Sent: 12 March 2018 17:35
> > > > To: users@cloudstack.apache.org
> > > > Subject: Re: KVM HostHA
> > > >
> > > >> Hi Jon,
> > > >>
> > > > >> As I said, in my case making the host HA-enabled didn't work, but with
> > > > >> just an HA-enabled VM running on the host, executing (WARNING)
> > > > >> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on the host
> > > > >> made the management server register it as down and start the VM on
> > > > >> another host. I know I've suggested this before, but I do recommend
> > > > >> you give it a try. Also, you don't need to power off the machine
> > > > >> manually; just unplugging the network cable works fine. The
> > > > >> CloudStack agent, after losing its connection to the management
> > > > >> server, auto-reboots because of the KVM heartbeat check shell script
> > > > >> that Rohit Yadav mentioned in reply to one of my earlier queries in
> > > > >> another thread.
> > > >>
> > > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> > wrote:
> > > >> Hi Paul
> > > >>
> > > >>
> > > >> Thanks for the response.
> > > >>
> > > >>
> > > >> I think I am not understanding how it was meant to work then. My
> > > >> understanding was that the manager used ipmitool to just keep
> > > >> querying the compute nodes as to their status so I assumed it
> > > >> didn't matter how you shut the node down, once it was down the
> > > >> manager would get no response and mark it as down (which it does).
> > > >>
> > > >>
> > > >> I am in testing mode so I think I will just go and pull the power
> > > >> and see what happens :)
> > > >>
> > > >>
> > > >> Thanks
> > > >>
> > > >>
> > > >> Jon
> > > >>
> > > >>
> > > >> ________________________________
> > > >> From: Paul Angus <pa...@shapeblue.com>
> > > >> Sent: 12 March 2018 15:31
> > > >> To: users@cloudstack.apache.org
> > > >> Subject: RE: KVM HostHA
> > > >> Hi Jon,
> > > >>
> > > > >> I think that what you guys are finding is that a controlled host
> > > > >> shutdown, which causes the agent to shut down cleanly, is not
> > > > >> considered an HA event. I wouldn't expect CloudStack to take any
> > > > >> action if you shut down a host, only if the host (agent) stops
> > > > >> responding.
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> Kind regards,
> > > >>
> > > >> Paul Angus
> > > >>
> > > >> paul.angus@shapeblue.com
> > > >> www.shapeblue.com<http://www.shapeblue.com>
> > > >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> -----Original Message-----
> > > >> From: Jon Marshall <jm...@hotmail.co.uk>
> > > >> Sent: 12 March 2018 15:15
> > > >> To: users@cloudstack.apache.org
> > > >> Subject: Re: KVM HostHA
> > > >>
> > > > >> I have the same issue here and am not entirely sure what the
> > > > >> behaviour should be.
> > > > >>
> > > > >>
> > > > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > > > >> working correctly.
> > > >>
> > > >>
> > > >>  From the UI under HA -
> > > >>
> > > >>
> > > >> HA Enabled Yes
> > > >> HA State Available
> > > >> HA Provider kvmhaprovider
> > > >>
> > > >>
> > > >> although interestingly from the "Details" tab it shows -
> > > >>
> > > >>
> > > >> HA enabled No
> > > >>
> > > >>
> > > >> which I assume is a cosmetic issue ?
> > > >>
> > > >>
> > > > >> On each compute node I have one HA enabled VM and one non HA enabled
> > > > >> VM.
> > > > >>
> > > > >>
> > > > >> I power off a compute node and the UI updates the host status and
> > > > >> the VMs on that node stop responding but they never fail over to the
> > > > >> other node.
> > > > >>
> > > > >>
> > > > >> Couple of things I noticed -
> > > > >>
> > > > >>
> > > > >> 1) as soon as I power off the compute node the HA state on the other
> > > > >> node shows "Ineligible"
> > > > >>
> > > > >>
> > > > >> 2) In the UI the instances all still show as green even though two
> > > > >> of them are not available
> > > >>
> > > >>
> > > >> Any help much appreciated
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> ________________________________
> > > >> From: victor <vi...@ihnetworks.com>
> > > >> Sent: 07 March 2018 17:01
> > > >> To: users@cloudstack.apache.org
> > > >> Subject: KVM HostHA
> > > >>
> > > >> Hello Guys,
> > > >>
> > > > >> I have installed CloudStack 4.11. I have enabled HA for each host I
> > > > >> have added. I have also added IPMI successfully (using the ipmi driver).
> > > > >> The hosts show the following.
> > > >>
> > > >> =======
> > > >>
> > > >> HA Enabled Yes
> > > >> HA State Available
> > > >> HA Provider kvmhaprovider
> > > >>
> > > >> ======
> > > >>
> > > >> Also the host is showing the following correctly
> > > >>
> > > >> Resource state --> Enabled
> > > >> State --> UP
> > > >> Power state --> On
> > > >>
> > > > >> So I have shut down one of the hosts to see how KVM host HA is
> > > > >> working. I have waited for half an hour, but nothing has happened.
> > > > >> What will happen to the VMs on that host if the host fails to come
> > > > >> back up? There isn't much in the logs.
> > > >>
> > > >> Regards
> > > >> Victor
> > > >>
> > >
> > >
> >
>



-- 

Andrija Panić

Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
That would make sense.


I have another server being used for something else at the moment, so I will add that in and update this thread when I have tested.


Jon


________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 14 March 2018 09:16
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

I'd need to do some testing, but I suspect that your problem is that you only have two hosts.  At the point that one host is deemed out of service, you only have one host left.  With only one host, CloudStack will show the cluster as ineligible.

It is extremely common for any system working as a cluster to require a minimum starting point of 3 nodes to be able to function.


Kind regards,

Paul Angus

paul.angus@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>

53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue




-----Original Message-----
From: Jon Marshall <jm...@hotmail.co.uk>
Sent: 14 March 2018 08:36
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul


My testing does indeed end up with the failed host in maintenance mode but the VMs are never migrated. As I posted earlier the management server seems to be saying there is no other host that the VM can be migrated to.


Couple of questions if you have the time to respond -


1) This article seems to suggest that rebooting or powering off a host will result in the VMs being migrated, and that was on CS v4.2.1 back in 2013, so does Host HA do something different?


2) Whenever one of my two nodes is taken down in testing, the active compute node's HA state goes from Available to Ineligible. Should this happen, i.e. is the change to Ineligible what stops the manager from migrating the VMs?


Apologies for all the questions but I just can't get this to work at the moment. If I do eventually get it working I will do a write up for others with same issue :)


________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 14 March 2018 07:45
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi Parth,

To answer your questions, VM-HA does not restart VMs on an alternate host if the original host goes down.  The management server (without host-HA) cannot tell what happened to the host.  It cannot tell whether there was a failure in the agent, a loss of connectivity to the management NIC, or whether the host is truly down.  In the first two scenarios, the guest VMs can still be running perfectly well, and to restart them elsewhere would be very dangerous (two copies of the same VM could end up writing to the same disk).  Therefore, the correct thing to do is nothing except alert the operator.  These scenarios are what Host-HA was introduced for.

With regard to STONITH, if no disk activity is detected on the host, host-HA will try to restart the host (via IPMI). If, after a configurable number of attempts, the host agent still does not check in, then host-HA will shut down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
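
To make the flow above easier to follow, here is a minimal Python sketch of the decision as described in this thread. It is only an illustration, not CloudStack's actual code: the names HostObservation, Action and decide are made up, and the default of 3 restart attempts is an arbitrary placeholder for the "configurable number of attempts". The "degraded" outcome comes from the earlier note in this thread that a host with visible disk activity is only marked as degraded.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                            # agent is checking in; nothing to do
    MARK_DEGRADED = "mark_degraded"          # VMs still writing to storage; alert only
    RESTART_HOST = "restart_host"            # try an out-of-band (IPMI) restart
    FENCE_AND_RECOVER = "fence_and_recover"  # power off, trigger VM-HA, maintenance


@dataclass
class HostObservation:
    agent_checking_in: bool   # is the host agent responding to the management server?
    disk_activity_seen: bool  # is VM disk activity still visible on the storage?
    restart_attempts: int     # IPMI restarts already attempted for this incident


def decide(obs: HostObservation, max_restart_attempts: int = 3) -> Action:
    """Pick the next step for one host (hypothetical helper, assumed threshold)."""
    if obs.agent_checking_in:
        return Action.NONE
    if obs.disk_activity_seen:
        # The VMs are probably still running, so restarting them elsewhere is unsafe.
        return Action.MARK_DEGRADED
    if obs.restart_attempts < max_restart_attempts:
        return Action.RESTART_HOST
    # Restarts did not bring the agent back: fence the host, then let VM-HA act.
    return Action.FENCE_AND_RECOVER


if __name__ == "__main__":
    # Host with no agent and no disk activity, before any restart has been tried.
    print(decide(HostObservation(False, False, 0)))
```

The ordering of the checks is the point of the sketch: fencing is only reached once the agent is silent, no disk activity is seen, and the restart attempts are used up, which is why a clean shutdown or a pulled management cable alone does not move the VMs.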



paul.angus@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>

53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue




-----Original Message-----
From: Parth Patel <pa...@gmail.com>
Sent: 14 March 2018 05:05
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul,

Thanks for the clarification. I currently don't have IPMI-enabled hardware (in my test environment), but it would help if you could clear up some basic concepts for me:
- If HA-enabled VMs are auto-started on another host when the current host goes down, what is the purpose of Host HA (other than letting the management server remotely control the host's power interface)?
- I understand the "Shoot-the-other-node-in-the-head" (STONITH) approach ACS uses to fence the host, but I couldn't find which mechanism or events trigger it.

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:

> The management server doesn't ping the host through IPMI.   However if
> IPMI is not available, you will not be able to use Host HA, as there
> is no way for CloudStack to 'fence' the host - that is shut it down to
> be sure that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall <jm...@hotmail.co.uk>
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using IPMI (I really
> hope this is not the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't
> support it).
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the host
> >
> > In both cases, the following happened.
> >
> > =====
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other
> > nodes of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > for
> > 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > =====
> >
> > But nothing happened for the  vm's in that node. I have waited for
> > one hour and the VM's in that node has been migrated to the other
> > available hosts. I think the issue is that the management server
> > still thinks that the VM's in that host is running. Please check the
> > following logs
> >
> > =======
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host
> > 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > running on host 4
> > ========
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting
> > > into the
> > server but it did not stop the server responding to an ipmitool
> > request on the manager eg -
> > >
> > >
> > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> status"
> > >
> > >
> > > from the management server got an answer saying the chassis power
> > > was on
> > so CS never registered the compute node as down.
> > >
> > >
> > > I am obviously doing something wrong but cannot work it out.
> > >
> > >
> > > The management server has one NIC - 172.16.7.4
> > >
> > >
> > > > Each compute node has 3 NICs -
> > > >
> > > >                         cnode1          cnode2
> > > > management NIC          172.16.7.5      172.16.7.6
> > > > vm NIC                  172.16.6.130    172.16.6.131
> > > > storage                 172.16.250.4    172.16.250.5
> > > > Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30
> > >
> > >
> > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >
> > >
> > >
> > > If I pull the storage NIC presumably nothing will happen as the
> > > ipmitool
> > check is running across the management NIC so I need to pull both ?
> > >
> > > My understanding of host HA was the management server monitored
> > > the
> > compute nodes using ipmitool and if it did not get a response
> > because the host was down it would fence off that host and move the
> > VMs to an active compute node.
> > >
> > > This is obviously too simplistic so could someone explain how it
> > > is
> > meant to work and what it is protecting against ?
> > >
> > > ________________________________
> > > From: Paul Angus <pa...@shapeblue.com>
> > > Sent: 13 March 2018 07:01
> > > To: users@cloudstack.apache.org
> > > Subject: RE: KVM HostHA
> > >
> > > Hi all,
> > >
> > > One small note, unplugging the management NIC will only cause an
> > > HA
> > event if the storage is running over that NIC also.
> > >
> > > > If the storage is over a separate NIC, then the guest VMs will
> > > > continue to run when the mgmt. NIC is unplugged. Host HA will
> > > > detect the disk activity and conclude that, as the VMs are still
> > > > running, there is nothing it can do other than mark the host as
> > > > degraded.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul Angus
> > >
> > > paul.angus@shapeblue.com
> > > www.shapeblue.com<http://www.shapeblue.com>
> > >
> > >
> > >
> > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Parth Patel <pa...@gmail.com>
> > > Sent: 12 March 2018 17:35
> > > To: users@cloudstack.apache.org
> > > Subject: Re: KVM HostHA
> > >
> > >> Hi Jon,
> > >>
> > >> As I said, in my case, making the host HA didn't work but by just
> > >> having a HA VM running on host and executing - (WARNING) "echo c
> > >> > /proc/sysrq-trigger" to simulate a kernel crash on host, the
> > >> management server registered it as down and started the VM on
> > >> another host. I know I've suggested this before but I insist you
> > >> give this a try. Also, you don't need to completely power off the
> > >> machine manually but just plugging out the network cable works
> > >> fine. The cloudstack agent after losing connection to management
> > >> server auto reboots because of KVM heartbeat check shell script
> > >> mentioned by Rohit Yadav to one of my earlier queries in other thread.
> > >>
> > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> wrote:
> > >> Hi Paul
> > >>
> > >>
> > >> Thanks for the response.
> > >>
> > >>
> > >> I think I am not understanding how it was meant to work then. My
> > >> understanding was that the manager used ipmitool to just keep
> > >> querying the compute nodes as to their status so I assumed it
> > >> didn't matter how you shut the node down, once it was down the
> > >> manager would get no response and mark it as down (which it does).
> > >>
> > >>
> > >> I am in testing mode so I think I will just go and pull the power
> > >> and see what happens :)
> > >>
> > >>
> > >> Thanks
> > >>
> > >>
> > >> Jon
> > >>
> > >>
> > >> ________________________________
> > >> From: Paul Angus <pa...@shapeblue.com>
> > >> Sent: 12 March 2018 15:31
> > >> To: users@cloudstack.apache.org
> > >> Subject: RE: KVM HostHA
> > >> Hi Jon,
> > >>
> > >> I think that what you guys are finding, is that a controlled host
> > >> shutdown, which will cause the agent to shutdown cleanly; Is not
> > >> considered an HA event. I wouldn't expect CloudStack to take any
> > >> action if you shut down a host, only if the host (agent) stops
> > responding.
> > >>
> > >>
> > >>
> > >>
> > >> Kind regards,
> > >>
> > >> Paul Angus
> > >>
> > >> paul.angus@shapeblue.com
> > >> www.shapeblue.com<http://www.shapeblue.com>
> > >>
> > >>
> > >>
> > >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> > >>
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jon Marshall <jm...@hotmail.co.uk>
> > >> Sent: 12 March 2018 15:15
> > >> To: users@cloudstack.apache.org
> > >> Subject: Re: KVM HostHA
> > >>
> > >> I have the same issue here and am not entirely sure what the behaviour
> > >> should be.
> > >>
> > >>
> > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > working
> > >> correctly.
> > >>
> > >>
> > >>  From the UI under HA -
> > >>
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >>
> > >> although interestingly from the "Details" tab it shows -
> > >>
> > >>
> > >> HA enabled No
> > >>
> > >>
> > >> which I assume is a cosmetic issue ?
> > >>
> > >>
> > >> On each compute node I have one HA enabled VM and one non HA enabled
> VM.
> > >>
> > >>
> > >> I power off a compute node and the UI updates the host status and the
> > VMs
> > >> on that node stop responding but they never fail over to the other
> node.
> > >>
> > >>
> > >> Couple of things I noticed -
> > >>
> > >>
> > >> 1) as soon as i power off the compute node the HA state on the other
> > node
> > >> shows "Ineligible"
> > >>
> > >>
> > >> 2) In the UI the instances all still show as green even though two of
> > them
> > >> are not available
> > >>
> > >>
> > >> Any help much appreciated
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: victor <vi...@ihnetworks.com>
> > >> Sent: 07 March 2018 17:01
> > >> To: users@cloudstack.apache.org
> > >> Subject: KVM HostHA
> > >>
> > >> Hello Guys,
> > >>
> > >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> > have
> > >> added. I have also added ipmi successfully (using ipmi driver).
> > >> The hosts are showing like the following.
> > >>
> > >> =======
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> ======
> > >>
> > >> Also the host is showing the following correctly
> > >>
> > >> Resource state --> Enabled
> > >> State --> UP
> > >> Power state --> On
> > >>
> > >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> > >> working. I have waited for half an hour. But nothing has happened.
> What
> > >> will happen to the VM's in that host, if the host failed to back up.
> > >> There isn't much from logs.
> > >>
> > >> Regards
> > >> Victor
> > >>
> >
> >
>

RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
I'd need to do some testing, but I suspect that your problem is that you only have two hosts.  At the point that one host is deemed out of service, you only have one host left.  With only one host, CloudStack will show the cluster as ineligible.

It is extremely common for any system working as a cluster to require a minimum starting point of 3 nodes to be able to function.
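As a generic illustration of the three-node point (a sketch of ordinary majority-quorum arithmetic, not CloudStack's actual eligibility logic, which this thread does not describe), the function name and parameters below are hypothetical:

```python
def has_majority(total_nodes, failed_nodes):
    """Return True if the surviving nodes form a strict majority.

    Generic majority-quorum arithmetic; an assumption used for
    illustration only, not taken from CloudStack.
    """
    surviving = total_nodes - failed_nodes
    return surviving > total_nodes // 2


# Two hosts, one failed: the single survivor is not a strict majority.
print(has_majority(2, 1))
# Three hosts, one failed: the two survivors are a strict majority.
print(has_majority(3, 1))
```

With two hosts, losing one leaves no majority, which is why clustered systems commonly start at three nodes.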


Kind regards,

Paul Angus

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-----Original Message-----
From: Jon Marshall <jm...@hotmail.co.uk> 
Sent: 14 March 2018 08:36
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul


My testing does indeed end up with the failed host in maintenance mode but the VMs are never migrated. As I posted earlier the management server seems to be saying there is no other host that the VM can be migrated to.


Couple of questions if you have the time to respond -


1) this article seems to suggest that rebooting or powering off a host will result in the VMs being migrated, and this was on CS v4.2.1 back in 2013, so does Host HA do something different?


2) Whenever one of my two nodes is taken down in testing, the active compute node's HA status goes from Available to Ineligible. Should this happen, i.e. is the change to Ineligible stopping the manager from migrating the VMs?


Apologies for all the questions but I just can't get this to work at the moment. If I do eventually get it working I will do a write up for others with same issue :)


________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 14 March 2018 07:45
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi Parth,

To answer your questions, VM-HA does not restart VMs on an alternate host if the original host goes down.  The management server (without host-HA) cannot tell what happened to the host.  It cannot tell if there was a failure in the agent, loss of connectivity to the management NIC or if the host is truly down.  In the first two scenarios, the guest VMs can still be running perfectly well, and to restart them elsewhere would be very dangerous.  Therefore, the correct thing to do is nothing but alert the operator.  These scenarios are what Host-HA was introduced for.

With respect to STONITH, if no disk activity is detected on the host, host-HA will try to restart the host (via IPMI). If, after a configurable number of attempts, the host agent still does not check in, then host-HA will shut down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
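
The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CloudStack's implementation: the names `HostProbe`, `Outcome`, `host_ha_decision` and the action strings are all hypothetical, and the "degraded" outcome is taken from the earlier note in this thread about hosts whose VMs still show disk activity.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    HEALTHY = "healthy"        # agent is responding, nothing to do
    DEGRADED = "degraded"      # agent silent but VMs still show disk activity
    RECOVERED = "recovered"    # agent checked in after an IPMI restart
    FENCED = "fenced"          # host powered off, VM-HA triggered


@dataclass
class HostProbe:
    """Hypothetical snapshot of what the management server can observe."""
    agent_responding: bool
    disk_activity_seen: bool
    # One entry per IPMI restart attempt: True if the agent checked in afterwards.
    restart_results: list = field(default_factory=list)


def host_ha_decision(probe, max_restart_attempts):
    """Return (outcome, actions) following the steps described in the message."""
    if probe.agent_responding:
        return Outcome.HEALTHY, []
    if probe.disk_activity_seen:
        # VMs are still running; restarting them elsewhere would be dangerous.
        return Outcome.DEGRADED, []
    actions = []
    for attempt in range(max_restart_attempts):
        actions.append("ipmi_restart")
        if attempt < len(probe.restart_results) and probe.restart_results[attempt]:
            return Outcome.RECOVERED, actions
    # Agent never checked in: fence the host, then let VM-HA restart the VMs.
    actions += ["ipmi_power_off", "trigger_vm_ha", "mark_in_maintenance"]
    return Outcome.FENCED, actions
```

The ordering matters in this sketch: the host is powered off before VM-HA is triggered, so a VM cannot end up running on two hosts at once.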



paul.angus@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>
53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue




-----Original Message-----
From: Parth Patel <pa...@gmail.com>
Sent: 14 March 2018 05:05
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul,

Thanks for the clarification. I currently don't have IPMI-enabled hardware (in my test environment), but it would be helpful if you could clear up some basic concepts for me:
- If HA-enabled VMs are autostarted on another host when the current host goes down, what is the need or purpose of HA-host? (other than the management server being able to remotely control its power interfaces)
- I understood the "Shoot-the-other-node-in-the-head" (STONITH) approach ACS uses to fence the host, but I couldn't find what mechanism or events trigger it.

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:

> The management server doesn't ping the host through IPMI.   However if
> IPMI is not available, you will not be able to use Host HA, as there 
> is no way for CloudStack to 'fence' the host - that is shut it down to 
> be sure that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall <jm...@hotmail.co.uk>
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using ipmi (I really 
> don't hope this is the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't 
> support
> it)
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the host
> >
> > In both cases, the following happened.
> >
> > =====
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other 
> > nodes of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is 
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already 
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link 
> > for
> > 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4 
> > =====
> >
> > But nothing happened for the  vm's in that node. I have waited for 
> > one hour and the VM's in that node has been migrated to the other 
> > available hosts. I think the issue is that the management server 
> > still thinks that the VM's in that host is running. Please check the 
> > following logs
> >
> > =======
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 
> > 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not 
> > running on host 4
> > ========
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting 
> > > into the
> > server but it did not stop the server responding to an ipmitool 
> > request on the manager eg -
> > >
> > >
> > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> status"
> > >
> > >
> > > from the management server got an answer saying the chassis power 
> > > was on
> > so CS never registered the compute node as down.
> > >
> > >
> > > I am obviously doing something wrong but cannot work it out.
> > >
> > >
> > > The management server has one NIC - 172.16.7.4
> > >
> > >
> > > > Each compute node has 3 NICs -
> > > >
> > > >                         cnode1          cnode2
> > > > management NIC          172.16.7.5      172.16.7.6
> > > > vm NIC                  172.16.6.130    172.16.6.131
> > > > storage                 172.16.250.4    172.16.250.5
> > > > Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30
> > >
> > >
> > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >
> > >
> > >
> > > If I pull the storage NIC presumably nothing will happen as the 
> > > ipmitool
> > check is running across the management NIC so I need to pull both ?
> > >
> > > My understanding of host HA was the management server monitored 
> > > the
> > compute nodes using ipmitool and if it did not get a response 
> > because the host was down it would fence off that host and move the 
> > VMs to an active compute node.
> > >
> > > This is obviously too simplistic so could someone explain how it 
> > > is
> > meant to work and what it is protecting against ?
> > >
> > > ________________________________
> > > From: Paul Angus <pa...@shapeblue.com>
> > > Sent: 13 March 2018 07:01
> > > To: users@cloudstack.apache.org
> > > Subject: RE: KVM HostHA
> > >
> > > Hi all,
> > >
> > > One small note, unplugging the management NIC will only cause an 
> > > HA
> > event if the storage is running over that NIC also.
> > >
> > > > If the storage is over a separate NIC, then the guest VMs will
> > > > continue to run when the mgmt. NIC is unplugged. Host HA will
> > > > detect the disk activity and conclude that, as the VMs are still
> > > > running, there is nothing it can do other than mark the host as
> > > > degraded.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul Angus
> > >
> > > paul.angus@shapeblue.com
> > > www.shapeblue.com<http://www.shapeblue.com>
> > >
> > >
> > >
> > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Parth Patel <pa...@gmail.com>
> > > Sent: 12 March 2018 17:35
> > > To: users@cloudstack.apache.org
> > > Subject: Re: KVM HostHA
> > >
> > >> Hi Jon,
> > >>
> > >> As I said, in my case, making the host HA didn't work but by just 
> > >> having a HA VM running on host and executing - (WARNING) "echo c 
> > >> > /proc/sysrq-trigger" to simulate a kernel crash on host, the 
> > >> management server registered it as down and started the VM on 
> > >> another host. I know I've suggested this before but I insist you 
> > >> give this a try. Also, you don't need to completely power off the 
> > >> machine manually but just plugging out the network cable works 
> > >> fine. The cloudstack agent after losing connection to management 
> > >> server auto reboots because of KVM heartbeat check shell script 
> > >> mentioned by Rohit Yadav to one of my earlier queries in other thread.
> > >>
> > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> wrote:
> > >> Hi Paul
> > >>
> > >>
> > >> Thanks for the response.
> > >>
> > >>
> > >> I think I am not understanding how it was meant to work then. My 
> > >> understanding was that the manager used ipmitool to just keep 
> > >> querying the compute nodes as to their status so I assumed it 
> > >> didn't matter how you shut the node down, once it was down the 
> > >> manager would get no response and mark it as down (which it does).
> > >>
> > >>
> > >> I am in testing mode so I think I will just go and pull the power 
> > >> and see what happens :)
> > >>
> > >>
> > >> Thanks
> > >>
> > >>
> > >> Jon
> > >>
> > >>
> > >> ________________________________
> > >> From: Paul Angus <pa...@shapeblue.com>
> > >> Sent: 12 March 2018 15:31
> > >> To: users@cloudstack.apache.org
> > >> Subject: RE: KVM HostHA
> > >> Hi Jon,
> > >>
> > >> I think that what you guys are finding is that a controlled host
> > >> shutdown, which will cause the agent to shut down cleanly, is not
> > >> considered an HA event. I wouldn't expect CloudStack to take any
> > >> action if you shut down a host, only if the host (agent) stops
> > >> responding.
> > >>
> > >>
> > >>
> > >>
> > >> Kind regards,
> > >>
> > >> Paul Angus
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jon Marshall <jm...@hotmail.co.uk>
> > >> Sent: 12 March 2018 15:15
> > >> To: users@cloudstack.apache.org
> > >> Subject: Re: KVM HostHA
> > >>
> > >> I have the same issue here and am not entirely sure what the behaviour
> > >> should be.
> > >>
> > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > >> working correctly.
> > >>
> > >> From the UI under HA -
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> although interestingly from the "Details" tab it shows -
> > >>
> > >> HA enabled No
> > >>
> > >> which I assume is a cosmetic issue?
> > >>
> > >> On each compute node I have one HA enabled VM and one non HA enabled
> > >> VM.
> > >>
> > >> I power off a compute node and the UI updates the host status and the
> > >> VMs on that node stop responding but they never fail over to the other
> > >> node.
> > >>
> > >> Couple of things I noticed -
> > >>
> > >> 1) as soon as I power off the compute node the HA state on the other
> > >> node shows "Ineligible"
> > >>
> > >> 2) In the UI the instances all still show as green even though two of
> > >> them are not available
> > >>
> > >> Any help much appreciated
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: victor <vi...@ihnetworks.com>
> > >> Sent: 07 March 2018 17:01
> > >> To: users@cloudstack.apache.org
> > >> Subject: KVM HostHA
> > >>
> > >> Hello Guys,
> > >>
> > >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> > >> have added. I have also added ipmi successfully (using ipmi driver).
> > >> The hosts are showing like the following.
> > >>
> > >> =======
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> ======
> > >>
> > >> Also the host is showing the following correctly
> > >>
> > >> Resource state --> Enabled
> > >> State --> UP
> > >> Power state --> On
> > >>
> > >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> > >> working. I have waited for half an hour. But nothing has happened.
> > >> What will happen to the VM's in that host, if the host failed to back up.
> > >> There isn't much from logs.
> > >>
> > >> Regards
> > >> Victor
> > >>
> >
> >
>

Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Paul


My testing does indeed end up with the failed host in maintenance mode, but the VMs are never migrated. As I posted earlier, the management server seems to be saying there is no other host that the VM can be migrated to.


Couple of questions if you have the time to respond -


1) This article seems to suggest that a reboot or power-off of a host will result in the VMs being migrated, and this was on CS v4.2.1 back in 2013, so does Host HA do something different?


2) Whenever one of my two nodes is taken down in testing, the active compute node's HA state goes from Available to Ineligible. Should this happen, i.e. is going to Ineligible what stops the manager from migrating the VMs?


Apologies for all the questions but I just can't get this to work at the moment. If I do eventually get it working I will do a write-up for others with the same issue :)


________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 14 March 2018 07:45
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi Parth,

To answer your questions, VM-HA does not restart VMs on an alternate host if the original host goes down.  The management server (without host-HA) cannot tell what happened to the host: it cannot tell whether there was a failure in the agent, a loss of connectivity to the management NIC, or whether the host is truly down.  In the first two scenarios, the guest VMs can still be running perfectly well, and restarting them elsewhere would be very dangerous.  Therefore, the correct thing to do is nothing other than alert the operator.  These scenarios are what Host-HA was introduced for.
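
A rough Python sketch of that reasoning may help. It is illustrative only and is not CloudStack code: every name in it (Cause, VMS_MAY_STILL_RUN, the two functions and the returned strings) is hypothetical. It models the point that all three causes look the same from the management server, so the only safe action is to alert the operator.

```python
from enum import Enum


class Cause(Enum):
    """The three scenarios from the paragraph above (names are hypothetical)."""
    AGENT_FAILURE = "failure in the agent"
    MGMT_NIC_LOST = "loss of connectivity to the management NIC"
    HOST_DOWN = "host is truly down"


# Whether guest VMs may still be running on the host under each cause.
VMS_MAY_STILL_RUN = {
    Cause.AGENT_FAILURE: True,
    Cause.MGMT_NIC_LOST: True,
    Cause.HOST_DOWN: False,
}


def observed_by_management_server(cause: Cause) -> str:
    # Without host-HA, every cause produces the same observation.
    return "agent disconnected"


def action_without_host_ha(observation: str) -> str:
    # Collect every cause that is consistent with what was observed.
    possible = [c for c in Cause
                if observed_by_management_server(c) == observation]
    # If any possible cause leaves the VMs running, restarting them elsewhere
    # could start a second copy of a live VM, so do nothing but alert.
    if any(VMS_MAY_STILL_RUN[c] for c in possible):
        return "alert operator"
    return "restart VMs on another host"


print(action_without_host_ha("agent disconnected"))
```

The design point is that the decision is driven by the worst case among the indistinguishable causes, not by the most likely one.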

With regard to STONITH: if no disk activity is detected on the host, host-HA will try to restart the host (via IPMI). If, after a configurable number of attempts, the host agent still does not check in, then host-HA will shut down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
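
The escalation described above can be sketched in Python as follows. This is only a sketch under stated assumptions, not the actual host-HA implementation: the function name, its parameters, the default of 3 attempts and the action strings are all hypothetical. The "degraded" outcome for the disk-activity case is taken from an earlier message in this thread.

```python
from typing import List, Optional


def host_ha_steps(disk_activity: bool,
                  agent_checks_in_on_attempt: Optional[int],
                  max_restart_attempts: int = 3) -> List[str]:
    """Return the ordered actions host-HA takes in this sketch.

    agent_checks_in_on_attempt: the restart attempt (1-based) after which
    the agent checks in again, or None if it never does.
    max_restart_attempts: stands in for the "configurable number of
    attempts"; the default of 3 is an arbitrary placeholder.
    """
    if disk_activity:
        # VMs are still writing to storage, so fencing would be unsafe.
        return ["mark host degraded"]

    steps: List[str] = []
    for attempt in range(1, max_restart_attempts + 1):
        steps.append("restart host via IPMI")
        if agent_checks_in_on_attempt == attempt:
            steps.append("host recovered")
            return steps

    # The agent never checked in: fence the host, then recover the VMs.
    steps += ["shut down host via IPMI",
              "trigger VM-HA",
              "mark host in-maintenance"]
    return steps


for step in host_ha_steps(disk_activity=False,
                          agent_checks_in_on_attempt=None,
                          max_restart_attempts=2):
    print(step)
```

Note the ordering in the last branch: the host is powered off before VM-HA is triggered, so a VM cannot come back on the fenced host while it is being started elsewhere.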





-----Original Message-----
From: Parth Patel <pa...@gmail.com>
Sent: 14 March 2018 05:05
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul,

Thanks for the clarification. I currently don't have IPMI-enabled hardware (in my test environment), but it would be helpful if you could clear up some basic concepts for me:
- If HA-enabled VMs are auto-started on another host when the current host goes down, what is the need or purpose of HA-host (other than the management server being able to remotely control its power interfaces)?
- I understood the "Shoot-the-other-node-in-the-head" (STONITH) approach ACS uses to fence the host, but I couldn't find what mechanism or events trigger it.

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:

> The management server doesn't ping the host through IPMI.   However if
> IPMI is not available, you will not be able to use Host HA, as there
> is no way for CloudStack to 'fence' the host - that is shut it down to
> be sure that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall <jm...@hotmail.co.uk>
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using ipmi (I really hope
> this is not the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't
> support it).
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the host
> >
> > In both cases, the following happened.
> >
> > =====
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes
> > of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > for 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > =====
> >
> > But nothing happened for the  vm's in that node. I have waited for one
> > hour and the VM's in that node has been migrated to the other
> > available hosts. I think the issue is that the management server still
> > thinks that the VM's in that host is running. Please check the
> > following logs
> >
> > =======
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > running on host 4
> > ========
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into
> > > the server but it did not stop the server responding to an ipmitool
> > > request on the manager, e.g.
> > >
> > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> > >
> > > from the management server got an answer saying the chassis power was
> > > on, so CS never registered the compute node as down.
> > >
> > >
> > > I am obviously doing something wrong but cannot work it out.
> > >
> > >
> > > The management server has one NIC - 172.16.7.4
> > >
> > >
> > > Each compute node has 3 NICs -
> > >
> > >
> > >                          cnode1          cnode2
> > >
> > > management NIC           172.16.7.5      172.16.7.6
> > > vm NIC                   172.16.6.130    172.16.6.131
> > > storage NIC              172.16.250.4    172.16.250.5
> > > Dell LOM (for iDRAC)     172.16.7.29     172.16.7.30
> > >
> > >
> > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >
> > >
> > >
> > > If I pull the storage NIC presumably nothing will happen as the
> > > ipmitool check is running across the management NIC so I need to pull
> > > both?
> > >
> > > My understanding of host HA was the management server monitored the
> > > compute nodes using ipmitool and if it did not get a response because
> > > the host was down it would fence off that host and move the VMs to an
> > > active compute node.
> > >
> > > This is obviously too simplistic so could someone explain how it is
> > > meant to work and what it is protecting against?
> > >
> > > ________________________________
> > > From: Paul Angus <pa...@shapeblue.com>
> > > Sent: 13 March 2018 07:01
> > > To: users@cloudstack.apache.org
> > > Subject: RE: KVM HostHA
> > >
> > > Hi all,
> > >
> > > One small note, unplugging the management NIC will only cause an HA
> > > event if the storage is running over that NIC also.
> > >
> > > If the storage is over a separate NIC, then the guest VMs will
> > > continue to run when the mgmt. NIC is unplugged. Host HA will detect
> > > the disk activity and conclude that there is nothing it can do, as the
> > > VMs are still running, other than mark the hosts as degraded.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul Angus
> > >
> > >
> > > -----Original Message-----
> > > From: Parth Patel <pa...@gmail.com>
> > > Sent: 12 March 2018 17:35
> > > To: users@cloudstack.apache.org
> > > Subject: Re: KVM HostHA
> > >
> > >> Hi Jon,
> > >>
> > >> As I said, in my case, making the host HA didn't work but by just
> > >> having a HA VM running on host and executing - (WARNING)
> > >> "echo c > /proc/sysrq-trigger" to simulate a kernel crash on host, the
> > >> management server registered it as down and started the VM on another
> > >> host. I know I've suggested this before but I insist you give this a
> > >> try. Also, you don't need to completely power off the machine manually
> > >> but just plugging out the network cable works fine. The cloudstack
> > >> agent after losing connection to management server auto reboots
> > >> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> > >> to one of my earlier queries in other thread.
> > >>
> > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> > >> Hi Paul
> > >>
> > >>
> > >> Thanks for the response.
> > >>
> > >>
> > >> I think I am not understanding how it was meant to work then. My
> > >> understanding was that the manager used ipmitool to just keep querying
> > >> the compute nodes as to their status so I assumed it didn't matter how
> > >> you shut the node down, once it was down the manager would get no
> > >> response and mark it as down (which it does).
> > >>
> > >>
> > >> I am in testing mode so I think I will just go and pull the power and
> > >> see what happens :)
> > >>
> > >>
> > >> Thanks
> > >>
> > >>
> > >> Jon
> > >>
> > >>
> > >> ________________________________
> > >> From: Paul Angus <pa...@shapeblue.com>
> > >> Sent: 12 March 2018 15:31
> > >> To: users@cloudstack.apache.org
> > >> Subject: RE: KVM HostHA
> > >> Hi Jon,
> > >>
> > >> I think that what you guys are finding is that a controlled host
> > >> shutdown, which will cause the agent to shut down cleanly, is not
> > >> considered an HA event. I wouldn't expect CloudStack to take any
> > >> action if you shut down a host, only if the host (agent) stops
> > >> responding.
> > >>
> > >>
> > >>
> > >>
> > >> Kind regards,
> > >>
> > >> Paul Angus
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jon Marshall <jm...@hotmail.co.uk>
> > >> Sent: 12 March 2018 15:15
> > >> To: users@cloudstack.apache.org
> > >> Subject: Re: KVM HostHA
> > >>
> > >> I have the same issue here and am not entirely sure what the behaviour
> > >> should be.
> > >>
> > >>
> > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > working
> > >> correctly.
> > >>
> > >>
> > >>  From the UI under HA -
> > >>
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >>
> > >> although interestingly from the "Details" tab it shows -
> > >>
> > >>
> > >> HA enabled No
> > >>
> > >>
> > >> which I assume is a cosmetic issue ?
> > >>
> > >>
> > >> On each compute node I have one HA enabled VM and one non HA enabled
> VM.
> > >>
> > >>
> > >> I power off a compute node and the UI updates the host status and the
> > VMs
> > >> on that node stop responding but they never fail over to the other
> node.
> > >>
> > >>
> > >> Couple of things I noticed -
> > >>
> > >>
> > >> 1) as soon as i power off the compute node the HA state on the other
> > node
> > >> shows "Ineligible"
> > >>
> > >>
> > >> 2) In the UI the instances all still show as green even though two of
> > them
> > >> are not available
> > >>
> > >>
> > >> Any help much appreciated
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: victor <vi...@ihnetworks.com>
> > >> Sent: 07 March 2018 17:01
> > >> To: users@cloudstack.apache.org
> > >> Subject: KVM HostHA
> > >>
> > >> Hello Guys,
> > >>
> > >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> > have
> > >> added. I have also added ipmi successfully (using ipmi driver).
> > >> The hosts are showing like the following.
> > >>
> > >> =======
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> ======
> > >>
> > >> Also the host is showing the following correctly
> > >>
> > >> Resource state --> Enabled
> > >> State --> UP
> > >> Power state --> On
> > >>
> > >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> > >> working. I have waited for half an hour. But nothing has happened.
> What
> > >> will happen to the VM's in that host, if the host failed to back up.
> > >> There isn't much from logs.
> > >>
> > >> Regards
> > >> Victor
> > >>
> >
> >
>

RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
Hi Parth,

To answer your questions: VM-HA does not restart VMs on an alternate host if the original host goes down. The management server (without Host-HA) cannot tell what happened to the host. It cannot tell whether there was a failure in the agent, a loss of connectivity to the management NIC, or whether the host is truly down. In the first two scenarios, the guest VMs can still be running perfectly well, and restarting them elsewhere would be very dangerous. Therefore, the correct thing to do is nothing except alert the operator. These scenarios are what Host-HA was introduced for.

With regard to STONITH: if no disk activity is detected on the host, Host-HA will try to restart the host (via IPMI). If, after a configurable number of attempts, the host agent still does not check in, then Host-HA will shut down the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
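
The sequence described above can be sketched as a small decision function. This is only an illustration of the logic as explained in this thread, not CloudStack's actual implementation: the names `HostObservation`, `Action` and `decide`, and the default of 3 restart attempts, are assumptions made for the example (the real limit is a configurable setting).

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "agent is checking in, nothing to do"
    MARK_DEGRADED = "VMs still write to disk: leave them alone, mark host degraded"
    IPMI_RESTART = "no disk activity: try restarting the host via IPMI"
    FENCE_AND_RECOVER = "power off via IPMI, trigger VM-HA, host into maintenance"


@dataclass
class HostObservation:
    agent_responding: bool   # does the host agent check in with the management server?
    disk_activity: bool      # is VM disk activity still seen on the storage?
    restart_attempts: int    # IPMI restarts already attempted for this incident


def decide(obs: HostObservation, max_restart_attempts: int = 3) -> Action:
    """Pick the next step for a host (hypothetical helper, see assumptions above)."""
    if obs.agent_responding:
        return Action.NONE
    if obs.disk_activity:
        # The guests are probably fine; restarting them elsewhere would be dangerous.
        return Action.MARK_DEGRADED
    if obs.restart_attempts < max_restart_attempts:
        return Action.IPMI_RESTART
    # Host never came back: fence it first, then let VM-HA restart the guests.
    return Action.FENCE_AND_RECOVER
```

For example, a host whose management NIC is unplugged but whose storage runs over a separate NIC would land in `MARK_DEGRADED`, which matches the behaviour described earlier in the thread.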

 

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-----Original Message-----
From: Parth Patel <pa...@gmail.com> 
Sent: 14 March 2018 05:05
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul,

Thanks for the clarification. I currently don't have IPMI-enabled hardware (in my test environment), but it would help if you could clear up some basic concepts for me:
- If HA-enabled VMs are automatically started on another host when the current host goes down, what is the purpose of Host-HA (other than letting the management server remotely control the host's power interfaces)?
- I understand the "Shoot-the-other-node-in-the-head" (STONITH) approach ACS uses to fence the host, but I couldn't find what mechanism or events trigger it.

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:

> The management server doesn't ping the host through IPMI.   However if
> IPMI is not available, you will not be able to use Host HA, as there 
> is no way for CloudStack to 'fence' the host - that is shut it down to 
> be sure that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall <jm...@hotmail.co.uk>
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using ipmi (I really don't
> hope this is the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't support
> it)
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the host
> >
> > In both cases, the following happened.
> >
> > =====
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes
> > of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > for
> > 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > =====
> >
> > But nothing happened for the  vm's in that node. I have waited for one
> > hour and the VM's in that node has been migrated to the other
> > available hosts. I think the issue is that the management server still
> > thinks that the VM's in that host is running. Please check the
> > following logs
> >
> > =======
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > running on host 4 ========
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into
> > > the
> > server but it did not stop the server responding to an ipmitool
> > request on the manager eg -
> > >
> > >
> > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> status"
> > >
> > >
> > > from the management server got an answer saying the chassis power
> > > was on
> > so CS never registered the compute node as down.
> > >
> > >
> > > I am obviously doing something wrong but cannot work it out.
> > >
> > >
> > > The management server has one NIC - 172.16.7.4
> > >
> > >
> > > Each compute node has 3 NICs -
> > >
> > >
> > >                                         cnode1
> > cnode2
> > >
> > >
> > > mangement NIC        172.16.7.5                   172.16.7.6
> > >
> > > vm NIC                      172.16.6.130                 172.16.6.131
> > >
> > > storage -                     172.16.250.4               172.16.250.5
> > >
> > >
> > > Dell LOM (for Idrac)   172.16.7.29                172.16.7.30
> > >
> > >
> > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >
> > >
> > >
> > > If I pull the storage NIC presumably nothing will happen as the
> > > ipmitool
> > check is running across the management NIC so I need to pull both ?
> > >
> > > My understanding of host HA was the management server monitored the
> > compute nodes using ipmitool and if it did not get a response because
> > the host was down it would fence off that host and move the VMs to an
> > active compute node.
> > >
> > > This is obviously too simplistic so could someone explain how it is
> > meant to work and what it is protecting against ?
> > >
> > > ________________________________
> > > From: Paul Angus <pa...@shapeblue.com>
> > > Sent: 13 March 2018 07:01
> > > To: users@cloudstack.apache.org
> > > Subject: RE: KVM HostHA
> > >
> > > Hi all,
> > >
> > > One small note, unplugging the management NIC will only cause an HA
> > event if the storage is running over that NIC also.
> > >
> > > Is the storage is over a separate NIC then, the guest VMs will
> > > continue
> > to run when the mgmt. NIC is unplugged, Host HA will detect the disk
> > activity and conclude that there is nothing it can do, as the VMs are
> > still running other than mark the hosts as degraded.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul Angus
> > >
> > > paul.angus@shapeblue.com
> > > www.shapeblue.com<http://www.shapeblue.com>
> > >
> > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Parth Patel <pa...@gmail.com>
> > > Sent: 12 March 2018 17:35
> > > To: users@cloudstack.apache.org
> > > Subject: Re: KVM HostHA
> > >
> > >> Hi Jon,
> > >>
> > >> As I said, in my case, making the host HA didn't work but by just
> > >> having a HA VM running on host and executing - (WARNING) "echo c >
> > >> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> > >> management server registered it as down and started the VM on another
> > >> host. I know I've suggested this before but I insist you give this a
> > >> try. Also, you don't need to completely power off the machine manually
> > >> but just plugging out the network cable works fine. The cloudstack
> > >> agent after losing connection to management server auto reboots
> > >> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> > >> to one of my earlier queries in other thread.
> > >>
> > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> wrote:
> > >> Hi Paul
> > >>
> > >>
> > >> Thanks for the response.
> > >>
> > >>
> > >> I think I am not understanding how it was meant to work then. My
> > >> understanding was that the manager used ipmitool to just keep querying
> > >> the compute nodes as to their status so I assumed it didn't matter how
> > >> you shut the node down, once it was down the manager would get no
> > >> response and mark it as down (which it does).
> > >>
> > >>
> > >> I am in testing mode so I think I will just go and pull the power and
> > >> see what happens :)
> > >>
> > >>
> > >> Thanks
> > >>
> > >>
> > >> Jon
> > >>
> > >>
> > >> ________________________________
> > >> From: Paul Angus <pa...@shapeblue.com>
> > >> Sent: 12 March 2018 15:31
> > >> To: users@cloudstack.apache.org
> > >> Subject: RE: KVM HostHA
> > >> Hi Jon,
> > >>
> > >> I think that what you guys are finding, is that a controlled host
> > >> shutdown, which will cause the agent to shutdown cleanly; Is not
> > >> considered an HA event. I wouldn't expect CloudStack to take any
> > >> action if you shut down a host, only if the host (agent) stops
> > responding.
> > >>
> > >>
> > >>
> > >>
> > >> Kind regards,
> > >>
> > >> Paul Angus
> > >>
> > >> paul.angus@shapeblue.com
> > >> www.shapeblue.com<http://www.shapeblue.com>
> > >>
> > >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> > >>
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jon Marshall <jm...@hotmail.co.uk>
> > >> Sent: 12 March 2018 15:15
> > >> To: users@cloudstack.apache.org
> > >> Subject: Re: KVM HostHA
> > >>
> > >> I have the same issue here and am not entirely sure what the behaviour
> > >> should be.
> > >>
> > >>
> > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > working
> > >> correctly.
> > >>
> > >>
> > >>  From the UI under HA -
> > >>
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >>
> > >> although interestingly from the "Details" tab it shows -
> > >>
> > >>
> > >> HA enabled No
> > >>
> > >>
> > >> which I assume is a cosmetic issue ?
> > >>
> > >>
> > >> On each compute node I have one HA enabled VM and one non HA enabled
> VM.
> > >>
> > >>
> > >> I power off a compute node and the UI updates the host status and the
> > VMs
> > >> on that node stop responding but they never fail over to the other
> node.
> > >>
> > >>
> > >> Couple of things I noticed -
> > >>
> > >>
> > >> 1) as soon as i power off the compute node the HA state on the other
> > node
> > >> shows "Ineligible"
> > >>
> > >>
> > >> 2) In the UI the instances all still show as green even though two of
> > them
> > >> are not available
> > >>
> > >>
> > >> Any help much appreciated
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: victor <vi...@ihnetworks.com>
> > >> Sent: 07 March 2018 17:01
> > >> To: users@cloudstack.apache.org
> > >> Subject: KVM HostHA
> > >>
> > >> Hello Guys,
> > >>
> > >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> > have
> > >> added. I have also added ipmi successfully (using ipmi driver).
> > >> The hosts are showing like the following.
> > >>
> > >> =======
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> ======
> > >>
> > >> Also the host is showing the following correctly
> > >>
> > >> Resource state --> Enabled
> > >> State --> UP
> > >> Power state --> On
> > >>
> > >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> > >> working. I have waited for half an hour. But nothing has happened.
> What
> > >> will happen to the VM's in that host, if the host failed to back up.
> > >> There isn't much from logs.
> > >>
> > >> Regards
> > >> Victor
> > >>
> >
> >
>

Re: KVM HostHA

Posted by Parth Patel <pa...@gmail.com>.
Hi Paul,

Thanks for the clarification. I currently don't have IPMI-enabled
hardware (in my test environment), but it would help if you could clear
up some basic concepts for me:
- If HA-enabled VMs are automatically started on another host when the
current host goes down, what is the purpose of Host-HA (other than letting
the management server remotely control the host's power interfaces)?
- I understand the "Shoot-the-other-node-in-the-head" (STONITH) approach
ACS uses to fence the host, but I couldn't find what mechanism or events
trigger it.

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus <pa...@shapeblue.com> wrote:

> The management server doesn't ping the host through IPMI.   However if
> IPMI is not available, you will not be able to use Host HA, as there is no
> way for CloudStack to 'fence' the host - that is shut it down to be sure
> that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall <jm...@hotmail.co.uk>
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using ipmi (I really don't
> hope this is the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't support
> it)
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the host
> >
> > In both cases, the following happened.
> >
> > =====
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes
> > of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > for
> > 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > =====
> >
> > But nothing happened for the  vm's in that node. I have waited for one
> > hour and the VM's in that node has been migrated to the other
> > available hosts. I think the issue is that the management server still
> > thinks that the VM's in that host is running. Please check the
> > following logs
> >
> > =======
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > running on host 4 ========
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into
> > > the
> > server but it did not stop the server responding to an ipmitool
> > request on the manager eg -
> > >
> > >
> > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> status"
> > >
> > >
> > > from the management server got an answer saying the chassis power
> > > was on
> > so CS never registered the compute node as down.
> > >
> > >
> > > I am obviously doing something wrong but cannot work it out.
> > >
> > >
> > > The management server has one NIC - 172.16.7.4
> > >
> > >
> > > Each compute node has 3 NICs -
> > >
> > >
> > >                                         cnode1
> > cnode2
> > >
> > >
> > > mangement NIC        172.16.7.5                   172.16.7.6
> > >
> > > vm NIC                      172.16.6.130                 172.16.6.131
> > >
> > > storage -                     172.16.250.4               172.16.250.5
> > >
> > >
> > > Dell LOM (for Idrac)   172.16.7.29                172.16.7.30
> > >
> > >
> > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >
> > >
> > >
> > > If I pull the storage NIC presumably nothing will happen as the
> > > ipmitool
> > check is running across the management NIC so I need to pull both ?
> > >
> > > My understanding of host HA was the management server monitored the
> > compute nodes using ipmitool and if it did not get a response because
> > the host was down it would fence off that host and move the VMs to an
> > active compute node.
> > >
> > > This is obviously too simplistic so could someone explain how it is
> > meant to work and what it is protecting against ?
> > >
> > > ________________________________
> > > From: Paul Angus <pa...@shapeblue.com>
> > > Sent: 13 March 2018 07:01
> > > To: users@cloudstack.apache.org
> > > Subject: RE: KVM HostHA
> > >
> > > Hi all,
> > >
> > > One small note, unplugging the management NIC will only cause an HA
> > event if the storage is running over that NIC also.
> > >
> > > Is the storage is over a separate NIC then, the guest VMs will
> > > continue
> > to run when the mgmt. NIC is unplugged, Host HA will detect the disk
> > activity and conclude that there is nothing it can do, as the VMs are
> > still running other than mark the hosts as degraded.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul Angus
> > >
> > > paul.angus@shapeblue.com
> > > www.shapeblue.com<http://www.shapeblue.com>
> > >
> > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Parth Patel <pa...@gmail.com>
> > > Sent: 12 March 2018 17:35
> > > To: users@cloudstack.apache.org
> > > Subject: Re: KVM HostHA
> > >
> > >> Hi Jon,
> > >>
> > >> As I said, in my case, making the host HA didn't work but by just
> > >> having a HA VM running on host and executing - (WARNING) "echo c >
> > >> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> > >> management server registered it as down and started the VM on another
> > >> host. I know I've suggested this before but I insist you give this a
> > >> try. Also, you don't need to completely power off the machine manually
> > >> but just plugging out the network cable works fine. The cloudstack
> > >> agent after losing connection to management server auto reboots
> > >> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> > >> to one of my earlier queries in other thread.
> > >>
> > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk>
> wrote:
> > >> Hi Paul
> > >>
> > >>
> > >> Thanks for the response.
> > >>
> > >>
> > >> I think I am not understanding how it was meant to work then. My
> > >> understanding was that the manager used ipmitool to just keep querying
> > >> the compute nodes as to their status so I assumed it didn't matter how
> > >> you shut the node down, once it was down the manager would get no
> > >> response and mark it as down (which it does).
> > >>
> > >>
> > >> I am in testing mode so I think I will just go and pull the power and
> > >> see what happens :)
> > >>
> > >>
> > >> Thanks
> > >>
> > >>
> > >> Jon
> > >>
> > >>
> > >> ________________________________
> > >> From: Paul Angus <pa...@shapeblue.com>
> > >> Sent: 12 March 2018 15:31
> > >> To: users@cloudstack.apache.org
> > >> Subject: RE: KVM HostHA
> > >> Hi Jon,
> > >>
> > >> I think that what you guys are finding, is that a controlled host
> > >> shutdown, which will cause the agent to shutdown cleanly; Is not
> > >> considered an HA event. I wouldn't expect CloudStack to take any
> > >> action if you shut down a host, only if the host (agent) stops
> > responding.
> > >>
> > >>
> > >>
> > >>
> > >> Kind regards,
> > >>
> > >> Paul Angus
> > >>
> > >> paul.angus@shapeblue.com
> > >> www.shapeblue.com<http://www.shapeblue.com>
> > >>
> > >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> > >>
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jon Marshall <jm...@hotmail.co.uk>
> > >> Sent: 12 March 2018 15:15
> > >> To: users@cloudstack.apache.org
> > >> Subject: Re: KVM HostHA
> > >>
> > >> I have the same issue here and am not entirely sure what the behaviour
> > >> should be.
> > >>
> > >>
> > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > working
> > >> correctly.
> > >>
> > >>
> > >>  From the UI under HA -
> > >>
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >>
> > >> although interestingly from the "Details" tab it shows -
> > >>
> > >>
> > >> HA enabled No
> > >>
> > >>
> > >> which I assume is a cosmetic issue ?
> > >>
> > >>
> > >> On each compute node I have one HA enabled VM and one non HA enabled
> VM.
> > >>
> > >>
> > >> I power off a compute node and the UI updates the host status and the
> > VMs
> > >> on that node stop responding but they never fail over to the other
> node.
> > >>
> > >>
> > >> Couple of things I noticed -
> > >>
> > >>
> > >> 1) as soon as i power off the compute node the HA state on the other
> > node
> > >> shows "Ineligible"
> > >>
> > >>
> > >> 2) In the UI the instances all still show as green even though two of
> > them
> > >> are not available
> > >>
> > >>
> > >> Any help much appreciated
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: victor <vi...@ihnetworks.com>
> > >> Sent: 07 March 2018 17:01
> > >> To: users@cloudstack.apache.org
> > >> Subject: KVM HostHA
> > >>
> > >> Hello Guys,
> > >>
> > >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> > have
> > >> added. I have also added ipmi successfully (using ipmi driver).
> > >> The hosts are showing like the following.
> > >>
> > >> =======
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> ======
> > >>
> > >> Also the host is showing the following correctly
> > >>
> > >> Resource state --> Enabled
> > >> State --> UP
> > >> Power state --> On
> > >>
> > >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> > >> working. I have waited for half an hour. But nothing has happened.
> What
> > >> will happen to the VM's in that host, if the host failed to back up.
> > >> There isn't much from logs.
> > >>
> > >> Regards
> > >> Victor
> > >>
> >
> >
>

RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
The management server doesn't ping the host through IPMI. However, if IPMI is not available, you will not be able to use Host HA, as there is no way for CloudStack to 'fence' the host, that is, to shut it down to be sure that a VM cannot start again on that host.

I can explain why that is necessary if you wish.
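
To make the distinction concrete, here is a minimal sketch of the two kinds of IPMI request involved. The `chassis status` query is the one quoted further down this thread (same BMC address and credentials); it only reports the power state of the chassis, which is why it still answers "on" after a kernel crash. The `chassis power off` request is the fencing step. The helper name `ipmitool_argv` is an assumption for illustration, and these are not necessarily the exact invocations CloudStack's out-of-band management driver issues. The function only builds the argument list and does not run anything.

```python
from typing import List


def ipmitool_argv(bmc_host: str, user: str, password: str, *command: str) -> List[str]:
    """Build an ipmitool command line for a BMC reached over the lanplus interface."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host,
            "-U", user, "-P", password, *command]


# Status query: answered by the BMC even when the host OS or agent is dead.
status_cmd = ipmitool_argv("172.16.7.29", "admin3", "letmein", "chassis", "status")

# Fencing: hard power-off so that no VM can keep running on the host.
fence_cmd = ipmitool_argv("172.16.7.29", "admin3", "letmein", "chassis", "power", "off")

print(" ".join(status_cmd))
print(" ".join(fence_cmd))
```

Either list could be passed to `subprocess.run` on a machine that can reach the BMC; that step is left out here on purpose, since the second command powers the host off.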


Kind regards,

Paul Angus

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-----Original Message-----
From: Parth Patel <pa...@gmail.com> 
Sent: 13 March 2018 16:57
To: users@cloudstack.apache.org
Cc: Jon Marshall <jm...@hotmail.co.uk>
Subject: Re: KVM HostHA

Hi Jon and Victor,

I think the management server pings your host using IPMI (I really hope this is not the case).
In my case, I did not have OOBM enabled at all (my hardware didn't support it).
I think you could disable OOBM and/or HA-Host and give that a try :)

On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:

> Hello Guys,
>
> I have tried the following two cases.
>
> 1, "echo c > /proc/sysrq-trigger"
>
> 2, Pulled the network cable of one of the host
>
> In both cases, the following happened.
>
> =====
> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes 
> of to disconnect
> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is 
> disconnecting with event AgentDisconnected
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already 
> Alert
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link 
> for
> 4 with state Alert
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4 
> =====
>
> But nothing happened for the  vm's in that node. I have waited for one 
> hour and the VM's in that node has been migrated to the other 
> available hosts. I think the issue is that the management server still 
> thinks that the VM's in that host is running. Please check the 
> following logs
>
> =======
> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not 
> running on host 4 ========
>
>
> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into 
> > the
> server but it did not stop the server responding to an ipmitool 
> request on the manager eg -
> >
> >
> > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> >
> >
> > from the management server got an answer saying the chassis power 
> > was on
> so CS never registered the compute node as down.
> >
> >
> > I am obviously doing something wrong but cannot work it out.
> >
> >
> > The management server has one NIC - 172.16.7.4
> >
> >
> > Each compute node has 3 NICs -
> >
> >                        cnode1          cnode2
> >
> > management NIC         172.16.7.5      172.16.7.6
> > vm NIC                 172.16.6.130    172.16.6.131
> > storage                172.16.250.4    172.16.250.5
> >
> > Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
> >
> >
> > the dell LOM IPs are the ones used to configure OOBM  in the UI
> >
> >
> >
> > If I pull the storage NIC presumably nothing will happen as the 
> > ipmitool
> check is running across the management NIC so I need to pull both ?
> >
> > My understanding of host HA was the management server monitored the
> compute nodes using ipmitool and if it did not get a response because 
> the host was down it would fence off that host and move the VMs to an 
> active compute node.
> >
> > This is obviously too simplistic so could someone explain how it is
> meant to work and what it is protecting against ?
> >
> > ________________________________
> > From: Paul Angus <pa...@shapeblue.com>
> > Sent: 13 March 2018 07:01
> > To: users@cloudstack.apache.org
> > Subject: RE: KVM HostHA
> >
> > Hi all,
> >
> > One small note, unplugging the management NIC will only cause an HA
> event if the storage is running over that NIC also.
> >
> > Is the storage is over a separate NIC then, the guest VMs will 
> > continue
> to run when the mgmt. NIC is unplugged, Host HA will detect the disk 
> activity and conclude that there is nothing it can do, as the VMs are 
> still running other than mark the hosts as degraded.
> >
> >
> > Kind regards,
> >
> > Paul Angus
> >
> > paul.angus@shapeblue.com
> > www.shapeblue.com<http://www.shapeblue.com>
> >
> >
> >
> > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
> >
> >
> >
> >
> > -----Original Message-----
> > From: Parth Patel <pa...@gmail.com>
> > Sent: 12 March 2018 17:35
> > To: users@cloudstack.apache.org
> > Subject: Re: KVM HostHA
> >
> >> Hi Jon,
> >>
> >> As I said, in my case, making the host HA didn't work but by just
> >> having a HA VM running on host and executing - (WARNING) "echo c >
> >> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> >> management server registered it as down and started the VM on another
> >> host. I know I've suggested this before but I insist you give this a
> >> try. Also, you don't need to completely power off the machine manually
> >> but just plugging out the network cable works fine. The cloudstack
> >> agent after losing connection to management server auto reboots
> >> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> >> to one of my earlier queries in other thread.
> >>
> >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> >> Hi Paul
> >>
> >>
> >> Thanks for the response.
> >>
> >>
> >> I think I am not understanding how it was meant to work then. My
> >> understanding was that the manager used ipmitool to just keep querying
> >> the compute nodes as to their status so I assumed it didn't matter how
> >> you shut the node down, once it was down the manager would get no
> >> response and mark it as down (which it does).
> >>
> >>
> >> I am in testing mode so I think I will just go and pull the power and
> >> see what happens :)
> >>
> >>
> >> Thanks
> >>
> >>
> >> Jon
> >>
> >>
> >> ________________________________
> >> From: Paul Angus <pa...@shapeblue.com>
> >> Sent: 12 March 2018 15:31
> >> To: users@cloudstack.apache.org
> >> Subject: RE: KVM HostHA
> >> Hi Jon,
> >>
> >> I think that what you guys are finding, is that a controlled host
> >> shutdown, which will cause the agent to shutdown cleanly; Is not
> >> considered an HA event. I wouldn't expect CloudStack to take any
> >> action if you shut down a host, only if the host (agent) stops
> responding.
> >>
> >>
> >>
> >>
> >> Kind regards,
> >>
> >> Paul Angus
> >>
> >> paul.angus@shapeblue.com
> >> www.shapeblue.com<http://www.shapeblue.com>
> >>
> >>
> >>
> >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Jon Marshall <jm...@hotmail.co.uk>
> >> Sent: 12 March 2018 15:15
> >> To: users@cloudstack.apache.org
> >> Subject: Re: KVM HostHA
> >>
> >> I have the same issue here and am not entirely sure what the behaviour
> >> should be.
> >>
> >>
> >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> working
> >> correctly.
> >>
> >>
> >>  From the UI under HA -
> >>
> >>
> >> HA Enabled Yes
> >> HA State Available
> >> HA Provider kvmhaprovider
> >>
> >>
> >> although interestingly from the "Details" tab it shows -
> >>
> >>
> >> HA enabled No
> >>
> >>
> >> which I assume is a cosmetic issue ?
> >>
> >>
> >> On each compute node I have one HA enabled VM and one non HA enabled VM.
> >>
> >>
> >> I power off a compute node and the UI updates the host status and the
> VMs
> >> on that node stop responding but they never fail over to the other node.
> >>
> >>
> >> Couple of things I noticed -
> >>
> >>
> >> 1) as soon as i power off the compute node the HA state on the other
> node
> >> shows "Ineligible"
> >>
> >>
> >> 2) In the UI the instances all still show as green even though two of
> them
> >> are not available
> >>
> >>
> >> Any help much appreciated
> >>
> >>
> >>
> >>
> >> ________________________________
> >> From: victor <vi...@ihnetworks.com>
> >> Sent: 07 March 2018 17:01
> >> To: users@cloudstack.apache.org
> >> Subject: KVM HostHA
> >>
> >> Hello Guys,
> >>
> >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> have
> >> added. I have also added ipmi successfully (using ipmi driver).
> >> The hosts are showing like the following.
> >>
> >> =======
> >>
> >> HA Enabled Yes
> >> HA State Available
> >> HA Provider kvmhaprovider
> >>
> >> ======
> >>
> >> Also the host is showing the following correctly
> >>
> >> Resource state --> Enabled
> >> State --> UP
> >> Power state --> On
> >>
> >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> >> working. I have waited for half an hour. But nothing has happened. What
> >> will happen to the VM's in that host, if the host failed to back up.
> >> There isn't much from logs.
> >>
> >> Regards
> >> Victor
> >>
>
>

Re: KVM HostHA

Posted by Parth Patel <pa...@gmail.com>.
Hi Jon and Victor,

I think the management server pings your host using IPMI (I really hope
this is not the case).
In my case, I did not have OOBM enabled at all (my hardware didn't support
it).
I think you could disable OOBM and/or Host HA and give that a try :)

On Tue, 13 Mar 2018 at 20:40 victor <vi...@ihnetworks.com> wrote:

> Hello Guys,
>
> I have tried the following two cases.
>
> 1, "echo c > /proc/sysrq-trigger"
>
> 2, Pulled the network cable of one of the host
>
> In both cases, the following happened.
>
> =====
> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes
> of to disconnect
> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is disconnecting
> with event AgentDisconnected
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already Alert
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link for
> 4 with state Alert
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> =====
>
> But nothing happened for the  vm's in that node. I have waited for one
> hour and the VM's in that node has been migrated to the other available
> hosts. I think the issue is that the management server still thinks that
> the VM's in that host is running. Please check the following logs
>
> =======
> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not running
> on host 4
> ========
>
>
> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into the
> server but it did not stop the server responding to an ipmitool request on
> the manager eg -
> >
> >
> > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> >
> >
> > from the management server got an answer saying the chassis power was on
> so CS never registered the compute node as down.
> >
> >
> > I am obviously doing something wrong but cannot work it out.
> >
> >
> > The management server has one NIC - 172.16.7.4
> >
> >
> > Each compute node has 3 NICs -
> >
> >                        cnode1          cnode2
> >
> > management NIC         172.16.7.5      172.16.7.6
> > vm NIC                 172.16.6.130    172.16.6.131
> > storage                172.16.250.4    172.16.250.5
> >
> > Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
> >
> >
> > the dell LOM IPs are the ones used to configure OOBM  in the UI
> >
> >
> >
> > If I pull the storage NIC presumably nothing will happen as the ipmitool
> check is running across the management NIC so I need to pull both ?
> >
> > My understanding of host HA was the management server monitored the
> compute nodes using ipmitool and if it did not get a response because the
> host was down it would fence off that host and move the VMs to an active
> compute node.
> >
> > This is obviously too simplistic so could someone explain how it is
> meant to work and what it is protecting against ?
> >
> > ________________________________
> > From: Paul Angus <pa...@shapeblue.com>
> > Sent: 13 March 2018 07:01
> > To: users@cloudstack.apache.org
> > Subject: RE: KVM HostHA
> >
> > Hi all,
> >
> > One small note, unplugging the management NIC will only cause an HA
> event if the storage is running over that NIC also.
> >
> > Is the storage is over a separate NIC then, the guest VMs will continue
> to run when the mgmt. NIC is unplugged, Host HA will detect the disk
> activity and conclude that there is nothing it can do, as the VMs are still
> running other than mark the hosts as degraded.
> >
> >
> > Kind regards,
> >
> > Paul Angus
> >
> > paul.angus@shapeblue.com
> > www.shapeblue.com<http://www.shapeblue.com>
> >
> >
> >
> > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > @shapeblue
> >
> >
> >
> >
> > -----Original Message-----
> > From: Parth Patel <pa...@gmail.com>
> > Sent: 12 March 2018 17:35
> > To: users@cloudstack.apache.org
> > Subject: Re: KVM HostHA
> >
> >> Hi Jon,
> >>
> >> As I said, in my case, making the host HA didn't work but by just
> >> having a HA VM running on host and executing - (WARNING) "echo c >
> >> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> >> management server registered it as down and started the VM on another
> >> host. I know I've suggested this before but I insist you give this a
> >> try. Also, you don't need to completely power off the machine manually
> >> but just plugging out the network cable works fine. The cloudstack
> >> agent after losing connection to management server auto reboots
> >> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> >> to one of my earlier queries in other thread.
> >>
> >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> >> Hi Paul
> >>
> >>
> >> Thanks for the response.
> >>
> >>
> >> I think I am not understanding how it was meant to work then. My
> >> understanding was that the manager used ipmitool to just keep querying
> >> the compute nodes as to their status so I assumed it didn't matter how
> >> you shut the node down, once it was down the manager would get no
> >> response and mark it as down (which it does).
> >>
> >>
> >> I am in testing mode so I think I will just go and pull the power and
> >> see what happens :)
> >>
> >>
> >> Thanks
> >>
> >>
> >> Jon
> >>
> >>
> >> ________________________________
> >> From: Paul Angus <pa...@shapeblue.com>
> >> Sent: 12 March 2018 15:31
> >> To: users@cloudstack.apache.org
> >> Subject: RE: KVM HostHA
> >> Hi Jon,
> >>
> >> I think that what you guys are finding, is that a controlled host
> >> shutdown, which will cause the agent to shutdown cleanly; Is not
> >> considered an HA event. I wouldn't expect CloudStack to take any
> >> action if you shut down a host, only if the host (agent) stops
> responding.
> >>
> >>
> >>
> >>
> >> Kind regards,
> >>
> >> Paul Angus
> >>
> >> paul.angus@shapeblue.com
> >> www.shapeblue.com<http://www.shapeblue.com>
> >>
> >>
> >>
> >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Jon Marshall <jm...@hotmail.co.uk>
> >> Sent: 12 March 2018 15:15
> >> To: users@cloudstack.apache.org
> >> Subject: Re: KVM HostHA
> >>
> >> I have the same issue here and am not entirely sure what the behaviour
> >> should be.
> >>
> >>
> >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> working
> >> correctly.
> >>
> >>
> >>  From the UI under HA -
> >>
> >>
> >> HA Enabled Yes
> >> HA State Available
> >> HA Provider kvmhaprovider
> >>
> >>
> >> although interestingly from the "Details" tab it shows -
> >>
> >>
> >> HA enabled No
> >>
> >>
> >> which I assume is a cosmetic issue ?
> >>
> >>
> >> On each compute node I have one HA enabled VM and one non HA enabled VM.
> >>
> >>
> >> I power off a compute node and the UI updates the host status and the
> VMs
> >> on that node stop responding but they never fail over to the other node.
> >>
> >>
> >> Couple of things I noticed -
> >>
> >>
> >> 1) as soon as i power off the compute node the HA state on the other
> node
> >> shows "Ineligible"
> >>
> >>
> >> 2) In the UI the instances all still show as green even though two of
> them
> >> are not available
> >>
> >>
> >> Any help much appreciated
> >>
> >>
> >>
> >>
> >> ________________________________
> >> From: victor <vi...@ihnetworks.com>
> >> Sent: 07 March 2018 17:01
> >> To: users@cloudstack.apache.org
> >> Subject: KVM HostHA
> >>
> >> Hello Guys,
> >>
> >> I have installed cloudstack 4.11. I have enabled HA for each hosts I
> have
> >> added. I have also added ipmi successfully (using ipmi driver).
> >> The hosts are showing like the following.
> >>
> >> =======
> >>
> >> HA Enabled Yes
> >> HA State Available
> >> HA Provider kvmhaprovider
> >>
> >> ======
> >>
> >> Also the host is showing the following correctly
> >>
> >> Resource state --> Enabled
> >> State --> UP
> >> Power state --> On
> >>
> >> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> >> working. I have waited for half an hour. But nothing has happened. What
> >> will happen to the VM's in that host, if the host failed to back up.
> >> There isn't much from logs.
> >>
> >> Regards
> >> Victor
> >>
>
>

Re: KVM HostHA

Posted by victor <vi...@ihnetworks.com>.
Hello Guys,

I have tried the following two cases.

1. Ran "echo c > /proc/sysrq-trigger" on one of the hosts

2. Pulled the network cable of one of the hosts

In both cases, the following happened.

=====
2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes of to disconnect
2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is disconnecting with event AgentDisconnected
2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already Alert
2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link for 4 with state Alert
2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
=====

But nothing happened to the VMs on that node. I waited for one hour and
the VMs on that node were still not migrated to the other available
hosts. I think the issue is that the management server still thinks that
the VMs on that host are running. Please check the following logs:

=======
2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not running on host 4
========
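
For anyone who wants to pull these entries out of a larger management server log, here is a minimal sketch of a parser for the line layout shown above (timestamp, level, logger, thread/context, logid, message). It assumes each entry sits on a single line (in the mail they may appear wrapped), and the regular expression is inferred only from the excerpts in this message, so other log lines may not match. The function name `parse_line` is made up for this example.

```python
import re

# Layout inferred from the excerpts above:
# <timestamp> <LEVEL> [<logger>] (<thread>:<ctx>) (logid:<id>) <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\(logid:(?P<logid>\w+)\)\s+"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


SAMPLE = (
    "2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl] "
    "(AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) "
    "Host 4 is disconnecting with event AgentDisconnected"
)

if __name__ == "__main__":
    entry = parse_line(SAMPLE)
    print(entry["ts"], entry["level"], entry["msg"])
```

Grouping the parsed entries by `logid` is one way to follow a single operation, such as the agent disconnect, through the log.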


On 03/13/2018 04:20 PM, Jon Marshall wrote:
> I tried "echo c > /proc/sysrq-trigger" which stopped me getting into the server but it did not stop the server responding to an ipmitool request on the manager eg -
>
>
> "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
>
>
> from the management server got an answer saying the chassis power was on so CS never registered the compute node as down.
>
>
> I am obviously doing something wrong but cannot work it out.
>
>
> The management server has one NIC - 172.16.7.4
>
>
> Each compute node has 3 NICs -
>
>                        cnode1          cnode2
>
> management NIC         172.16.7.5      172.16.7.6
> vm NIC                 172.16.6.130    172.16.6.131
> storage                172.16.250.4    172.16.250.5
>
> Dell LOM (for iDRAC)   172.16.7.29     172.16.7.30
>
>
> the dell LOM IPs are the ones used to configure OOBM  in the UI
>
>
>
> If I pull the storage NIC presumably nothing will happen as the ipmitool check is running across the management NIC so I need to pull both ?
>
> My understanding of host HA was the management server monitored the compute nodes using ipmitool and if it did not get a response because the host was down it would fence off that host and move the VMs to an active compute node.
>
> This is obviously too simplistic so could someone explain how it is meant to work and what it is protecting against ?
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 13 March 2018 07:01
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
>
> Hi all,
>
> One small note, unplugging the management NIC will only cause an HA event if the storage is running over that NIC also.
>
> Is the storage is over a separate NIC then, the guest VMs will continue to run when the mgmt. NIC is unplugged, Host HA will detect the disk activity and conclude that there is nothing it can do, as the VMs are still running other than mark the hosts as degraded.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
>
>
>
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <pa...@gmail.com>
> Sent: 12 March 2018 17:35
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
>> Hi Jon,
>>
>> As I said, in my case, making the host HA didn't work but by just
>> having a HA VM running on host and executing - (WARNING) "echo c >
>> /proc/sysrq-trigger" to simulate a kernel crash on host, the
>> management server registered it as down and started the VM on another
>> host. I know I've suggested this before but I insist you give this a
>> try. Also, you don't need to completely power off the machine manually
>> but just plugging out the network cable works fine. The cloudstack
>> agent after losing connection to management server auto reboots
>> because of KVM heartbeat check shell script mentioned by Rohit Yadav
>> to one of my earlier queries in other thread.
>>
>> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
>> Hi Paul
>>
>>
>> Thanks for the response.
>>
>>
>> I think I am not understanding how it was meant to work then. My
>> understanding was that the manager used ipmitool to just keep querying
>> the compute nodes as to their status so I assumed it didn't matter how
>> you shut the node down, once it was down the manager would get no
>> response and mark it as down (which it does).
>>
>>
>> I am in testing mode so I think I will just go and pull the power and
>> see what happens :)
>>
>>
>> Thanks
>>
>>
>> Jon
>>
>>
>> ________________________________
>> From: Paul Angus <pa...@shapeblue.com>
>> Sent: 12 March 2018 15:31
>> To: users@cloudstack.apache.org
>> Subject: RE: KVM HostHA
>> Hi Jon,
>>
>> I think that what you guys are finding, is that a controlled host
>> shutdown, which will cause the agent to shutdown cleanly; Is not
>> considered an HA event. I wouldn't expect CloudStack to take any
>> action if you shut down a host, only if the host (agent) stops responding.
>>
>>
>>
>>
>> Kind regards,
>>
>> Paul Angus
>>
>> paul.angus@shapeblue.com
>> www.shapeblue.com<http://www.shapeblue.com>
>>
>>
>>
>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>>
>>
>>
>>
>> -----Original Message-----
>> From: Jon Marshall <jm...@hotmail.co.uk>
>> Sent: 12 March 2018 15:15
>> To: users@cloudstack.apache.org
>> Subject: Re: KVM HostHA
>>
>> I have the same issue here and am not entirely sure what the behaviour
>> should be.
>>
>>
>> I have one manager node and 2 compute nodes running 4.11 with ipmi working
>> correctly.
>>
>>
>>  From the UI under HA -
>>
>>
>> HA Enabled Yes
>> HA State Available
>> HA Provider kvmhaprovider
>>
>>
>> although interestingly from the "Details" tab it shows -
>>
>>
>> HA enabled No
>>
>>
>> which I assume is a cosmetic issue ?
>>
>>
>> On each compute node I have one HA enabled VM and one non HA enabled VM.
>>
>>
>> I power off a compute node and the UI updates the host status and the VMs
>> on that node stop responding but they never fail over to the other node.
>>
>>
>> Couple of things I noticed -
>>
>>
>> 1) as soon as i power off the compute node the HA state on the other node
>> shows "Ineligible"
>>
>>
>> 2) In the UI the instances all still show as green even though two of them
>> are not available
>>
>>
>> Any help much appreciated
>>
>>
>>
>>
>> ________________________________
>> From: victor <vi...@ihnetworks.com>
>> Sent: 07 March 2018 17:01
>> To: users@cloudstack.apache.org
>> Subject: KVM HostHA
>>
>> Hello Guys,
>>
>> I have installed cloudstack 4.11. I have enabled HA for each hosts I have
>> added. I have also added ipmi successfully (using ipmi driver).
>> The hosts are showing like the following.
>>
>> =======
>>
>> HA Enabled Yes
>> HA State Available
>> HA Provider kvmhaprovider
>>
>> ======
>>
>> Also the host is showing the following correctly
>>
>> Resource state --> Enabled
>> State --> UP
>> Power state --> On
>>
>> So I have shutdown one of the hosts to see how the KVM hosts Ha is
>> working. I have waited for half an hour. But nothing has happened. What
>> will happen to the VM's in that host, if the host failed to back up.
>> There isn't much from logs.
>>
>> Regards
>> Victor
>>


Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Update on below.


I pulled the NICs for both management and storage from cnode 1.


1) The UI immediately showed the power state as Unknown but the state was Up.

2) The HA state on cnode1 showed as suspect. The HA state on cnode2 showed as available.

3) After about 4 mins the state on cnode1 went from Up to Alert

4) The HA state on cnode1 showed as Fencing and the HA state on cnode2 showed as Ineligible.


The HA enabled VMs on cnode1 never switched over to the working node cnode2.


Any ideas ?


________________________________
From: Jon Marshall <jm...@hotmail.co.uk>
Sent: 13 March 2018 10:50
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

I tried "echo c > /proc/sysrq-trigger" which stopped me getting into the server but it did not stop the server responding to an ipmitool request on the manager eg -


"ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"


from the management server got an answer saying the chassis power was on so CS never registered the compute node as down.


I am obviously doing something wrong but cannot work it out.


The management server has one NIC - 172.16.7.4


Each compute node has 3 NICs -


                        cnode1          cnode2

management NIC          172.16.7.5      172.16.7.6
vm NIC                  172.16.6.130    172.16.6.131
storage                 172.16.250.4    172.16.250.5
Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30


the dell LOM IPs are the ones used to configure OOBM  in the UI



If I pull the storage NIC presumably nothing will happen as the ipmitool check is running across the management NIC so I need to pull both ?

My understanding of host HA was the management server monitored the compute nodes using ipmitool and if it did not get a response because the host was down it would fence off that host and move the VMs to an active compute node.

This is obviously too simplistic so could someone explain how it is meant to work and what it is protecting against ?

________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 13 March 2018 07:01
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi all,

One small note: unplugging the management NIC will only cause an HA event if the storage is also running over that NIC.

If the storage is over a separate NIC, then the guest VMs will continue to run when the mgmt. NIC is unplugged. Host HA will detect the disk activity and conclude that, as the VMs are still running, there is nothing it can do other than mark the host as degraded.


Kind regards,

Paul Angus

paul.angus@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue




-----Original Message-----
From: Parth Patel <pa...@gmail.com>
Sent: 12 March 2018 17:35
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

>
> Hi Jon,
>
> As I said, in my case, making the host HA didn't work but by just
> having a HA VM running on host and executing - (WARNING) "echo c >
> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> management server registered it as down and started the VM on another
> host. I know I've suggested this before but I insist you give this a
> try. Also, you don't need to completely power off the machine manually
> but just plugging out the network cable works fine. The cloudstack
> agent after losing connection to management server auto reboots
> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> to one of my earlier queries in other thread.
>
> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> Hi Paul
>
>
> Thanks for the response.
>
>
> I think I am not understanding how it was meant to work then. My
> understanding was that the manager used ipmitool to just keep querying
> the compute nodes as to their status so I assumed it didn't matter how
> you shut the node down, once it was down the manager would get no
> response and mark it as down (which it does).
>
>
> I am in testing mode so I think I will just go and pull the power and
> see what happens :)
>
>
> Thanks
>
>
> Jon
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 12 March 2018 15:31
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
> Hi Jon,
>
> I think that what you guys are finding, is that a controlled host
> shutdown, which will cause the agent to shutdown cleanly; Is not
> considered an HA event. I wouldn't expect CloudStack to take any
> action if you shut down a host, only if the host (agent) stops responding.
>
>
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Jon Marshall <jm...@hotmail.co.uk>
> Sent: 12 March 2018 15:15
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> I have the same issue here and am not entirely sure what the behaviour
> should be.
>
>
> I have one manager node and 2 compute nodes running 4.11 with ipmi working
> correctly.
>
>
> From the UI under HA -
>
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
>
> although interestingly from the "Details" tab it shows -
>
>
> HA enabled No
>
>
> which I assume is a cosmetic issue ?
>
>
> On each compute node I have one HA enabled VM and one non HA enabled VM.
>
>
> I power off a compute node and the UI updates the host status and the VMs
> on that node stop responding but they never fail over to the other node.
>
>
> Couple of things I noticed -
>
>
> 1) as soon as i power off the compute node the HA state on the other node
> shows "Ineligible"
>
>
> 2) In the UI the instances all still show as green even though two of them
> are not available
>
>
> Any help much appreciated
>
>
>
>
> ________________________________
> From: victor <vi...@ihnetworks.com>
> Sent: 07 March 2018 17:01
> To: users@cloudstack.apache.org
> Subject: KVM HostHA
>
> Hello Guys,
>
> I have installed cloudstack 4.11. I have enabled HA for each hosts I have
> added. I have also added ipmi successfully (using ipmi driver).
> The hosts are showing like the following.
>
> =======
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
> ======
>
> Also the host is showing the following correctly
>
> Resource state --> Enabled
> State --> UP
> Power state --> On
>
> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> working. I have waited for half an hour. But nothing has happened. What
> will happen to the VM's in that host, if the host failed to back up.
> There isn't much from logs.
>
> Regards
> Victor
>

Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
I tried "echo c > /proc/sysrq-trigger" which stopped me getting into the server but it did not stop the server responding to an ipmitool request on the manager eg -


"ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"


from the management server got an answer saying the chassis power was on so CS never registered the compute node as down.
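
For anyone repeating this check, here is a minimal sketch (not CloudStack code) that pulls the power state out of the `ipmitool chassis status` output. The function name `chassis_power_state` is made up for illustration; the only thing it relies on is the "System Power" line that ipmitool prints. Keep in mind that the BMC reports chassis power, not the health of the operating system, so a host with a crashed kernel would still be expected to answer "on", which matches what was seen above.

```shell
#!/bin/sh
# Hypothetical helper: read `ipmitool chassis status` output on stdin and
# print the value of the "System Power" line (normally "on" or "off").
# Prints "unknown" when no such line is present.
chassis_power_state() {
    awk -F':' '
        /^System Power/ {
            gsub(/[ \t]/, "", $2)   # strip the padding around the value
            print $2
            found = 1
            exit
        }
        END { if (!found) print "unknown" }
    '
}

# Example, using the address and credentials quoted in this thread:
#   ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status \
#       | chassis_power_state
```

An answer of "on" here only says the chassis has power; it says nothing about whether the agent or the VMs on the host are still alive.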


I am obviously doing something wrong but cannot work it out.


The management server has one NIC - 172.16.7.4


Each compute node has 3 NICs -


                        cnode1          cnode2

management NIC          172.16.7.5      172.16.7.6
vm NIC                  172.16.6.130    172.16.6.131
storage                 172.16.250.4    172.16.250.5
Dell LOM (for iDRAC)    172.16.7.29     172.16.7.30


the dell LOM IPs are the ones used to configure OOBM  in the UI



If I pull the storage NIC presumably nothing will happen as the ipmitool check is running across the management NIC so I need to pull both ?

My understanding of host HA was the management server monitored the compute nodes using ipmitool and if it did not get a response because the host was down it would fence off that host and move the VMs to an active compute node.

This is obviously too simplistic so could someone explain how it is meant to work and what it is protecting against ?

________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 13 March 2018 07:01
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi all,

One small note: unplugging the management NIC will only cause an HA event if the storage is also running over that NIC.

If the storage is over a separate NIC, then the guest VMs will continue to run when the mgmt. NIC is unplugged. Host HA will detect the disk activity and conclude that, as the VMs are still running, there is nothing it can do other than mark the host as degraded.


Kind regards,

Paul Angus

paul.angus@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue




-----Original Message-----
From: Parth Patel <pa...@gmail.com>
Sent: 12 March 2018 17:35
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

>
> Hi Jon,
>
> As I said, in my case, making the host HA didn't work but by just
> having a HA VM running on host and executing - (WARNING) "echo c >
> /proc/sysrq-trigger" to simulate a kernel crash on host, the
> management server registered it as down and started the VM on another
> host. I know I've suggested this before but I insist you give this a
> try. Also, you don't need to completely power off the machine manually
> but just plugging out the network cable works fine. The cloudstack
> agent after losing connection to management server auto reboots
> because of KVM heartbeat check shell script mentioned by Rohit Yadav
> to one of my earlier queries in other thread.
>
> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> Hi Paul
>
>
> Thanks for the response.
>
>
> I think I am not understanding how it was meant to work then. My
> understanding was that the manager used ipmitool to just keep querying
> the compute nodes as to their status so I assumed it didn't matter how
> you shut the node down, once it was down the manager would get no
> response and mark it as down (which it does).
>
>
> I am in testing mode so I think I will just go and pull the power and
> see what happens :)
>
>
> Thanks
>
>
> Jon
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 12 March 2018 15:31
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
> Hi Jon,
>
> I think that what you guys are finding, is that a controlled host
> shutdown, which will cause the agent to shutdown cleanly; Is not
> considered an HA event. I wouldn't expect CloudStack to take any
> action if you shut down a host, only if the host (agent) stops responding.
>
>
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Jon Marshall <jm...@hotmail.co.uk>
> Sent: 12 March 2018 15:15
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> I have the same issue here and am not entirely sure what the behaviour
> should be.
>
>
> I have one manager node and 2 compute nodes running 4.11 with ipmi working
> correctly.
>
>
> From the UI under HA -
>
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
>
> although interestingly from the "Details" tab it shows -
>
>
> HA enabled No
>
>
> which I assume is a cosmetic issue ?
>
>
> On each compute node I have one HA enabled VM and one non HA enabled VM.
>
>
> I power off a compute node and the UI updates the host status and the VMs
> on that node stop responding but they never fail over to the other node.
>
>
> Couple of things I noticed -
>
>
> 1) as soon as i power off the compute node the HA state on the other node
> shows "Ineligible"
>
>
> 2) In the UI the instances all still show as green even though two of them
> are not available
>
>
> Any help much appreciated
>
>
>
>
> ________________________________
> From: victor <vi...@ihnetworks.com>
> Sent: 07 March 2018 17:01
> To: users@cloudstack.apache.org
> Subject: KVM HostHA
>
> Hello Guys,
>
> I have installed cloudstack 4.11. I have enabled HA for each hosts I have
> added. I have also added ipmi successfully (using ipmi driver).
> The hosts are showing like the following.
>
> =======
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
> ======
>
> Also the host is showing the following correctly
>
> Resource state --> Enabled
> State --> UP
> Power state --> On
>
> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> working. I have waited for half an hour. But nothing has happened. What
> will happen to the VM's in that host, if the host failed to back up.
> There isn't much from logs.
>
> Regards
> Victor
>

RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
Hi all,

One small note: unplugging the management NIC will only cause an HA event if the storage is also running over that NIC.

If the storage is over a separate NIC, then the guest VMs will continue to run when the mgmt. NIC is unplugged. Host HA will detect the disk activity and conclude that, as the VMs are still running, there is nothing it can do other than mark the host as degraded.
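
The decision described above can be sketched roughly as follows. This is only an illustration of the behaviour as explained in this thread, not CloudStack's actual Host HA implementation; the function name `host_ha_outcome`, its yes/no inputs and the outcome labels are all hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the outcome described in this thread.
#   $1 = agent reachable over the management network (yes/no)
#   $2 = VM disk activity still seen on the storage (yes/no)
host_ha_outcome() {
    agent_reachable=${1:-no}
    disk_activity=${2:-no}
    if [ "$agent_reachable" = "yes" ]; then
        echo "healthy"     # nothing to do
    elif [ "$disk_activity" = "yes" ]; then
        echo "degraded"    # VMs are still running, so no failover
    else
        echo "ha-event"    # host is treated as failed
    fi
}

# Management NIC pulled, storage on a separate NIC:
#   host_ha_outcome no yes
# Management and storage both unreachable:
#   host_ha_outcome no no
```

Under this reading, pulling only the management NIC on a host whose storage runs over its own NIC would land in the "degraded" branch, and the VMs would stay where they are.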


Kind regards,

Paul Angus

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-----Original Message-----
From: Parth Patel <pa...@gmail.com> 
Sent: 12 March 2018 17:35
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

>
> Hi Jon,
>
> As I said, in my case, making the host HA didn't work but by just 
> having a HA VM running on host and executing - (WARNING) "echo c > 
> /proc/sysrq-trigger" to simulate a kernel crash on host, the 
> management server registered it as down and started the VM on another 
> host. I know I've suggested this before but I insist you give this a 
> try. Also, you don't need to completely power off the machine manually 
> but just plugging out the network cable works fine. The cloudstack 
> agent after losing connection to management server auto reboots 
> because of KVM heartbeat check shell script mentioned by Rohit Yadav 
> to one of my earlier queries in other thread.
>
> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> Hi Paul
>
>
> Thanks for the response.
>
>
> I think I am not understanding how it was meant to work then. My 
> understanding was that the manager used ipmitool to just keep querying 
> the compute nodes as to their status so I assumed it didn't matter how 
> you shut the node down, once it was down the manager would get no 
> response and mark it as down (which it does).
>
>
> I am in testing mode so I think I will just go and pull the power and 
> see what happens :)
>
>
> Thanks
>
>
> Jon
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 12 March 2018 15:31
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
> Hi Jon,
>
> I think that what you guys are finding, is that a controlled host 
> shutdown, which will cause the agent to shutdown cleanly; Is not 
> considered an HA event. I wouldn't expect CloudStack to take any 
> action if you shut down a host, only if the host (agent) stops responding.
>
>
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>
>
>
>
> -----Original Message-----
> From: Jon Marshall <jm...@hotmail.co.uk>
> Sent: 12 March 2018 15:15
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> I have the same issue here and am not entirely sure what the behaviour
> should be.
>
>
> I have one manager node and 2 compute nodes running 4.11 with ipmi working
> correctly.
>
>
> From the UI under HA -
>
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
>
> although interestingly from the "Details" tab it shows -
>
>
> HA enabled No
>
>
> which I assume is a cosmetic issue ?
>
>
> On each compute node I have one HA enabled VM and one non HA enabled VM.
>
>
> I power off a compute node and the UI updates the host status and the VMs
> on that node stop responding but they never fail over to the other node.
>
>
> Couple of things I noticed -
>
>
> 1) as soon as i power off the compute node the HA state on the other node
> shows "Ineligible"
>
>
> 2) In the UI the instances all still show as green even though two of them
> are not available
>
>
> Any help much appreciated
>
>
>
>
> ________________________________
> From: victor <vi...@ihnetworks.com>
> Sent: 07 March 2018 17:01
> To: users@cloudstack.apache.org
> Subject: KVM HostHA
>
> Hello Guys,
>
> I have installed cloudstack 4.11. I have enabled HA for each hosts I have
> added. I have also added ipmi successfully (using ipmi driver).
> The hosts are showing like the following.
>
> =======
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
> ======
>
> Also the host is showing the following correctly
>
> Resource state --> Enabled
> State --> UP
> Power state --> On
>
> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> working. I have waited for half an hour. But nothing has happened. What
> will happen to the VM's in that host, if the host failed to back up.
> There isn't much from logs.
>
> Regards
> Victor
>

Re: KVM HostHA

Posted by Parth Patel <pa...@gmail.com>.
>
> Hi Jon,
>
> As I said, in my case enabling Host HA didn't work. However, with just an
> HA-enabled VM running on the host, executing (WARNING: this crashes the
> host) "echo c > /proc/sysrq-trigger" to simulate a kernel crash made the
> management server register the host as down and start the VM on another
> host. I know I've suggested this before, but I'd urge you to give it a try.
> Also, you don't need to power off the machine manually; just unplugging the
> network cable works fine. After losing its connection to the management
> server, the CloudStack agent reboots automatically because of the KVM
> heartbeat check shell script that Rohit Yadav mentioned in reply to one of
> my earlier queries in another thread.
>
> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jm...@hotmail.co.uk> wrote:
> Hi Paul
>
>
> Thanks for the response.
>
>
> I think I am not understanding how it was meant to work then. My
> understanding was that the manager used ipmitool to just keep querying the
> compute nodes as to their status so I assumed it didn't matter how you shut
> the node down, once it was down the manager would get no response and mark
> it as down (which it does).
>
>
> I am in testing mode so I think I will just go and pull the power and see
> what happens :)
>
>
> Thanks
>
>
> Jon
>
>
> ________________________________
> From: Paul Angus <pa...@shapeblue.com>
> Sent: 12 March 2018 15:31
> To: users@cloudstack.apache.org
> Subject: RE: KVM HostHA
> Hi Jon,
>
> I think that what you guys are finding, is that a controlled host
> shutdown, which will cause the agent to shutdown cleanly; Is not considered
> an HA event. I wouldn't expect CloudStack to take any action if you shut
> down a host, only if the host (agent) stops responding.
>
>
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>
> 53 Chandos Place, Covent Garden, London WC2N 4HSUK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Jon Marshall <jm...@hotmail.co.uk>
> Sent: 12 March 2018 15:15
> To: users@cloudstack.apache.org
> Subject: Re: KVM HostHA
>
> I have the same issue here and am not entirely sure what the behaviour
> should be.
>
>
> I have one manager node and 2 compute nodes running 4.11 with ipmi working
> correctly.
>
>
> From the UI under HA -
>
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
>
> although interestingly from the "Details" tab it shows -
>
>
> HA enabled No
>
>
> which I assume is a cosmetic issue ?
>
>
> On each compute node I have one HA enabled VM and one non HA enabled VM.
>
>
> I power off a compute node and the UI updates the host status and the VMs
> on that node stop responding but they never fail over to the other node.
>
>
> Couple of things I noticed -
>
>
> 1) as soon as i power off the compute node the HA state on the other node
> shows "Ineligible"
>
>
> 2) In the UI the instances all still show as green even though two of them
> are not available
>
>
> Any help much appreciated
>
>
>
>
> ________________________________
> From: victor <vi...@ihnetworks.com>
> Sent: 07 March 2018 17:01
> To: users@cloudstack.apache.org
> Subject: KVM HostHA
>
> Hello Guys,
>
> I have installed cloudstack 4.11. I have enabled HA for each hosts I have
> added. I have also added ipmi successfully (using ipmi driver).
> The hosts are showing like the following.
>
> =======
>
> HA Enabled Yes
> HA State Available
> HA Provider kvmhaprovider
>
> ======
>
> Also the host is showing the following correctly
>
> Resource state --> Enabled
> State --> UP
> Power state --> On
>
> So I have shutdown one of the hosts to see how the KVM hosts Ha is
> working. I have waited for half an hour. But nothing has happened. What
> will happen to the VM's in that host, if the host failed to back up.
> There isn't much from logs.
>
> Regards
> Victor
>

RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
I tested using:

echo c > /proc/sysrq-trigger

which will crash the host.
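
Since that command takes the host down immediately, a small guard around it can help avoid running it on the wrong machine. The wrapper below is only a sketch: the function name `crash_host`, the `--yes` flag and the optional path argument are made up for illustration, and it needs to be run as root to have any effect on the real trigger file.

```shell
#!/bin/sh
# Hypothetical wrapper around the crash test used in this thread.
#   $1 = "--yes" to really write to the trigger; anything else is a dry run
#   $2 = trigger path (defaults to /proc/sysrq-trigger)
crash_host() {
    confirm=${1:-}
    trigger=${2:-/proc/sysrq-trigger}
    if [ "$confirm" != "--yes" ]; then
        echo "dry run: would write 'c' to $trigger"
        return 0
    fi
    # On the real trigger file this crashes the kernel at once.
    echo c > "$trigger"
}

# Dry run (safe):        crash_host
# Real crash (as root):  crash_host --yes
```

Only use the real invocation on a test host that carries nothing but disposable HA-enabled VMs.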

Kind regards,

Paul Angus

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-----Original Message-----
From: Jon Marshall <jm...@hotmail.co.uk> 
Sent: 12 March 2018 15:53
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul


Thanks for the response.


I think I am not understanding how it was meant to work then.  My understanding was that the manager used ipmitool to just keep querying the compute nodes as to their status so I assumed it didn't matter how you shut the node down, once it was down the manager would get no response and mark it as down (which it does).


I am in testing mode so I think I will just go and pull the power and see what happens :)


Thanks


Jon


________________________________
From: Paul Angus <pa...@shapeblue.com>
Sent: 12 March 2018 15:31
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA
 Hi Jon,

I think what you are finding is that a controlled host shutdown, which causes the agent to shut down cleanly, is not considered an HA event. I wouldn't expect CloudStack to take any action if you shut down a host, only if the host (agent) stops responding.




Kind regards,

Paul Angus

paul.angus@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>
[http://www.shapeblue.com/wp-content/uploads/2017/06/logo.png]<http://www.shapeblue.com/>

Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
www.shapeblue.com
Rapid deployment framework for Apache CloudStack IaaS Clouds. CSForge is a framework developed by ShapeBlue to deliver the rapid deployment of a standardised ...



53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue




-----Original Message-----
From: Jon Marshall <jm...@hotmail.co.uk>
Sent: 12 March 2018 15:15
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

I have the same issue here and am not entirely sure what the behaviour should be.


I have one manager node and 2 compute nodes running 4.11 with ipmi working correctly.


From the UI under HA -


HA Enabled       Yes
HA State         Available
HA Provider      kvmhaprovider


although interestingly from the "Details" tab it shows -


HA enabled   No


which I assume is a cosmetic issue ?


On each compute node I have one HA enabled VM and one non HA enabled VM.


I power off a compute node and the UI updates the host status and the VMs on that node stop responding but they never fail over to the other node.


Couple of things I noticed -


1) as soon as i power off the compute node the HA state on the other node shows "Ineligible"


2) In the UI the instances all still show as green even though two of them are not available


Any help much appreciated




________________________________
From: victor <vi...@ihnetworks.com>
Sent: 07 March 2018 17:01
To: users@cloudstack.apache.org
Subject: KVM HostHA

Hello Guys,

I have installed cloudstack 4.11. I have enabled HA for each hosts I have added. I have also added ipmi successfully (using ipmi driver).
The hosts are showing like the following.

=======

HA Enabled       Yes
HA State         Available
HA Provider      kvmhaprovider

======

Also the host is showing the following correctly

Resource state --> Enabled
State --> UP
Power state --> On

So I have shutdown one of the hosts to see how the KVM hosts Ha is working.  I have waited for half an hour. But nothing has happened. What will happen to the VM's in that host, if the host failed to back up.
There isn't much from logs.

Regards
Victor

Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Paul


Thanks for the response.


I don't think I understand how it is meant to work, then. My understanding was that the manager uses ipmitool to keep querying the compute nodes for their status, so I assumed it didn't matter how you shut the node down: once it was down, the manager would get no response and mark it as down (which it does).


I am in testing mode so I think I will just go and pull the power and see what happens :)


Thanks


Jon



RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
Hi Jon,

I think what you are both finding is that a controlled host shutdown, which causes the agent to shut down cleanly, is not considered an HA event. I wouldn't expect CloudStack to take any action if you shut down a host, only if the host (agent) stops responding.




Kind regards,

Paul Angus


Re: KVM HostHA

Posted by Jon Marshall <jm...@hotmail.co.uk>.
I have the same issue here and am not entirely sure what the behaviour should be.


I have one manager node and 2 compute nodes running 4.11 with ipmi working correctly.


From the UI under HA -


HA Enabled       Yes
HA State         Available
HA Provider      kvmhaprovider


although interestingly from the "Details" tab it shows -


HA enabled   No


which I assume is a cosmetic issue?


On each compute node I have one HA enabled VM and one non HA enabled VM.


When I power off a compute node, the UI updates the host status and the VMs on that node stop responding, but they never fail over to the other node.


A couple of things I noticed:


1) As soon as I power off the compute node, the HA state on the other node shows "Ineligible".


2) In the UI, the instances all still show as green even though two of them are not available.


Any help much appreciated.





Re: KVM HostHA

Posted by victor <vi...@ihnetworks.com>.
Hello Paul,

I am using the default values set by CloudStack and haven't changed them.

Regards
Victor
  On 03/07/2018 11:37 PM, Paul Angus wrote:
> Hi Victor,
>
> What parameters do you have for:
>
> kvm.ha.activity.check.max.attempts
> kvm.ha.activity.check.interval
> kvm.ha.activity.check.timeout
> kvm.ha.health.check.timeout
> kvm.ha.degraded.max.period
>
> the logs should show entries relating to these, BUT.... it's possible that as you performed a clean shutdown, the agent could have sent a shutdown ack to the management server, so the management server may no longer be polling.  I'm not sure about that scenario.
>
> I recently used:
> echo c > /proc/sysrq-trigger
>
> as suggested by Nux, to simulate a host crash.
>



RE: KVM HostHA

Posted by Paul Angus <pa...@shapeblue.com>.
Hi Victor,

What parameters do you have for:

kvm.ha.activity.check.max.attempts
kvm.ha.activity.check.interval
kvm.ha.activity.check.timeout
kvm.ha.health.check.timeout
kvm.ha.degraded.max.period

The logs should show entries relating to these, but it's possible that, because you performed a clean shutdown, the agent could have sent a shutdown ack to the management server, so the management server may no longer be polling. I'm not sure about that scenario.

I recently used:
echo c > /proc/sysrq-trigger

as suggested by Nux, to simulate a host crash.
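
To read back the current values of the five settings listed above, something like the following sketch could be used. It assumes a configured CloudMonkey CLI is available as `cmk` and that `list configurations name=...` is accepted by your version; the `CMK` variable and the `show_kvm_ha_settings` function are illustrative names, not part of CloudStack.

```shell
#!/bin/sh
# Sketch: look up the HostHA settings named above on the management server.
# Assumes a configured CloudMonkey CLI; CMK and show_kvm_ha_settings are
# illustrative names, not part of CloudStack.

CMK="${CMK:-cmk}"

show_kvm_ha_settings() {
    for name in \
        kvm.ha.activity.check.max.attempts \
        kvm.ha.activity.check.interval \
        kvm.ha.activity.check.timeout \
        kvm.ha.health.check.timeout \
        kvm.ha.degraded.max.period
    do
        # One listConfigurations call per setting, filtered by name.
        $CMK list configurations name="$name"
    done
}
```

Calling `show_kvm_ha_settings` runs the five lookups. Setting `CMK=echo` first prints the commands instead of running them, which is a cheap way to check them against your CLI version.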
