Posted to users@cloudstack.apache.org by Cloud Udupi <ud...@gmail.com> on 2020/01/28 12:01:33 UTC

HA doesn't work

Hi,

I have two different ACS 4.13 setups with HA enabled on both. Both
setups are configured the same way.

*Setup 1:* Having Dell R230 servers. HA works fine.
*Setup 2:* Having Dell R440 servers. HA doesn't work.

Both setups have everything configured that HA requires.

When a host with running VMs goes down, it stays in the Disconnected
state. Ideally the host should go into maintenance mode and the VMs should
migrate to another available host.

In my case CloudStack still shows the VMs as running on the dead host.
*This is the case only with 2nd setup.*

[image: Screenshot 2020-01-27 at 8.31.28 PM.png]

Events section from ACS dashboard,
[image: Screenshot 2020-01-27 at 8.31.35 PM.png]
Am I missing something?
Please help.

Regards,
Mark

Re: HA doesn't work

Posted by Cloud Udupi <ud...@gmail.com>.
Hi,

We have performed the test again and got these agent logs,
*Note: *Out-of-band Management is configured. We are able to power on or
off the host using Out-of-band options provided in the ACS dashboard.

*2020-01-29 00:29:14,835 DEBUG [o.a.c.o.OutOfBandManagementServiceImpl]
(pool-5-thread-8:ctx-a17ea784) (logid:884f7c26) Out-of-band Management
action (OFF) on host (147c3685-7aaf-407f-a463-cafb786c0c68) failed with
error: Set Chassis Power Control to Down/Off failed: Command not supported
in present state *


*2020-01-29 00:29:14,841 DEBUG [c.c.a.t.Request]
(AgentManager-Handler-4:null) (logid:) Seq 40-5418674776656839806:
Processing:  { Ans: , MgmtId: 279278805473977, via: 40, Ver: v1, Flags:
10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"Heart is
beating...","wait":0}}] }*

*2020-01-29 00:29:14,841 DEBUG [c.c.a.t.Request] (pool-2-thread-46:null)
(logid:8af3b1b0) Seq 40-5418674776656839806: Received:  { Ans: , MgmtId:
279278805473977, via: 40(node5z5.digi.com <http://node5z5.digi.com>), Ver:
v1, Flags: 10, { Answer } }*

*2020-01-29 00:29:14,841 DEBUG [c.c.a.m.AgentManagerImpl]
(pool-2-thread-46:null) (logid:8af3b1b0) Details from executing class
com.cloud.agent.api.CheckOnHostCommand: Heart is beating...*

*2020-01-29 00:29:14,844 WARN  [o.a.c.k.h.KVMHAProvider]
(pool-5-thread-8:ctx-a17ea784) (logid:884f7c26) OOBM service is not
configured or enabled for this host node11z5.digi.com
<http://node11z5.digi.com> error is Out-of-band Management action (OFF) on
host (147c3685-7aaf-407f-a463-cafb786c0c68) failed with error: Set Chassis
Power Control to Down/Off failed: Command not supported in present state*


*2020-01-29 00:29:14,844 WARN  [o.a.c.h.t.BaseHATask]
(pool-5-thread-15:null) (logid:a0082d5e) Exception occurred while running
FenceTask on a resource:
org.apache.cloudstack.ha.provider.HAFenceException: OOBM service is not
configured or enabled for this host node11z5.digi.com
<http://node11z5.digi.com>*

*org.apache.cloudstack.ha.provider.HAFenceException: OOBM service is not
configured or enabled for this host node11z5.digi.com
<http://node11z5.digi.com>*

* at
org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:99)*

* at
org.apache.cloudstack.kvm.ha.KVMHAProvider.fence(KVMHAProvider.java:42)*

* at
org.apache.cloudstack.ha.task.FenceTask.performAction(FenceTask.java:42)*

* at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:86)*

* at org.apache.cloudstack.ha.task.BaseHATask$1.call(BaseHATask.java:83)*

* at java.util.concurrent.FutureTask.run(FutureTask.java:266)*

* at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*

* at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*

* at java.lang.Thread.run(Thread.java:748)*

*Caused by: com.cloud.utils.exception.CloudRuntimeException: Out-of-band
Management action (OFF) on host (147c3685-7aaf-407f-a463-cafb786c0c68)
failed with error: Set Chassis Power Control to Down/Off failed: Command
not supported in present state*


* at
org.apache.cloudstack.outofbandmanagement.OutOfBandManagementServiceImpl.executePowerOperation(OutOfBandManagementServiceImpl.java:423)*

* at sun.reflect.GeneratedMethodAccessor320.invoke(Unknown Source)*

* at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)*


* ... 21 more*
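
One possible reading of the trace above: the fence step asks the BMC to
power the host off, and "Command not supported in present state" is an
IPMI completion code that a BMC can return when the chassis is already
off. Whether the host was in fact already off during this test is an
assumption; the log does not show the power state. The sketch below
illustrates the general idea of an idempotent fence that checks power
status first and treats "already off" as success. It is not CloudStack's
code. The function names and the runner parameter are hypothetical; only
the ipmitool subcommands (chassis power status, chassis power off) are
real.

```python
import subprocess
from typing import Callable, List

# A runner takes ipmitool arguments and returns the tool's text output.
Runner = Callable[[List[str]], str]


def run_ipmitool(args: List[str]) -> str:
    """Run ipmitool with the given arguments and return stdout plus stderr."""
    proc = subprocess.run(["ipmitool"] + args, capture_output=True, text=True)
    return (proc.stdout + proc.stderr).strip()


def is_powered_off(bmc_args: List[str], run: Runner) -> bool:
    """True if 'chassis power status' reports 'Chassis Power is off'."""
    status = run(bmc_args + ["chassis", "power", "status"])
    return status.strip().lower().endswith("is off")


def fence_host(bmc_args: List[str], run: Runner = run_ipmitool) -> bool:
    """Return True once the chassis is confirmed off.

    If the chassis is already off, no power-off command is sent, so a BMC
    that rejects 'power off' in that state cannot fail the fence.
    """
    if is_powered_off(bmc_args, run):
        return True
    run(bmc_args + ["chassis", "power", "off"])
    # A real BMC may need a moment before the new state is visible;
    # a production version would poll with a timeout here.
    return is_powered_off(bmc_args, run)
```

Running the two status and power-off commands by hand against an R230 BMC
and an R440 BMC while the host is powered off, for example with placeholder
arguments such as ["-I", "lanplus", "-H", "<bmc-ip>", "-U", "<user>",
"-P", "<password>"], would show whether the two models answer differently.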

Regards,
Mark.
On Tue, Jan 28, 2020 at 5:46 PM Cloud Udupi <ud...@gmail.com> wrote:

> Hi Vivek,
>
> I'm using KVM.
> I'm following this guide
> http://docs.cloudstack.apache.org/en/latest/quickinstallationguide/qig.html#quick-installation-guide
>
> This works fine with R230 servers but not with R440 servers. The IPMI
> versions are the same on both setups.
> ACS dashboard shows that "HA fencing is performed" but the dead VMs are
> still shown as running on the disconnected host.
>
> Regards,
> Mark

Re: HA doesn't work

Posted by Cloud Udupi <ud...@gmail.com>.
Hi Vivek,

I'm using KVM.
I'm following this guide
http://docs.cloudstack.apache.org/en/latest/quickinstallationguide/qig.html#quick-installation-guide

This works fine with R230 servers but not with R440 servers. The IPMI
versions are the same on both setups.
ACS dashboard shows that "HA fencing is performed" but the dead VMs are
still shown as running on the disconnected host.

Regards,
Mark

On Tue, Jan 28, 2020 at 5:33 PM Vivek Kumar <vi...@indiqus.com> wrote:

> Hello Mark,
>
> What hypervisor are you using?
>
> Vivek Kumar
> Manager - Cloud & DevOps
> IndiQus Technologies
> 24*7  O +91 11 4055 1411  |   M +91 7503460090
> www.indiqus.com <http://indiqus.com/>

Re: HA doesn't work

Posted by Vivek Kumar <vi...@indiqus.com>.
Hello Mark,

What hypervisor are you using?

Vivek Kumar
Manager - Cloud & DevOps 
IndiQus Technologies
24*7  O +91 11 4055 1411  |   M +91 7503460090 
www.indiqus.com <http://indiqus.com/>

This message is intended only for the use of the individual or entity to which it is addressed and may contain information that is confidential and/or privileged. If you are not the intended recipient please delete the original message and any copy of it from your computer system. You are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited unless proper authorization has been obtained for such action. If you have received this communication in error, please notify the sender immediately. Although IndiQus attempts to sweep e-mail and attachments for viruses, it does not guarantee that both are virus-free and accepts no liability for any damage sustained as a result of viruses.
