Posted to users@cloudstack.apache.org by Kirk Jantzer <ki...@gmail.com> on 2013/07/24 20:07:47 UTC

HA in CS 4.1

Can someone "explain like I'm 5" how HA in CS should work? I have 4.1 set up
with XCP hosts. To simulate a host failure, I hard powered off a host
through iDRAC, and CS doesn't seem to know about it at all -- the instances
still show as running, but I cannot connect to them, and the host shows as
up in the infrastructure section.

The instances, including the SSVM, that were on there are "gone" and have
not been restarted on any of the other hosts.

-- 
Regards,

Kirk Jantzer
http://about.met/kirkjantzer

RE: HA in CS 4.1

Posted by Geoff Higginbottom <ge...@shapeblue.com>.
Hi Kirk,

alert.wait definitely works for XenServer Hosts as we have used this in various acceptance test sessions.

I know this does not help you, but at least you know the setting is good, and something else must be causing the failure to HA.

Regards

Geoff Higginbottom

D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581

geoff.higginbottom@shapeblue.com

-----Original Message-----
From: Kirk Jantzer [mailto:kirk.jantzer@gmail.com]
Sent: 25 July 2013 19:04
To: Cloudstack users mailing list
Subject: Re: HA in CS 4.1

I set alert.wait to 60sec, restarted, shut down host, and nothing :-(


On Wed, Jul 24, 2013 at 4:22 PM, Paul Angus <pa...@shapeblue.com>wrote:

> Hi Kirk,
>
> I've seen this behaviour with KVM (I filed bug CloudStack-3535) but
> we've found XenServer works fine.
>
> There is a global setting (alert.wait) which is blank by default. This
> is the number of seconds CloudStack waits after it loses communication
> with a host before doing anything.  Blank somehow equals 30 mins in this case.
>
>
> Regards,
>
> Paul Angus
> S: +44 20 3603 0540 | M: +447711418784 | T: CloudyAngus
> paul.angus@shapeblue.com
>
> -----Original Message-----
> From: Kirk Jantzer [mailto:kirk.jantzer@gmail.com]
> Sent: 24 July 2013 19:11
> To: Cloudstack users mailing list
> Subject: Re: HA in CS 4.1
>
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.AbstractInvestigatorImpl]
> (AgentTaskPool-16:null) host (<HOSTIP>) cannot be pinged, returning
> null ('I don't know')
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.UserVmDomRInvestigator]
> (AgentTaskPool-16:null) could not reach agent, could not reach agent's
> host, returning that we don't have enough information
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
> (AgentTaskPool-16:null) null unable to determine the state of the host.
>  Moving on.
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
> (AgentTaskPool-16:null) null unable to determine the state of the host.
>  Moving on.
> 2013-07-24 10:08:50,973 WARN  [agent.manager.AgentManagerImpl]
> (AgentTaskPool-16:null) Agent state cannot be determined, do nothing
>
> --
> Regards,
>
> Kirk Jantzer
> c: (678) 561-5475
> http://about.met/kirkjantzer
> This email and any attachments to it may be confidential and are
> intended solely for the use of the individual to whom it is addressed.
> Any views or opinions expressed are solely those of the author and do
> not necessarily represent those of Shape Blue Ltd or related
> companies. If you are not the intended recipient of this email, you
> must neither take any action based upon its contents, nor copy or show
> it to anyone. Please contact the sender if you believe you have
> received this email in error. Shape Blue Ltd is a company incorporated
> in England & Wales. ShapeBlue Services India LLP is operated under
> license from Shape Blue Ltd. ShapeBlue is a registered trademark.
>
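For readers trying to follow the log excerpt quoted above, here is a minimal sketch of the decision flow those messages suggest: each investigator either gives a definite answer or returns "I don't know", and when nobody can tell, the manager does nothing. This is an illustrative model only, not CloudStack's actual code; the names ping_investigator, determine_host_state and handle_disconnect are hypothetical.

```python
from typing import Callable, Iterable, Optional

# An investigator answers True (host is up), False (host is down),
# or None ("I don't know"), mirroring the wording in the log lines.
Investigator = Callable[[str], Optional[bool]]


def ping_investigator(reachable_hosts: set) -> Investigator:
    """Hypothetical investigator: it can confirm a host is up, never that it is down."""
    def investigate(host: str) -> Optional[bool]:
        return True if host in reachable_hosts else None
    return investigate


def determine_host_state(host: str, investigators: Iterable[Investigator]) -> Optional[bool]:
    """Return the first definite answer, or None if no investigator can tell."""
    for investigate in investigators:
        answer = investigate(host)
        if answer is not None:
            return answer
    return None


def handle_disconnect(host: str, investigators: Iterable[Investigator]) -> str:
    """Map the combined answer to an action (labels are illustrative)."""
    state = determine_host_state(host, investigators)
    if state is None:
        # Corresponds to "Agent state cannot be determined, do nothing".
        return "do nothing"
    return "host is up" if state else "start HA"
```

In this model a hard power-off leaves every ping unanswered, so an investigator that can only confirm "up" keeps returning None and no restart is ever scheduled, which matches the symptom described in the thread.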



--
Regards,

Kirk Jantzer
c: (678) 561-5475
http://about.met/kirkjantzer

Re: HA in CS 4.1

Posted by Kirk Jantzer <ki...@gmail.com>.
Does HA work in 4.0.2?


On Thu, Jul 25, 2013 at 2:23 PM, Chip Childers <ch...@sungard.com>wrote:

> On Thu, Jul 25, 2013 at 2:20 PM, Paul Angus <paul.angus@shapeblue.com
> >wrote:
>
> > Sounds like https://issues.apache.org/jira/browse/CLOUDSTACK-3535 then.
> >
> > It's been upgraded to a blocker now so 4.1.1 and 4.2 can't be released
> > until it's fixed.  I hope you can get involved in testing the fix.
> >
> >
> Well, we're not going to hold off on 4.1.1 honestly.  And I raised it to
> get further input on the issue, specifically thinking about how to get some
> thoughts pulled together for scope and possible release timing.



-- 
Regards,

Kirk Jantzer
c: (678) 561-5475
http://about.met/kirkjantzer

Re: HA in CS 4.1

Posted by Bryan Whitehead <dr...@megahappy.net>.
There should be a big red all caps message on the download page and in
the docs explaining that HA is completely broken and nonfunctional for
KVM users on 4.1.0.

There is no way I'd have updated to 4.1.0 if I had caught this
earlier. Once I get the time, I'll likely downgrade back to 4.0.2
(assuming it works in that).

On Thu, Jul 25, 2013 at 11:36 AM, Paul Angus <pa...@shapeblue.com> wrote:
> Noted. Thanks Chip. It has certainly seemed to have become issue of the day.

RE: HA in CS 4.1

Posted by Paul Angus <pa...@shapeblue.com>.
Noted. Thanks Chip. It certainly seems to have become the issue of the day.


Regards,

Paul Angus
S: +44 20 3603 0540 | M: +447711418784 | T: CloudyAngus
paul.angus@shapeblue.com

-----Original Message-----
From: Chip Childers [mailto:chip.childers@sungard.com]
Sent: 25 July 2013 19:23
To: <us...@cloudstack.apache.org>
Subject: Re: HA in CS 4.1

On Thu, Jul 25, 2013 at 2:20 PM, Paul Angus <pa...@shapeblue.com>wrote:

> Sounds like https://issues.apache.org/jira/browse/CLOUDSTACK-3535 then.
>
> It's been upgraded to a blocker now so 4.1.1 and 4.2 can't be released
> until it's fixed.  I hope you can get involved in testing the fix.
>
>
Well, we're not going to hold off on 4.1.1 honestly.  And I raised it to get further input on the issue, specifically thinking about how to get some thoughts pulled together for scope and possible release timing.




Re: HA in CS 4.1

Posted by Chip Childers <ch...@sungard.com>.
On Thu, Jul 25, 2013 at 2:20 PM, Paul Angus <pa...@shapeblue.com>wrote:

> Sounds like https://issues.apache.org/jira/browse/CLOUDSTACK-3535 then.
>
> It's been upgraded to a blocker now so 4.1.1 and 4.2 can't be released
> until it's fixed.  I hope you can get involved in testing the fix.
>
>
Well, we're not going to hold off on 4.1.1 honestly.  And I raised it to
get further input on the issue, specifically thinking about how to get some
thoughts pulled together for scope and possible release timing.




RE: HA in CS 4.1

Posted by Paul Angus <pa...@shapeblue.com>.
Sounds like https://issues.apache.org/jira/browse/CLOUDSTACK-3535 then.

It's been upgraded to a blocker now so 4.1.1 and 4.2 can't be released until it's fixed.  I hope you can get involved in testing the fix.

Regards,

Paul Angus
S: +44 20 3603 0540 | M: +447711418784 | T: CloudyAngus
paul.angus@shapeblue.com

-----Original Message-----
From: Kirk Jantzer [mailto:kirk.jantzer@gmail.com]
Sent: 25 July 2013 19:04
To: Cloudstack users mailing list
Subject: Re: HA in CS 4.1

I set alert.wait to 60sec, restarted, shut down host, and nothing :-(



Re: HA in CS 4.1

Posted by Kirk Jantzer <ki...@gmail.com>.
I set alert.wait to 60sec, restarted, shut down host, and nothing :-(


On Wed, Jul 24, 2013 at 4:22 PM, Paul Angus <pa...@shapeblue.com>wrote:

> Hi Kirk,
>
> I've seen this behaviour with KVM (I filed bug CloudStack-3535) but we've
> found XenServer works fine.
>
> There is a global setting (alert.wait) which is blank by default. This is
> the number of seconds CloudStack waits after it loses communication with a
> host before doing anything.  Blank somehow equals 30 mins in this case.
>
>
> Regards,
>
> Paul Angus
> S: +44 20 3603 0540 | M: +447711418784 | T: CloudyAngus
> paul.angus@shapeblue.com
>
> -----Original Message-----
> From: Kirk Jantzer [mailto:kirk.jantzer@gmail.com]
> Sent: 24 July 2013 19:11
> To: Cloudstack users mailing list
> Subject: Re: HA in CS 4.1
>
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.AbstractInvestigatorImpl]
> (AgentTaskPool-16:null) host (<HOSTIP>) cannot be pinged, returning null
> ('I don't know')
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.UserVmDomRInvestigator]
> (AgentTaskPool-16:null) could not reach agent, could not reach agent's
> host, returning that we don't have enough information
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
> (AgentTaskPool-16:null) null unable to determine the state of the host.
>  Moving on.
> 2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
> (AgentTaskPool-16:null) null unable to determine the state of the host.
>  Moving on.
> 2013-07-24 10:08:50,973 WARN  [agent.manager.AgentManagerImpl]
> (AgentTaskPool-16:null) Agent state cannot be determined, do nothing
>
>
>
> On Wed, Jul 24, 2013 at 2:07 PM, Kirk Jantzer <kirk.jantzer@gmail.com
> >wrote:
>
> > Can someone "explain like I'm 5" how HA in CS should work? I have 4.1
> > setup with XCP hosts. To simulate a host failure, I've hard powered
> > off a host through iDRAC and CS doesn't seem to know about it, at all
> > -- the instances still show as running, but I cannot connect to them,
> > and the host shows as up in the infrastructure section.
> >
> > The instances, including the SSVM, that were on there are "gone" and
> > have not been restarted on any of the other hosts.
> >
> > --
> > Regards,
> >
> > Kirk Jantzer
> > http://about.met/kirkjantzer
> >
>
>
>
> --
> Regards,
>
> Kirk Jantzer
> c: (678) 561-5475
> http://about.met/kirkjantzer
>



-- 
Regards,

Kirk Jantzer
c: (678) 561-5475
http://about.met/kirkjantzer

RE: HA in CS 4.1

Posted by Paul Angus <pa...@shapeblue.com>.
Hi Kirk,

I've seen this behaviour with KVM (I filed bug CloudStack-3535) but we've found XenServer works fine.

There is a global setting (alert.wait) which is blank by default. This is the number of seconds CloudStack waits after it loses communication with a host before doing anything.  Blank somehow equals 30 mins in this case.
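
For anyone who would rather script the change than use the Global Settings page, here is a minimal sketch of building a signed updateConfiguration request for alert.wait. It assumes the usual CloudStack request signing (sorted parameters, lower-cased, HMAC-SHA1, Base64); the endpoint URL and both keys are placeholders, and the helper names are made up for illustration. It only builds the URL and sends nothing. A management server restart is probably still needed before the new value takes effect.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_query(params, secret_key):
    """Build a signed query string (assumed CloudStack signing scheme)."""
    # Sort by field name and URL-encode every value.
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    # Signature: HMAC-SHA1 over the lower-cased query string, Base64-encoded.
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    return f"{query}&signature={quote(signature, safe='')}"


def alert_wait_url(endpoint, api_key, secret_key, seconds):
    """Return a URL that would set the alert.wait global setting (hypothetical helper)."""
    params = {
        "command": "updateConfiguration",
        "name": "alert.wait",
        "value": seconds,
        "response": "json",
        "apikey": api_key,
    }
    return f"{endpoint}?{signed_query(params, secret_key)}"


if __name__ == "__main__":
    # Placeholder endpoint and credentials; substitute your own.
    print(alert_wait_url("http://mgmt.example.com:8080/client/api",
                         "PLACEHOLDER_API_KEY", "PLACEHOLDER_SECRET_KEY", 60))
```

The printed URL can be fetched with any HTTP client; the value 60 matches what Kirk tried in this thread.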


Regards,

Paul Angus
S: +44 20 3603 0540 | M: +447711418784 | T: CloudyAngus
paul.angus@shapeblue.com

-----Original Message-----
From: Kirk Jantzer [mailto:kirk.jantzer@gmail.com]
Sent: 24 July 2013 19:11
To: Cloudstack users mailing list
Subject: Re: HA in CS 4.1

2013-07-24 10:08:50,973 DEBUG [cloud.ha.AbstractInvestigatorImpl] (AgentTaskPool-16:null) host (<HOSTIP>) cannot be pinged, returning null ('I don't know')
2013-07-24 10:08:50,973 DEBUG [cloud.ha.UserVmDomRInvestigator] (AgentTaskPool-16:null) could not reach agent, could not reach agent's host, returning that we don't have enough information
2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-16:null) null unable to determine the state of the host.  Moving on.
2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-16:null) null unable to determine the state of the host.  Moving on.
2013-07-24 10:08:50,973 WARN  [agent.manager.AgentManagerImpl] (AgentTaskPool-16:null) Agent state cannot be determined, do nothing



On Wed, Jul 24, 2013 at 2:07 PM, Kirk Jantzer <ki...@gmail.com>wrote:

> Can someone "explain like I'm 5" how HA in CS should work? I have 4.1
> setup with XCP hosts. To simulate a host failure, I've hard powered
> off a host through iDRAC and CS doesn't seem to know about it, at all
> -- the instances still show as running, but I cannot connect to them,
> and the host shows as up in the infrastructure section.
>
> The instances, including the SSVM, that were on there are "gone" and
> have not been restarted on any of the other hosts.
>
> --
> Regards,
>
> Kirk Jantzer
> http://about.met/kirkjantzer
>



--
Regards,

Kirk Jantzer
c: (678) 561-5475
http://about.met/kirkjantzer

Re: HA in CS 4.1

Posted by Kirk Jantzer <ki...@gmail.com>.
2013-07-24 10:08:50,973 DEBUG [cloud.ha.AbstractInvestigatorImpl] (AgentTaskPool-16:null) host (<HOSTIP>) cannot be pinged, returning null ('I don't know')
2013-07-24 10:08:50,973 DEBUG [cloud.ha.UserVmDomRInvestigator] (AgentTaskPool-16:null) could not reach agent, could not reach agent's host, returning that we don't have enough information
2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-16:null) null unable to determine the state of the host.  Moving on.
2013-07-24 10:08:50,973 DEBUG [cloud.ha.HighAvailabilityManagerImpl] (AgentTaskPool-16:null) null unable to determine the state of the host.  Moving on.
2013-07-24 10:08:50,973 WARN  [agent.manager.AgentManagerImpl] (AgentTaskPool-16:null) Agent state cannot be determined, do nothing
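
Reading the log above, each investigator seems to give a three-way answer: up, down, or "I don't know" (null), and when every answer is null the manager does nothing. The sketch below only illustrates that decision pattern as it appears in these messages. It is not CloudStack's code, and every name and action string in it is made up for the example.

```python
from typing import Callable, Iterable, Optional

# An investigator answers True (host is up), False (host is down),
# or None ("I don't know"), mirroring the wording in the log.
Investigator = Callable[[str], Optional[bool]]


def investigate(host: str, investigators: Iterable[Investigator]) -> Optional[bool]:
    """Return the first definite answer, or None if no investigator can tell."""
    for check in investigators:
        answer = check(host)
        if answer is not None:
            return answer
    return None


def next_action(host: str, investigators: Iterable[Investigator]) -> str:
    """Map the investigation result to an illustrative action label."""
    state = investigate(host, investigators)
    if state is None:
        # Matches "Agent state cannot be determined, do nothing".
        return "do nothing"
    if state:
        return "host up: no HA needed"
    return "host down: start HA"


if __name__ == "__main__":
    ping_unknown = lambda host: None  # e.g. the host cannot be pinged
    print(next_action("<HOSTIP>", [ping_unknown, ping_unknown]))
```

If this reading is right, a hard power-off where nothing can positively confirm the host is down would leave the instances marked as running, which is the symptom described.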



On Wed, Jul 24, 2013 at 2:07 PM, Kirk Jantzer <ki...@gmail.com>wrote:

> Can someone "explain like I'm 5" how HA in CS should work? I have 4.1
> setup with XCP hosts. To simulate a host failure, I've hard powered off a
> host through iDRAC and CS doesn't seem to know about it, at all -- the
> instances still show as running, but I cannot connect to them, and the host
> shows as up in the infrastructure section.
>
> The instances, including the SSVM, that were on there are "gone" and have
> not been restarted on any of the other hosts.
>
> --
> Regards,
>
> Kirk Jantzer
> http://about.met/kirkjantzer
>



-- 
Regards,

Kirk Jantzer
c: (678) 561-5475
http://about.met/kirkjantzer

Re: HA in CS 4.1

Posted by Salvatore Sciacco <sc...@iperweb.com>.
Done :-)
On 29 Jul 2013 23:10, "Bryan Whitehead" <dr...@megahappy.net> wrote:

> Salvatore, Please go vote for and add details to this bug:
> https://issues.apache.org/jira/browse/CLOUDSTACK-3535
>
> -Bryan
>
> On Mon, Jul 29, 2013 at 1:30 PM, Salvatore Sciacco <sc...@iperweb.com>
> wrote:
> > I've powered off a host of a KVM cluster to simulate the server failure and
> > I'm experiencing the "Agent state cannot be determined, do nothing" loop.
> >
> > How can I tell the manager that the host has died and to start the HA
> > procedure? Is there some field in the DB I can update?
> >
> >
> >
> >
> >
> > 2013/7/29 Kirk Jantzer <ki...@gmail.com>
> >
> >> Great information, thanks so much for sharing!!
> >>
> >>
> >> On Mon, Jul 29, 2013 at 4:28 AM, Ryan Lei <ry...@cht.com.tw> wrote:
> >>
> >> > FYR, I have done some similar tests several weeks ago to test the host HA
> >> > functionality.
> >> >
> >> > CS 4.0.2 + XenServer 6.0.2: Works as expected. HA-enabled VMs (including
> >> > System VMs) were automatically restarted on HA-dedicated hosts.
> >> > CS 4.1.0 + XCP 1.6: No HA thing happened at all much like what you
> >> > described. The states of VMs and hosts were just like before, but
> >> > inaccessible. CloudStack just couldn't seem to detect the host down!
> >> > CS 4.1.1 (git) + XenServer 6.1: Works as expected. Just like the first
> >> > case.
> >> >
> >> > I didn't have time to try other combinations, and the logs are gone, but I
> >> > guess the problem more likely has to do with XCP support than CS 4.1.0.
> >> >
> >> > -------------------------------------------------------------------------------------------
> >> > Yu-Heng (Ryan) Lei, Associate Researcher
> >> > Chunghwa Telecom Laboratories / Cloud Computing Laboratory
> >> > ryanlei@cht.com.tw
> >> > or
> >> > ryanlei750328@gmail.com
> >> >
> >> >
> >> >
> >> > On Thu, Jul 25, 2013 at 2:07 AM, Kirk Jantzer <kirk.jantzer@gmail.com
> >> > >wrote:
> >> >
> >> > > Can someone "explain like I'm 5" how HA in CS should work? I have 4.1 setup
> >> > > with XCP hosts. To simulate a host failure, I've hard powered off a host
> >> > > through iDRAC and CS doesn't seem to know about it, at all -- the instances
> >> > > still show as running, but I cannot connect to them, and the host shows as
> >> > > up in the infrastructure section.
> >> > >
> >> > > The instances, including the SSVM, that were on there are "gone" and have
> >> > > not been restarted on any of the other hosts.
> >> > >
> >> > > --
> >> > > Regards,
> >> > >
> >> > > Kirk Jantzer
> >> > > http://about.met/kirkjantzer
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> Regards,
> >>
> >> Kirk Jantzer
> >> c: (678) 561-5475
> >> http://about.met/kirkjantzer
> >>
>

Re: HA in CS 4.1

Posted by Bryan Whitehead <dr...@megahappy.net>.
Salvatore, Please go vote for and add details to this bug:
https://issues.apache.org/jira/browse/CLOUDSTACK-3535

-Bryan

On Mon, Jul 29, 2013 at 1:30 PM, Salvatore Sciacco <sc...@iperweb.com> wrote:
> I've powered off a host of a KVM cluster to simulate the server failure and
> I'm experiencing the "Agent state cannot be determined, do nothing" loop.
>
> How can I tell the manager that the host has died and to start the HA
> procedure? Is there some field in the DB I can update?
>
>
>
>
>
> 2013/7/29 Kirk Jantzer <ki...@gmail.com>
>
>> Great information, thanks so much for sharing!!
>>
>>
>> On Mon, Jul 29, 2013 at 4:28 AM, Ryan Lei <ry...@cht.com.tw> wrote:
>>
>> > FYR, I have done some similar tests several weeks ago to test the host HA
>> > functionality.
>> >
>> > CS 4.0.2 + XenServer 6.0.2: Works as expected. HA-enabled VMs (including
>> > System VMs) were automatically restarted on HA-dedicated hosts.
>> > CS 4.1.0 + XCP 1.6: No HA thing happened at all much like what you
>> > described. The states of VMs and hosts were just like before, but
>> > inaccessible. CloudStack just couldn't seem to detect the host down!
>> > CS 4.1.1 (git) + XenServer 6.1: Works as expected. Just like the first
>> > case.
>> >
>> > I didn't have time to try other combinations, and the logs are gone, but I
>> > guess the problem more likely has to do with XCP support than CS 4.1.0.
>> >
>> > -------------------------------------------------------------------------------------------
>> > Yu-Heng (Ryan) Lei, Associate Researcher
>> > Chunghwa Telecom Laboratories / Cloud Computing Laboratory
>> > ryanlei@cht.com.tw
>> > or
>> > ryanlei750328@gmail.com
>> >
>> >
>> >
>> > On Thu, Jul 25, 2013 at 2:07 AM, Kirk Jantzer <kirk.jantzer@gmail.com
>> > >wrote:
>> >
>> > > Can someone "explain like I'm 5" how HA in CS should work? I have 4.1 setup
>> > > with XCP hosts. To simulate a host failure, I've hard powered off a host
>> > > through iDRAC and CS doesn't seem to know about it, at all -- the instances
>> > > still show as running, but I cannot connect to them, and the host shows as
>> > > up in the infrastructure section.
>> > >
>> > > The instances, including the SSVM, that were on there are "gone" and have
>> > > not been restarted on any of the other hosts.
>> > >
>> > > --
>> > > Regards,
>> > >
>> > > Kirk Jantzer
>> > > http://about.met/kirkjantzer
>> > >
>> >
>>
>>
>>
>> --
>> Regards,
>>
>> Kirk Jantzer
>> c: (678) 561-5475
>> http://about.met/kirkjantzer
>>

Re: HA in CS 4.1

Posted by Salvatore Sciacco <sc...@iperweb.com>.
I've powered off a host of a KVM cluster to simulate the server failure and
I'm experiencing the "Agent state cannot be determined, do nothing" loop.

How can I tell the manager that the host has died and to start the HA
procedure? Is there some field in the DB I can update?
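
To confirm that a management server really is stuck in the "Agent state cannot be determined, do nothing" loop, a small script can count those entries in the log. This is a rough sketch that assumes the log line layout pasted earlier in this thread (timestamp, level, [logger], message) and tolerates entries wrapped over several lines; the function names are made up for illustration.

```python
import re

# Start of an entry, e.g. "2013-07-24 10:08:50,973 WARN  [agent.manager.AgentManagerImpl] ..."
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+\[([^\]]+)\]\s*(.*)$"
)


def parse_entries(lines):
    """Group raw log lines into entries, folding continuation lines into the message."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            time, level, logger, rest = match.groups()
            entries.append({"time": time, "level": level, "logger": logger, "message": rest})
        elif entries:
            # Continuation of the previous entry (wrapped line).
            entries[-1]["message"] = (entries[-1]["message"] + " " + line).strip()
    return entries


def count_undetermined(entries):
    """Count entries reporting that the agent state could not be determined."""
    return sum(1 for e in entries if "Agent state cannot be determined" in e["message"])


if __name__ == "__main__":
    sample = [
        "2013-07-24 10:08:50,973 WARN  [agent.manager.AgentManagerImpl]",
        "(AgentTaskPool-16:null) Agent state cannot be determined, do nothing",
    ]
    print(count_undetermined(parse_entries(sample)))
```

Pass it the lines of the management server log (for example `parse_entries(open(path))`, where the path depends on your install); a count that keeps growing while the host is powered off would point to the loop.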





2013/7/29 Kirk Jantzer <ki...@gmail.com>

> Great information, thanks so much for sharing!!
>
>
> On Mon, Jul 29, 2013 at 4:28 AM, Ryan Lei <ry...@cht.com.tw> wrote:
>
> > FYR, I have done some similar tests several weeks ago to test the host HA
> > functionality.
> >
> > CS 4.0.2 + XenServer 6.0.2: Works as expected. HA-enabled VMs (including
> > System VMs) were automatically restarted on HA-dedicated hosts.
> > CS 4.1.0 + XCP 1.6: No HA thing happened at all much like what you
> > described. The states of VMs and hosts were just like before, but
> > inaccessible. CloudStack just couldn't seem to detect the host down!
> > CS 4.1.1 (git) + XenServer 6.1: Works as expected. Just like the first
> > case.
> >
> > I didn't have time to try other combinations, and the logs are gone, but I
> > guess the problem more likely has to do with XCP support than CS 4.1.0.
> >
> > -------------------------------------------------------------------------------------------
> > Yu-Heng (Ryan) Lei, Associate Researcher
> > Chunghwa Telecom Laboratories / Cloud Computing Laboratory
> > ryanlei@cht.com.tw
> > or
> > ryanlei750328@gmail.com
> >
> >
> >
> > On Thu, Jul 25, 2013 at 2:07 AM, Kirk Jantzer <kirk.jantzer@gmail.com
> > >wrote:
> >
> > > Can someone "explain like I'm 5" how HA in CS should work? I have 4.1 setup
> > > with XCP hosts. To simulate a host failure, I've hard powered off a host
> > > through iDRAC and CS doesn't seem to know about it, at all -- the instances
> > > still show as running, but I cannot connect to them, and the host shows as
> > > up in the infrastructure section.
> > >
> > > The instances, including the SSVM, that were on there are "gone" and have
> > > not been restarted on any of the other hosts.
> > >
> > > --
> > > Regards,
> > >
> > > Kirk Jantzer
> > > http://about.met/kirkjantzer
> > >
> >
>
>
>
> --
> Regards,
>
> Kirk Jantzer
> c: (678) 561-5475
> http://about.met/kirkjantzer
>

Re: HA in CS 4.1

Posted by Kirk Jantzer <ki...@gmail.com>.
Great information, thanks so much for sharing!!


On Mon, Jul 29, 2013 at 4:28 AM, Ryan Lei <ry...@cht.com.tw> wrote:

> FYR, I have done some similar tests several weeks ago to test the host HA
> functionality.
>
> CS 4.0.2 + XenServer 6.0.2: Works as expected. HA-enabled VMs (including
> System VMs) were automatically restarted on HA-dedicated hosts.
> CS 4.1.0 + XCP 1.6: No HA thing happened at all much like what you
> described. The states of VMs and hosts were just like before, but
> inaccessible. CloudStack just couldn't seem to detect the host down!
> CS 4.1.1 (git) + XenServer 6.1: Works as expected. Just like the first
> case.
>
> I didn't have time to try other combinations, and the logs are gone, but I
> guess the problem more likely has to do with XCP support than CS 4.1.0.
>
> -------------------------------------------------------------------------------------------
> Yu-Heng (Ryan) Lei, Associate Researcher
> Chunghwa Telecom Laboratories / Cloud Computing Laboratory
> ryanlei@cht.com.tw
> or
> ryanlei750328@gmail.com
>
>
>
> On Thu, Jul 25, 2013 at 2:07 AM, Kirk Jantzer <kirk.jantzer@gmail.com
> >wrote:
>
> > Can someone "explain like I'm 5" how HA in CS should work? I have 4.1 setup
> > with XCP hosts. To simulate a host failure, I've hard powered off a host
> > through iDRAC and CS doesn't seem to know about it, at all -- the instances
> > still show as running, but I cannot connect to them, and the host shows as
> > up in the infrastructure section.
> >
> > The instances, including the SSVM, that were on there are "gone" and have
> > not been restarted on any of the other hosts.
> >
> > --
> > Regards,
> >
> > Kirk Jantzer
> > http://about.met/kirkjantzer
> >
>



-- 
Regards,

Kirk Jantzer
c: (678) 561-5475
http://about.met/kirkjantzer

Re: HA in CS 4.1

Posted by Ryan Lei <ry...@cht.com.tw>.
FYR, I have done some similar tests several weeks ago to test the host HA
functionality.

CS 4.0.2 + XenServer 6.0.2: Works as expected. HA-enabled VMs (including
System VMs) were automatically restarted on HA-dedicated hosts.
CS 4.1.0 + XCP 1.6: No HA thing happened at all much like what you
described. The states of VMs and hosts were just like before, but
inaccessible. CloudStack just couldn't seem to detect the host down!
CS 4.1.1 (git) + XenServer 6.1: Works as expected. Just like the first case.

I didn't have time to try other combinations, and the logs are gone, but I
guess the problem more likely has to do with XCP support than CS 4.1.0.

-------------------------------------------------------------------------------------------
Yu-Heng (Ryan) Lei, Associate Researcher
Chunghwa Telecom Laboratories / Cloud Computing Laboratory
ryanlei@cht.com.tw
or
ryanlei750328@gmail.com



On Thu, Jul 25, 2013 at 2:07 AM, Kirk Jantzer <ki...@gmail.com>wrote:

> Can someone "explain like I'm 5" how HA in CS should work? I have 4.1 setup
> with XCP hosts. To simulate a host failure, I've hard powered off a host
> through iDRAC and CS doesn't seem to know about it, at all -- the instances
> still show as running, but I cannot connect to them, and the host shows as
> up in the infrastructure section.
>
> The instances, including the SSVM, that were on there are "gone" and have
> not been restarted on any of the other hosts.
>
> --
> Regards,
>
> Kirk Jantzer
> http://about.met/kirkjantzer
>